
Role Purpose:
As a DevOps , You should be strong in both the Dev and Ops part of DevOps. We are looking for someone who has a deep understanding of systems architecture, understands core CS concepts well, and is able to reason about system behaviour rather than merely working with the toolset of the day. We believe that only such a person will be able to set a compelling direction for the team and excite those around them.
If you are someone who fits the description above, you will find that the rewards are well worth the high bar. Being one of the early hires of the Bangalore office, you will have a significant impact on the culture and the team; you will work with a set of energetic and hungry peers who will challenge you, and you will have considerable international exposure and opportunity for impact across departments.
Responsibilities
- Deployment, management, and administration of web services in a public cloud environment
- Design and develop solutions for deploying highly secure, highly available, performant and scalable services in elastically provisioned environments
- Design and develop continuous integration and continuous deployment solutions from development through production
- Own all operational aspects of running web services including automation, monitoring and alerting, reliability and performance
- Have direct impact on running a business by thinking about innovative solutions to operational problems
- Drive solutions and communication for production impacting incidents
- Running technical projects and being responsible for project-level deliveries
- Partner well with engineering and business teams across continents
Required Qualifications
- Bachelor’s or advanced degree in Computer Science or closely related field
- 4 - 6 years professional experience in DevOps, with at least 1/2 years in Linux / Unix
- Very strong in core CS concepts around operating systems, networks, and systems architecture including web services
- Strong scripting experience in Python and Bash
- Deep experience administering, running and deploying AWS based services
- Solid experience with Terraform, Packer and Docker or their equivalents
- Knowledge of security protocols and certificate infrastructure.
- Strong debugging, troubleshooting, and problem solving skills
- Broad experience with cloud hosted applications including virtualization platforms, relational and non relational data stores, reverse proxies, and orchestration platforms
- Curiosity, continuous learning and drive to continually raise the bar
- Strong partnering and communication skills
Preferred Qualifications
- Past experience as a senior developer or application architect strongly preferred.
- Experience building continuous integration and continuous deployment pipelines
- Experience with Zookeeper, Consul, HAProxy, ELK-Stack, Kafka, PostgreSQL.
- Experience working with, and preferably designing, a system compliant to any security framework (PCI DSS, ISO 27000, HIPPA, SOC 2, ...)
- Experience with AWS orchestration services such as ECS and EKS.
- Experience working with AWS ML pipeline services like AWS Sagemak

Similar jobs
We seek a skilled and motivated Azure DevOps engineer to join our dynamic team. The ideal candidate will design, implement, and manage CI/CD pipelines, automate deployments, and optimize cloud infrastructure using Azure DevOps tools and services. You will collaborate closely with development and IT teams to ensure seamless integration and delivery of software solutions in a fast-paced environment.
Responsibilities:
- Design, implement, and manage CI/CD pipelines using Azure DevOps.
- Automate infrastructure provisioning and deployments using Infrastructure as Code (IaC) tools like Terraform, ARM templates, or Azure CLI.
- Monitor and optimize Azure environments to ensure high availability, performance, and security.
- Collaborate with development, QA, and IT teams to streamline the software development lifecycle (SDLC).
- Troubleshoot and resolve issues related to build, deployment, and infrastructure.
- Implement and manage version control systems, primarily using Git.
- Manage containerization and orchestration using tools like Docker and Kubernetes.
- Ensure compliance with industry standards and best practices for security, scalability, and reliability.
- Experience using AWS (that’s just common sense)
- Experience designing and building web environments on AWS, which includes working with services like EC2, ELB, RDS, and S3
- Experience building and maintaining cloud-native applications
- A solid background in Linux/Unix and Windows server system administration
- Experience using https://www.simplilearn.com/tutorials/devops-tutorial/devops-tools" target="_blank">DevOps tools in a cloud environment, such as Ansible, Artifactory, https://www.simplilearn.com/tutorials/docker-tutorial/what-is-docker-container" target="_blank">Docker, GitHub, https://www.simplilearn.com/tutorials/jenkins-tutorial/what-is-jenkins" target="_blank">Jenkins, https://www.simplilearn.com/tutorials/kubernetes-tutorial/what-is-kubernetes" target="_blank">Kubernetes, Maven, and Sonar Qube
- Experience installing and configuring different application servers such as JBoss, Tomcat, and WebLogic
- Experience using monitoring solutions like CloudWatch, ELK Stack, and Prometheus
- An understanding of writing Infrastructure-as-Code (IaC), using tools like CloudFormation or Terraform
- Knowledge of one or more of the most-used programming languages available for today’s cloud computing (i.e., SQL data, XML data, R math, Clojure math, Haskell functional, Erlang functional, Python procedural, and Go procedural languages)
- Experience in troubleshooting distributed systems
- Proficiency in script development and scripting languages
- The ability to be a team player
- The ability and skill to train other people in procedural and technical topics
- Strong communication and collaboration skills
As a special aside, an AWS engineer who works in DevOps should also have experience with:
- The theory, concepts, and real-world application of Continuous Delivery (CD), which requires familiarity with tools like AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline
- An understanding of automation
Why this role exists
Our infrastructure footprint is growing faster than our headcount, and we believe most of that
gap should be closed by automation and AI agents — not by hiring more humans to do toil. We
need someone early in their career who treats manual work as a bug, ships scripts and agents
instead of tickets, and wants to grow into deeper ownership over the next two years.
You will not be the most senior person on the team. You will be the one who multiplies the team.
What you'll own
In your first 1 months
• Take ownership of one slice of our CI/CD pipeline and make it measurably
faster, more reliable, or cheaper. We expect a number on a dashboard to move.
• Build at least three internal automations that replace manual ops toil —
using AI agents (Claude Code, agentic CLIs, scripted LLM workflows) as your force
multiplier.
• Be the first responder for a defined set of alerts. Write the runbooks. Drive
the alert volume down.
• Support senior engineers on AI/ML infrastructure (GPU nodes, inference
services, model deployment) — observe, document, and gradually take on contained
changes under review.
By 3 months you should be
• The go-to person for at least two production systems.
• Shipping routine infrastructure changes without needing senior review.
• Treating "manual" as a code smell.
Required (we will reject without these)
• 0–3 years hands-on experience with one major cloud (AWS, GCP, or
Azure — one is fine, depth beats breadth).
• Fluent in Linux command line, bash, and at least one scripting language
(Python or Go preferred).
• Have shipped something to production that real users hit. A side project
counts; a graded coursework lab does not.
• Comfortable with Docker — you can explain what an image vs. a
container is and why it matters.
• Working knowledge of networking fundamentals: DNS, HTTP/HTTPS,
TLS, ports, basic subnets — enough to debug "it works on my machine."
• Git fluency: branches, merges, rebases, conflict resolution.
• CI/CD pipelines — you have authored or substantially modified pipelines
in GitHub Actions, GitLab CI, ArgoCD, Jenkins, or similar. Not just "I clicked Re-run."
• Kubernetes basics — kubectl for real work, can read pod logs,
understand deployments and services, can debug a CrashLoopBackOff without
panicking. You do not need to have run a cluster; you do need to have lived inside one.
• Active user of AI coding agents (Claude Code, Cursor, Copilot, agentic
CLIs, etc.). You should be able to walk us through specific tasks where they made you
faster, and specific tasks where they failed you and how you noticed. "I have tried it" is
not enough.
Bonus (real plus, not required)
• Infrastructure as Code: Terraform, Pulumi, or Ansible.
• Observability: Prometheus/Grafana, Datadog, OpenTelemetry, any APM.
• Have built or extended an LLM-based agent — a custom MCP server, a
scripted multi-step workflow, an internal tool that calls models in a loop. Anything beyond
chat-with-Claude.
• Exposure to GPU workloads, model serving (vLLM, Triton, TGI, etc.), or
ML pipelines.
What we don't care about
• Whether your degree is in CS — or whether you have a degree at all.
• Brand-name companies on your resume.
• Certifications. They are fine. They do not substitute for having shipped.
How we work
• We default to automation. If you do something manually twice, the third
time you script it or hand it to an agent.
• AI agents are part of the workflow, not a novelty. Expect interview
questions about exactly how you use them — and where you have caught them being
wrong.
• Small, reversible changes beat big-bang rollouts.
• Postmortems are blameless and written down.
• We push back on each other. If you only execute, you will be unhappy
here.
How to apply
Send:
• Your resume.
• A short note (≤200 words) describing one infra or automation problem you
solved, and how AI agents factored in — or did not, and why. We read these. Generic
notes get rejected.
Internal note — delete before posting externally
• Comp band, location policy, team name, and reporting line marked
[CONFIRM] need to be filled in before this goes external.
• The Required list is intentionally tight: CI/CD and Kubernetes basics
promoted from bonus. Expect this to filter ~80% of typical junior DevOps applicants. The
remaining pool will skew toward people who have actually shipped infra at a startup, not
bootcamp grads or pure cloud-cert holders.
• IaC, observability, agent-building, and GPU/ML serving stay as bonus.
Promoting any of these to required at 0–3 yrs collapses the pool to near-zero or forces
hiring senior people at junior comp. If you want IaC required, re-level this to mid (3–5
yrs) and raise the band.
• Screening implication: the resume screen should explicitly check for
CI/CD pipeline authorship and any K8s-touching production work. If neither is on the
resume, reject at screen. Do not waste interview slots.
• Pipeline watch: if fewer than ~15 qualified resumes after 2 weeks of
active sourcing, the first thing to relax is the AI-agent-fluency bar (move to bonus and
screen for it in interview instead). Do not relax the "shipped to production" requirement
— that is the load-bearing filter.
About Us:
Tradelab Technologies Pvt Ltd is not for those seeking comfort—we are for those hungry to make a mark in the trading and fintech industry.
Key Responsibilities
CI/CD and Infrastructure Automation
- Design, implement, and maintain CI/CD pipelines to support fast and reliable releases
- Automate deployments using tools such as Terraform, Helm, and Kubernetes
- Improve build and release processes to support high-performance and low-latency trading applications
- Work efficiently with Linux/Unix environments
Cloud and On-Prem Infrastructure Management
- Deploy, manage, and optimize infrastructure on AWS, GCP, and on-premises environments
- Ensure system reliability, scalability, and high availability
- Implement Infrastructure as Code (IaC) to standardize and streamline deployments
Performance Monitoring and Optimization
- Monitor system performance and latency using Prometheus, Grafana, and ELK stack
- Implement proactive alerting and fault detection to ensure system stability
- Troubleshoot and optimize system components for maximum efficiency
Security and Compliance
- Apply DevSecOps principles to ensure secure deployment and access management
- Maintain compliance with financial industry regulations such as SEBI
- Conduct vulnerability assessments and maintain logging and audit controls
Required Skills and Qualifications
- 2+ years of experience as a DevOps Engineer in a software or trading environment
- Strong expertise in CI/CD tools (Jenkins, GitLab CI/CD, ArgoCD)
- Proficiency in cloud platforms such as AWS and GCP
- Hands-on experience with Docker and Kubernetes
- Experience with Terraform or CloudFormation for IaC
- Strong Linux administration and networking fundamentals (TCP/IP, DNS, firewalls)
- Familiarity with Prometheus, Grafana, and ELK stack
- Proficiency in scripting using Python, Bash, or Go
- Solid understanding of security best practices including IAM, encryption, and network policies
Good to Have (Optional)
- Experience with low-latency trading infrastructure or real-time market data systems
- Knowledge of high-frequency trading environments
- Exposure to FIX protocol, FPGA, or network optimization techniques
- Familiarity with Redis or Nginx for real-time data handling
Why Join Us?
- Work with a team that expects and delivers excellence.
- A culture where risk-taking is rewarded, and complacency is not.
- Limitless opportunities for growth—if you can handle the pace.
- A place where learning is currency, and outperformance is the only metric that matters.
- The opportunity to build systems that move markets, execute trades in microseconds, and redefine fintech.
This isn’t just a job—it’s a proving ground. Ready to take the leap? Apply now.
Please Apply - https://zrec.in/L51Qf?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adapt DevOps culture in their organization by focusing on long-term DevOps roadmap. We focus on identifying technical and cultural issues in the journey of successfully implementing the DevOps practices in the organization and work with respective teams to fix issues to increase overall productivity. We also do training sessions for the developers and make them realize the importance of DevOps. We provide these services - DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We do assessments of technology architecture, security, governance, compliance, and DevOps maturity model for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer (Infrastructure/SRE)
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediately
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are looking for a Senior DevOps Engineer (Infrastructure) to design, automate, and manage cloud-based and datacentre infrastructure for diverse projects. The ideal candidate will have deep expertise in a public cloud platform (AWS, GCP, or Azure), with a strong focus on cost optimization, security best practices, and infrastructure automation using tools like Terraform and CI/CD pipelines.
This role involves designing scalable architectures (containers, serverless, and VMs), managing databases, and ensuring system observability with tools like Prometheus and Grafana. Strong leadership, client communication, and team mentoring skills are essential. Experience with VPN technologies and configuration management tools (Ansible, Helm) is also critical. Multi-cloud experience and familiarity with APM tools are a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as a DevOps engineer with a proven track record of architecting and automating solutions on Cloud
- Experience in troubleshooting production incidents and handling high-pressure situations.
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Extensive experience with Kubernetes, Terraform, ArgoCD, and Helm.
- Strong with at least one public cloud AWS/GCP/Azure
- Strong with Cost Optimization and Security Best practices
- Strong with Infrastructure automation using Terraform and CI/CD automation
- Strong with Configuration Management using Ansible, Helm etc
- Good with designing architectures (Containers, Serverless, VMs etc)
- Hands-on Experience working on Multiple Projects
- Strong with Client communication and requirements gathering
- Databases management experience
- Good experience with Prometheus, Grafana & Alert Manager
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
- Proficiency in cloud networking, including VPCs, DNS, VPNs (OpenVPN, OpenSwan, Pritunl, Site-to-Site VPNs), load balancers, and firewalls, ensuring secure and efficient connectivity.
- Strong understanding of cloud security best practices, identity and access management (IAM), and compliance requirements for modern infrastructure.
Good to have
- Multi-cloud experience with AWS, GCP & Azure
- Experience with APM & Observability tools like - Newrelic, Datadog, and OpenTelemetry
- Proficiency in scripting languages (Python, Go) for automation and tooling to improve infrastructure and application reliability.
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Will serve as a direct point of contact for multiple clients.
- Able to handle the unique technical needs and challenges of two or more clients concurrently.
- Involve both direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.
Experience: 8-10yrs
Notice Period: max 15days
Must-haves*
1. Knowledge about Database/NoSQL DB hosting fundamentals (RDS multi-AZ, DynamoDB, MongoDB, and such)
2. Knowledge of different storage platforms on AWS (EBS, EFS, FSx) - mounting persistent volumes with Docker Containers
3. In-depth knowledge of Security principles on AWS (WAF, DDoS, Security Groups, NACL's, IAM groups, and SSO)
4. Knowledge on CI/CD platforms is required (Jenkins, GitHub actions, etc.) - Migration of AWS Code pipelines to GitHub actions
5. Knowledge of vast variety of AWS services (SNS, SES, SQS, Athena, Kinesis, S3, ECS, EKS, etc.) is required
6. Knowledge on Infrastructure as Code tool is required We use Cloudformation. (Terraform is a plus), ideally, we would like to migrate to Terraform from CloudFormation
7. Setting CloudWatch Alarms and SMS/Email Slack alerts.
8. Some Knowledge on configuring any kind of monitoring tool such as Prometheus, Dynatrace, etc. (We currently use Datadog, CloudWatch)
9. Experience with any CDN provider configurations (Cloudflare, Fastly, or CloudFront)
10. Experience with either Python or Go scripting language.
11. Experience with Git branching strategy
12. Containers hosting knowledge on both Windows and Linux
The below list is *Nice to Have*
1. Integration experience with Code Quality tools (SonarQube, NetSparker, etc) with CI/CD
2. Kubernetes
3. CDN's other than CloudFront (Cloudflare, Fastly, etc)
4. Collaboration with multiple teams
5. GitOps
- Proven experience in handling large infrastructure and distributed systems like Kafka, Yarn, Elastic Search, etc..
- Familiarity with Python-related technologies and frameworks like Django or Pyramid.
- Experience with Unix/Linux operating systems internals and administration (e.g. filesystems, inodes, system calls, etc) or networking (e.g. TCP/IP, routing, network topologies, and hardware, SDN, etc)
- Familiarity with at least one of the cloud computing infrastructures - GCP / Azure / AWS
- Familiarity with task queue frameworks like Celery or Pika is a plus.
- Source code management and Implementation of security best practices.
- Experienced in building monitoring/metrics & alerting tool (APM tool), a custom dashboard for each Application stack against the supported environment
- Good understanding & implementation experience using 12-factor App principles
- Awareness of Cloud Security concepts
- Awareness of Information Security concepts and Best Practices
- Administration and Support for Azure DevOps Server/Services
- Migration from Azure DevOps Server to Azure DevOps Services (SaaS)
- Process Template Customization and Deployment model
- Migration, Upgrade, Monitor, and Maintenance of ADS Instance
- Automation using REST API to build Extensions and Custom Reporting
- Expert in all Modules of Azure DevOps Server/Service (Work Item, SCM/VC, Build, Release, Test, Reporting Management)"
- CICD Orchestration tools and other SCM/VC tools
- Microsoft MCSD Application Lifecycle Management certified
- A bachelor or master degree with a minimum of 6 years relevant work experience in Azure DevOps Server/Services (SaaS)
- Good communication skills
- Strong knowledge of application lifecycle workflows and processes involved in the design, development, deployment, test, and maintenance of software systems in the Windows environment
- Visual Studio and the .NET Framework experience is required "
- Administration and Support for Azure DevOps Server/Services
- Migration from Azure DevOps Server to Azure DevOps Services (SaaS)
- Process Template Customization and Deployment model
- Work with the user community to adopt new features, enable new use cases, and help resolve any issues
- Create customizations and tools to help support the team’s needs (PM, Dev, Test, & Ops)
- Take the lead in the validation of the application.
- Monitor the health of the solution and take proactive steps to ensure reliable availability and performance
- Manage patches and updates for tooling solutions and related hosting environments including the operating system
- Automate the process for Maintenance"












