
DevOps Engineer (Cloud & Infrastructure)
Noida | Full-Time | Experience: 2-4 years
About TestMu AI
TestMu AI (formerly LambdaTest) is an AI-native platform designed to move software testing beyond simple automation into the era of agentic intelligence. It provides end-to-end AI agents that manage the entire Quality Engineering lifecycle.
- Full-Stack AI Agents: Autonomously plan, author, execute, and analyze tests across the SDLC.
- Comprehensive Coverage: Supports web, mobile, and enterprise applications.
- Real-World Testing: Scale execution across real devices, browsers, and custom environments.
About the Role
This isn't a role for someone who just wants to "maintain" systems. As a DevOps Engineer at TestMu AI, you are the architect of the automated highways that power our AI agents. You will step into a fast-paced environment where you bridge the gap between cloud-native automation and core infrastructure.
You will manage complex CI/CD pipelines, troubleshoot deep-seated Linux issues, and ensure our hybrid-cloud environment (AWS/Azure) is as resilient as the code it runs.
Key Responsibilities: The Pillars of Growth
A. DevOps & Automation (50% Focus)
- Platform Orchestration: Lead the migration to modular, self-healing Terraform and Helm templates.
- Agentic CI/CD: Architect GitHub Actions workflows that treat AI agents as first-class citizens, automating environment promotion and risk scoring.
- Kubernetes Mastery: Advanced management of Docker and K8s clusters to support scalable production workloads.
- Predictive Observability: Use Prometheus, Grafana, and ELK to move from reactive alerts to autonomous anomaly detection.
B. Networking & Data Center Mastery (30% Focus)
- Hybrid Networking: Design and troubleshoot VPCs and subnets in Azure/AWS/GCP, paired with physical VLANs and switches in our data centers.
- Bare-Metal Lifecycle: Automate hardware provisioning, RAID setup, and firmware updates for our real-device cloud.
- Remote Admin: Master out-of-band management (iDRAC, iLO, IPMI) to ensure 100% remote operational capability.
- Core Protocols: Own the lifecycle of DNS, DHCP, Load Balancing, and IPAM across distributed environments.
C. Development & Scripting (20% Focus)
- Backend Integration: Debug and optimize Python or Go code, understanding how application logic interacts with system-level resources.
- Advanced Scripting: Write idempotent Bash/Python scripts to automate complex, multi-server operational tasks.
- Agentic Tooling: Support the integration of LLM-based developer tools into DevOps workflows to eliminate "toil".
The Interview Journey
We value your ability to solve problems under pressure more than your ability to memorize documentation.
- Technical Round 1 (DevOps Leads): A live session focused on real-world debugging scenarios and Linux fundamentals.
- Technical Round 2 (Hiring Manager / Pod Lead): An assessment of your architectural thinking, automation strategy, and team alignment.
- Technical Round 3 (SVP Engineering / VP DevOps): Strategic discussion on scalability, infrastructure vision, and technical leadership.
- Final Round (CEO): Mission alignment, cultural fit, and the "big picture" at TestMu AI.
Growth Timeline
This is a high-visibility role. You will receive direct mentorship from our senior engineering leadership. As you master our production environment, you will have a clear path to move into Senior DevOps Engineer or Infrastructure Architect roles as our pods scale.
Perks That Matter
Health Cover: Comprehensive insurance for you and your family.
Fresh Meals: Daily catered meals at the office.
Transport: Safe cab facilities for eligible shifts.
Pod Budgets: Dedicated engagement budgets for team building and offsites.

About TestMu AI (Formerly LambdaTest)
About
TestMu AI (formerly LambdaTest) is the world's first full-stack Agentic AI Quality Engineering Platform.
We built TestMu AI for a reality where software is written by AI and must be shipped at machine speed.
Similar jobs
JOB DETAILS:
- Job Title: Senior DevOps Engineer 2
- Industry: Ride-hailing
- Experience: 5-7 years
- Working Days: 5 days/week
- Work Mode: ONSITE
- Job Location: Bangalore
- CTC Range: Best in Industry
Required Skills: Cloud & Infrastructure Operations, Kubernetes & Container Orchestration, Monitoring, Reliability & Observability, Proficiency with Terraform, Ansible etc., Strong problem-solving skills with scripting (Python/Go/Shell)
Criteria:
1. Candidate must be from a product-based company or a scalable app-based start-up, with experience handling large-scale production traffic.
2. Minimum 5 years of experience working as a DevOps/Infrastructure Consultant.
3. Own end-to-end infrastructure from non-prod to prod environments, including self-managed DBs.
4. Candidate must have experience in database migration from scratch.
5. Must have a firm hold on the container orchestration tool Kubernetes.
6. Must have expertise in configuration management tools like Ansible, Terraform, Chef / Puppet.
7. Understanding of programming languages like Go/Python and Java.
8. Experience working with databases like Mongo/Redis/Cassandra/Elasticsearch/Kafka.
9. Working experience on a cloud platform: AWS.
10. Candidate should have a minimum of 1.5 years of stability per organization and a clear reason for relocation.
Description
Job Summary:
As a DevOps Engineer at the company, you will build and operate infrastructure at scale, design and implement a variety of tools that enable product teams to build and deploy their services independently, improve observability across the board, and design for security, resiliency, availability, and stability. If the prospect of ensuring system reliability at scale and exploring cutting-edge technology to solve problems excites you, then this is your fit.
Job Responsibilities:
- Own end-to-end infrastructure right from non-prod to prod environment including self-managed DBs
- Codify our infrastructure
- Do what it takes to keep the uptime above 99.99%
- Understand the bigger picture and sail through the ambiguities
- Scale technology considering cost and observability and manage end-to-end processes
- Understand DevOps philosophy and evangelize the principles across the organization
- Strong communication and collaboration skills to break down the silos
Job Requirements:
- B.Tech. / B.E. degree in Computer Science or equivalent software engineering degree/experience
- Minimum 5 years of experience working as a DevOps/Infrastructure Consultant
- Must have a firm hold on the container orchestration tool Kubernetes
- Must have expertise in configuration management tools like Ansible, Terraform, Chef / Puppet
- Strong problem-solving skills, and ability to write scripts using any scripting language
- Understanding of programming languages like Go/Python and Java
- Comfortable working on databases like Mongo/Redis/Cassandra/Elasticsearch/Kafka.
What's there for you?
The company's team handles everything: infra, tooling, and a set of self-managed databases. For example:
- 150+ microservices with event-driven architecture across different tech stacks (Golang/Java/Node)
- More than 100,000 requests per second on our edge gateways
- ~20,000 events per second on self-managed Kafka
- 100s of TB of data on self-managed databases
- 100s of real-time continuous deployments to production
- Self-managed infra supporting
- 100% OSS
We are looking for a highly skilled DevOps/Cloud Engineer with over 6 years of experience in infrastructure automation, cloud platforms, networking, and security. If you are passionate about designing scalable systems and love solving complex cloud and DevOps challenges, this opportunity is for you.
Key Responsibilities
- Design, deploy, and manage cloud-native infrastructure using Kubernetes (K8s), Helm, Terraform, and Ansible
- Automate provisioning and orchestration workflows for cloud and hybrid environments
- Manage and optimize deployments on AWS, Azure, and GCP for high availability and cost efficiency
- Troubleshoot and implement advanced network architectures including VPNs, firewalls, load balancers, and routing protocols
- Implement and enforce security best practices: IAM, encryption, compliance, and vulnerability management
- Collaborate with development and operations teams to improve CI/CD workflows and system observability
Required Skills & Qualifications
- 6+ years of experience in DevOps, Infrastructure as Code (IaC), and cloud-native systems
- Expertise in Helm, Terraform, and Kubernetes
- Strong hands-on experience with AWS and Azure
- Solid understanding of networking, firewall configurations, and security protocols
- Experience with CI/CD tools like Jenkins, GitHub Actions, or similar
- Strong problem-solving skills and a performance-first mindset
Why Join Us?
- Work on cutting-edge cloud infrastructure across diverse industries
- Be part of a collaborative, forward-thinking team
- Flexible hybrid work model: work from anywhere while staying connected
- Opportunity to take ownership and lead critical DevOps initiatives
Job Overview:
We are looking for a seasoned OpenStack Administrator with strong expertise in managing large-scale production environments. The ideal candidate should have hands-on experience with Linux, Kubernetes, and OpenShift, and be capable of performing routine maintenance, upgrades, and troubleshooting in complex cloud infrastructures.
The candidate must also be comfortable working with Red Hat support, managing escalations, and communicating effectively with both internal teams and external clients.
Key Skills & Qualifications:
- Proven experience managing OpenStack infrastructure in production.
- Strong proficiency in Linux system administration (RHEL/CentOS preferred).
- Hands-on experience with Kubernetes and OpenShift.
- Experience with system monitoring, log management, and troubleshooting tools.
- Familiarity with RH support portal, managing cases, and following up on resolutions.
- Excellent problem-solving skills and ability to work under pressure.
- Strong client communication skills and ability to articulate technical issues clearly.
- Proven ability to work in and manage large-scale production environments.
Candidates with OpenStack certification will be preferred.
- Good understanding of how the web works
- Experience with at least one language like Java, Python, etc.
- Good with Shell scripting
- Experience with *Nix based operating systems
- Experience with k8s, containers
- Fairly good understanding of AWS/GCP/Azure
- Troubleshoot and fix outages and performance issues in the infrastructure stack
- Identify gaps and design automation tools for all feasible functions in the infrastructure
- Good verbal and written communication skills
- Drive the team's SLAs/SLOs
Benefits
This is an opportunity to work on a fairly complex set of systems and improve them. You will get a chance to learn things like "how to think about code simplicity", "how to write for maintainability" and several other things.
- Comprehensive health insurance policy.
- Flexible working hours and a very friendly work environment.
- Flexibility to work either in the office (post-Covid) or remotely.
- Support software build and release efforts:
  - Create, set up, and maintain builds
  - Review build results and resolve build problems
  - Create and maintain build servers
  - Plan, manage, and control product releases
  - Validate, archive, and escrow product releases
- Maintain and administer configuration management tools, including source control, defect management, project management, and other systems.
- Develop scripts and programs to automate processes and integrate tools.
- Resolve help desk requests from worldwide product development staff.
- Participate in team and process improvement projects.
- Interact with product development teams to plan and implement tool and build improvements.
- Perform other duties as assigned.
While the job description describes what is anticipated as the requirements of the position, the job requirements are subject to change based upon any changing needs and requirements of the business.
Required Skills
- TFS 2017 vNext builds or Azure DevOps build process
- Must have PowerShell 3.0+ scripting knowledge
- Exposure to build tools like MSBuild, NAnt, Xcode.
- Exposure to creating and maintaining vCenter/VMware vSphere 6.5
- Hands-on experience with Windows Server 2012 and above, and basic knowledge of macOS
- Good to have: Shell or Batch scripting (optional)
Required Experience
Candidates for this position should hold the following qualifications to be considered as a suitable applicant. Please note that except where specified as "preferred," or as a "plus," all points listed below are considered minimum requirements.
- Bachelor's degree in a related discipline is strongly preferred
- 3 or more years of experience with Software Configuration Management tools, concepts, and processes.
- Exposure to source control systems such as TFS, Git, or Subversion (optional)
- Familiarity with object-oriented concepts and programming in C# and PowerShell scripting.
- Experience working on Azure DevOps builds, vNext builds, or Jenkins builds
- Experience working with developers to resolve development issues related to source control systems.
Key Responsibilities:
- Work with the development team to plan, execute and monitor deployments
- Capacity planning for product deployments
- Adopt best practices for deployment and monitoring systems
- Ensure the SLAs for performance and uptime are met
- Constantly monitor systems, suggest changes to improve performance and decrease costs.
- Ensure the highest standards of security
Key Competencies (Functional):
- Proficiency in coding in at least one scripting language (Bash, Python, etc.)
- Has personally managed a fleet of servers (> 15)
- Understands the different environments: production, deployment, and staging
- Has worked in microservice / service-oriented architecture systems
- Has worked with automated deployment systems such as Ansible / Chef / Puppet.
- Can write MySQL queries
We are looking for a Senior Platform Engineer responsible for handling our GCP/AWS clouds. The candidate will be responsible for automating the deployment of cloud infrastructure and services to support application development and hosting (architecting, engineering, deploying, and operationally managing the underlying logical and physical cloud computing infrastructure).
Job Description:
- Collaborate with teams to build and deliver solutions implementing serverless, microservice-based, IaaS, PaaS, and containerized architectures in GCP/AWS environments.
- Responsible for deploying highly complex, distributed transaction processing systems.
- Work on continuous improvement of the products through innovation and learning, with a knack for benchmarking and optimization.
- Hiring, developing, and cultivating a high-performing and reliable cloud support team
- Building and operating complex CI/CD pipelines at scale
- Work with GCP Services, Private Service Connect, Cloud Run, Cloud Functions, Pub/Sub, Cloud Storage, Networking
- Collaborate with Product Management and Product Engineering teams to drive excellence in Google Cloud products and features.
- Ensure efficient data storage and processing functions in line with company security policies and cloud security best practices.
- Ensure scaled database setup/monitoring with near-zero downtime
JOB DETAILS
What You'll Do
- 4-6 years of application development with design, development, implementation, and support experience, including the following:
  - C#
  - JavaScript
  - HTML
  - SQL
  - Messaging/RabbitMQ
  - Asynchronous communication patterns
- Experience with Visual Studio and Git
- A working understanding of build and release automation, preferably with Azure DevOps
- Excellent understanding of object-oriented concepts and .NET Framework
- Experience in creating reusable libraries in C#
- Ability to troubleshoot and isolate/solve complex bugs, connectivity issues, or OS-related issues
- Ability to write complex SQL queries and stored procedures in Oracle and/or MS SQL
- Proven ability to use design patterns to accomplish scalable architecture
- Understanding of event-driven architecture
- Experience with message brokers such as RabbitMQ
- Experience in the development of REST APIs
- Understanding of basic steps of an Agile SDLC
- Excellent communication (both written and verbal) and interpersonal skills
- Demonstrated accountability and ownership of assigned tasks
- Demonstrated leadership and ability to work as a leader on large and complex tasks











