11+ Root cause analysis Jobs in Delhi, NCR and Gurgaon | Root cause analysis Job openings in Delhi, NCR and Gurgaon
Apply to 11+ Root cause analysis Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest Root cause analysis Job opportunities across top companies like Google, Amazon & Adobe.
Required Skills: Advanced AWS Infrastructure Expertise, CI/CD Pipeline Automation, Monitoring, Observability & Incident Management, Security, Networking & Risk Management, Infrastructure as Code & Scripting
Criteria:
- 5+ years of DevOps/SRE experience in cloud-native, product-based companies (B2C scale preferred)
- Strong hands-on AWS expertise across core and advanced services (EC2, ECS/EKS, Lambda, S3, CloudFront, RDS, VPC, IAM, ELB/ALB, Route53)
- Proven experience designing high-availability, fault-tolerant cloud architectures for large-scale traffic
- Strong experience building & maintaining CI/CD pipelines (Jenkins mandatory; GitHub Actions/GitLab CI a plus)
- Prior experience running production-grade microservices deployments and automated rollout strategies (Blue/Green, Canary)
- Hands-on experience with monitoring & observability tools (Grafana, Prometheus, ELK, CloudWatch, New Relic, etc.)
- Solid hands-on experience with MongoDB in production, including performance tuning, indexing & replication
- Strong scripting skills (Bash, Shell, Python) for automation
- Hands-on experience with IaC (Terraform, CloudFormation, or Ansible)
- Deep understanding of networking fundamentals (VPC, subnets, routing, NAT, security groups)
- Strong experience in incident management, root cause analysis & production firefighting
Description
Role Overview
Company is seeking an experienced Senior DevOps Engineer to design, build, and optimize cloud infrastructure on AWS, automate CI/CD pipelines, implement monitoring and security frameworks, and proactively identify scalability challenges. This role requires someone who has hands-on experience running infrastructure at B2C product scale, ideally in media/OTT or high-traffic applications.
Key Responsibilities
1. Cloud Infrastructure — AWS (Primary Focus)
- Architect, deploy, and manage scalable infrastructure using AWS services such as EC2, ECS/EKS, Lambda, S3, CloudFront, RDS, ELB/ALB, VPC, IAM, Route53, etc.
- Optimize cloud cost, resource utilization, and performance across environments.
- Design high-availability, fault-tolerant systems for streaming workloads.
2. CI/CD Automation
- Build and maintain CI/CD pipelines using Jenkins, GitHub Actions, or GitLab CI.
- Automate deployments for microservices, mobile apps, and backend APIs.
- Implement blue/green and canary deployments for seamless production rollouts.
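For readers unfamiliar with the canary strategy named above, the following is a minimal Python sketch of the idea: traffic moves to the new version in steps, and the rollout stops as soon as the observed error rate crosses a threshold. The function name `canary_rollout`, the `error_rate_at` probe, the step percentages, and the 1% threshold are all hypothetical and chosen only for illustration; a real pipeline would read the error rate from its monitoring stack.

```python
from typing import Callable, List, Tuple


def canary_rollout(
    steps: List[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> Tuple[str, int]:
    """Shift traffic to the new version one step at a time.

    steps: increasing percentages of traffic sent to the canary.
    error_rate_at: hypothetical probe that returns the canary's observed
        error rate at a given traffic percentage.
    Returns ("promoted", 100) if every step stays healthy, otherwise
    ("rolled_back", percent) with the percentage at which the check failed.
    """
    for percent in steps:
        if error_rate_at(percent) > max_error_rate:
            # Unhealthy canary: stop here and send all traffic to the old version.
            return ("rolled_back", percent)
    return ("promoted", 100)


# A canary that stays healthy at every step is promoted.
healthy = canary_rollout([5, 25, 50, 100], lambda percent: 0.002)

# A canary that degrades at 50% traffic is rolled back at that step.
failing = canary_rollout(
    [5, 25, 50, 100],
    lambda percent: 0.002 if percent < 50 else 0.05,
)
```

A blue/green rollout can be seen as the special case with a single step of 100%, where the health check runs once before the switch.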
3. Observability & Monitoring
- Implement logging, metrics, and alerting using tools like Grafana, Prometheus, ELK, CloudWatch, New Relic, etc.
- Perform proactive performance analysis to minimize downtime and bottlenecks.
- Set up dashboards for real-time visibility into system health and user traffic spikes.
4. Security, Compliance & Risk Highlighting
• Conduct frequent risk assessments and identify vulnerabilities in:
o Cloud architecture
o Access policies (IAM)
o Secrets & key management
o Data flows & network exposure
• Implement security best practices including VPC isolation, WAF rules, firewall policies, and SSL/TLS management.
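As one illustration of what an access-policy (IAM) risk check might look like, the sketch below scans a policy document for `Allow` statements that grant every action on every resource. It assumes the standard IAM JSON policy layout (`Statement`, `Effect`, `Action`, `Resource`); the function name `overly_permissive_statements`, the bucket name, and the choice of rule are hypothetical, and a real assessment would cover far more than this single pattern.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM allows a single value or a list in several fields; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def overly_permissive_statements(policy: Dict[str, Any]) -> List[int]:
    """Return the indexes of Allow statements with Action "*" and Resource "*"."""
    flagged = []
    for index, statement in enumerate(_as_list(policy.get("Statement"))):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        if "*" in actions and "*" in resources:
            flagged.append(index)
    return flagged


# Hypothetical policy used only to exercise the check.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Effect": "Deny", "Action": "*", "Resource": "*"},
    ],
}

print(overly_permissive_statements(example_policy))  # [1]
```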
5. Scalability & Reliability Engineering
- Analyze traffic patterns for OTT-specific load variations (weekends, new releases, peak hours).
- Identify scalability gaps and propose solutions across:
o Microservices
o Caching layers
o CDN distribution (CloudFront)
o Database workloads
- Perform capacity planning and load testing to ensure readiness for 10x traffic growth.
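The capacity-planning point above comes down to simple arithmetic, sketched below in Python. The function name `instances_needed` and every number in the example (2,000 requests per second at peak, 500 requests per second per instance, 50% target utilization) are hypothetical; real figures would come from load tests and traffic data.

```python
import math


def instances_needed(
    peak_rps: float,
    growth_factor: float,
    rps_per_instance: float,
    target_utilization: float = 0.5,
) -> int:
    """Estimate how many instances are needed to serve projected peak traffic.

    target_utilization is the fraction of an instance's measured capacity
    that is planned for, leaving the remainder as headroom for spikes.
    """
    if rps_per_instance <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity and utilization must be positive")
    projected_rps = peak_rps * growth_factor
    usable_rps_per_instance = rps_per_instance * target_utilization
    return math.ceil(projected_rps / usable_rps_per_instance)


# Today's peak versus the same peak after 10x growth.
print(instances_needed(2000, 1, 500, 0.5))   # 8
print(instances_needed(2000, 10, 500, 0.5))  # 80
```

Planning against a utilization target rather than full capacity is a design choice: it trades higher cost for room to absorb sudden spikes such as a new release.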
6. Database & Storage Support
- Administer and optimize MongoDB for high-read/low-latency use cases.
- Design backup, recovery, and data replication strategies.
- Work closely with backend teams to tune query performance and indexing.
7. Automation & Infrastructure as Code
- Implement IaC using Terraform, CloudFormation, or Ansible.
- Automate repetitive infrastructure tasks to ensure consistency across environments.
Required Skills & Experience
Technical Must-Haves
- 5+ years of DevOps/SRE experience in cloud-native, product-based companies.
- Strong hands-on experience with AWS (core and advanced services).
- Expertise in Jenkins CI/CD pipelines.
- Solid background working with MongoDB in production environments.
- Good understanding of networking: VPCs, subnets, security groups, NAT, routing.
- Strong scripting experience (Bash, Python, Shell).
- Experience handling risk identification, root cause analysis, and incident management.
Nice to Have
- Experience with OTT, video streaming, media, or any content-heavy product environments.
- Familiarity with containers (Docker), orchestration (Kubernetes/EKS), and service mesh.
- Understanding of CDN, caching, and streaming pipelines.
Personality & Mindset
- Strong sense of ownership and urgency—DevOps is mission critical at OTT scale.
- Proactive problem solver with ability to think about long-term scalability.
- Comfortable working with cross-functional engineering teams.
Why Join company?
• Build and operate infrastructure powering millions of monthly users.
• Opportunity to shape DevOps culture and cloud architecture from the ground up.
• High-impact role in a fast-scaling Indian OTT product.
We seek a skilled and motivated Azure DevOps engineer to join our dynamic team. The ideal candidate will design, implement, and manage CI/CD pipelines, automate deployments, and optimize cloud infrastructure using Azure DevOps tools and services. You will collaborate closely with development and IT teams to ensure seamless integration and delivery of software solutions in a fast-paced environment.
Responsibilities:
- Design, implement, and manage CI/CD pipelines using Azure DevOps.
- Automate infrastructure provisioning and deployments using Infrastructure as Code (IaC) tools like Terraform, ARM templates, or Azure CLI.
- Monitor and optimize Azure environments to ensure high availability, performance, and security.
- Collaborate with development, QA, and IT teams to streamline the software development lifecycle (SDLC).
- Troubleshoot and resolve issues related to build, deployment, and infrastructure.
- Implement and manage version control systems, primarily using Git.
- Manage containerization and orchestration using tools like Docker and Kubernetes.
- Ensure compliance with industry standards and best practices for security, scalability, and reliability.
The candidate must have 2-3 years of experience in the domain. The responsibilities include:
● Deploy systems in a Linux-based environment using Docker
● Manage & maintain the production environment
● Deploy updates and fixes
● Provide Level 1 technical support
● Build tools to reduce occurrences of errors and improve customer experience
● Develop software to integrate with internal back-end systems
● Perform root cause analysis for production errors
● Investigate and resolve technical issues
● Develop scripts to automate visualization
● Design procedures for system troubleshooting and maintenance
● Experience working on Linux-based infrastructure
● Excellent understanding of the MERN stack, Docker, and Nginx (Node.js is good to have)
● Configuring and managing databases such as MongoDB
● Excellent troubleshooting skills
● Experience working with AWS/Azure/GCP
● Working knowledge of various tools, open-source technologies, and cloud services
● Awareness of critical concepts in DevOps and Agile principles
● Experience with CI/CD pipelines
Job Description
• Minimum 3 years of experience in DevOps on the AWS platform
• Strong AWS knowledge and experience
• Experience using CI/CD automation tools (Git, Jenkins) and configuration deployment tools (Puppet/Chef/Ansible)
• Experience with IaC tools such as Terraform
• Excellent experience in operating a container orchestration cluster (Kubernetes, Docker)
• Significant experience with Linux operating system environments
• Experience with infrastructure scripting solutions such as Python/Shell scripting
• Must have experience in designing Infrastructure automation framework.
• Good experience in setting up monitoring tools and dashboards (Grafana/Kafka)
• Excellent problem-solving, Log Analysis and troubleshooting skills
• Experience in setting up centralized logging for system (EKS, EC2) and application
• Process-oriented with great documentation skills
• Ability to work effectively within a team and with minimal supervision
**THIS IS A 100% WORK FROM OFFICE ROLE**
We are looking for an experienced DevOps engineer who will help our team establish a DevOps practice. You will work closely with the technical lead to identify and establish DevOps practices in the company.
You will help us build scalable, efficient cloud infrastructure. You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices.
ROLE and RESPONSIBILITIES:
• Understanding customer requirements and project KPIs
• Implementing various development, testing, automation tools, and IT infrastructure
• Planning the team structure, activities, and involvement in project management activities
• Managing stakeholders and external interfaces
• Setting up tools and required infrastructure
• Defining and setting development, test, release, update, and support processes for DevOps operation
• Reviewing, verifying, and validating the software code developed in the project
• Troubleshooting and fixing code bugs
• Monitoring processes across the entire lifecycle for adherence, and updating or creating processes to improve them and minimize waste
• Encouraging and building automated processes wherever possible
• Identifying and deploying cybersecurity measures by continuously performing vulnerability assessment and risk management
• Incident management and root cause analysis
• Coordination and communication within the team and with customers
• Selecting and deploying appropriate CI/CD tools
• Striving for continuous improvement and building continuous integration, continuous delivery, and continuous deployment pipelines (CI/CD pipeline)
• Mentoring and guiding the team members
• Monitoring and measuring customer experience and KPIs
• Managing periodic reporting on the progress to the management and the customer
Essential Skills and Experience
Technical Skills
• 3+ years of proven experience as a DevOps engineer
• A bachelor’s degree or higher qualification in computer science
• The ability to code and script in multiple languages and automation frameworks such as Python, C#, Java, Perl, and Ruby, and to work with SQL Server, NoSQL, and MySQL databases
• An understanding of security best practices and of automating security testing and updates in CI/CD (continuous integration, continuous deployment) pipelines
• The ability to deploy monitoring and logging infrastructure using appropriate tools
• Proficiency in container frameworks
• Mastery in the use of infrastructure automation toolsets like Terraform, Ansible, and command line interfaces for Microsoft Azure, Amazon AWS, and other cloud platforms
• Certification in Cloud Security
• An understanding of various operating systems
• A strong focus on automation and agile development
• Excellent communication and interpersonal skills
• An ability to work in a fast-paced environment and handle multiple projects simultaneously
OTHER INFORMATION
The DevOps Engineer will also be expected to demonstrate their commitment:
• to gedu’s values and regulations, including its equal opportunities policy.
• to gedu’s social, economic and environmental responsibilities, minimising environmental impact in the performance of the role and actively contributing to the delivery of gedu’s Environmental Policy.
• to their Health and Safety responsibilities to ensure their contribution to a safe and secure working environment for staff, students, and other visitors to the campus.
DevOps Engineer
The DevOps team is one of the core technology teams of Lumiq.ai and is responsible for managing network activities, automating Cloud setups and application deployments. The team also interacts with our customers to work out solutions. If you are someone who is always pondering how to make things better, how technologies can interact, how various tools, technologies, and concepts can help a customer or how you can use various technologies to improve user experience, then Lumiq is the place of opportunities.
Job Description
- Explore the newest innovations in scalable and distributed systems.
- Help design the project architecture, solutions to existing problems, and future improvements.
- Make the cloud infrastructure and services smart by implementing automation and trigger-based solutions.
- Interact with Data Engineers and Application Engineers to create continuous integration and deployment frameworks and pipelines.
- Play around with large clusters on different clouds to tune your jobs or to learn.
- Research new technologies, prove concepts, and plan how to integrate or update them.
- Take part in discussions of other projects to learn or to help.
Responsibilities
- 2+ years of experience as a DevOps Engineer.
- You understand networking, from physical networks to software-defined networking.
- You like containers and open source orchestration systems like Kubernetes and Mesos.
- You have experience securing systems by creating robust access policies and enforcing network restrictions.
- You know how applications work, which is very important for designing distributed systems.
- You have experience with open source projects and have discussed shortcomings or problems with the community on several occasions.
- You understand that provisioning a Virtual Machine is not DevOps.
- You know you are not a SysAdmin but a DevOps Engineer, the person who develops operations so that the system runs efficiently and scalably.
- You have exposure to, and have worked with, Private Cloud, Subnets, VPNs, Peering, and Load Balancers.
- You check logs before screaming about an error.
- Multiple screens make you more efficient.
- You are a doer who doesn’t say the word impossible.
- You understand the value of documenting your work.
- You understand the Big Data ecosystem and how you can leverage the cloud for it.
- You know these buddies - #airflow, #aws, #azure, #gcloud, #docker, #kubernetes, #mesos, #acs
- Must have a minimum of 3 years of experience in managing AWS resources and automating CI/CD pipelines.
- Strong scripting skills in PowerShell, Python or Bash, and the ability to build and administer CI/CD pipelines.
- Knowledge of infrastructure tools like CloudFormation, Terraform, Ansible.
- Experience with microservices and/or event-driven architecture.
- Experience using containerization technologies (Docker, ECS, Kubernetes, Mesos or Vagrant).
- Strong practical Windows and Linux system administration skills in the cloud.
- Understanding of DNS, NFS, TCP/IP and other protocols.
- Knowledge of secure SDLC, OWASP top 10 and CWE/SANS top 25.
- Deep understanding of WebSockets and how they work. Hands-on experience with ElastiCache, Redis, ECS or EKS. Installation, configuration and management of Apache or Nginx web servers and Apache Tomcat application servers, including configuring SSL certificates and setting up reverse proxies.
- Exposure to RDBMS (MySQL, SQL Server, Aurora, etc.) is a plus.
- Exposure to programming languages like JAVA, PHP, SQL is a plus.
- AWS Developer or AWS SysOps Administrator certification is a plus.
- AWS Solutions Architect certification is a plus.
- Experience building Blue/Green, Canary or other zero-downtime deployment strategies, and an advanced understanding of VPC, EC2, Route53, IAM, and Lambda, is a plus.
Exposure to development and implementation practices in a modern systems environment, together with exposure to working in a project team, particularly with reference to industry methodologies, e.g. Agile, continuous delivery, etc.
- At least 3-5 years of experience building and maintaining AWS infrastructure (VPC, EC2, Security Groups, IAM, ECS, CodeDeploy, CloudFront, S3)
- Strong understanding of how to secure AWS environments and meet compliance requirements
- Experience using DevOps methodology and Infrastructure as Code
- Automation / CI/CD tools – Bitbucket Pipelines, Jenkins
- Infrastructure as code – Terraform, CloudFormation, etc.
- Strong experience deploying and managing infrastructure with Terraform
- Automated provisioning and configuration management – Ansible, Chef, Puppet
- Experience with Docker, GitHub, Jenkins, ELK and deploying applications on AWS
- Improve CI/CD processes, support software builds and CI/CD of the development departments
- Develop, maintain, and optimize automated deployment code for development, test, staging and production environments





