4+ Reliability engineering Jobs in Delhi, NCR and Gurgaon | Reliability engineering Job openings in Delhi, NCR and Gurgaon
Apply to 4+ Reliability engineering Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest Reliability engineering Job opportunities across top companies like Google, Amazon & Adobe.
About MyOperator
MyOperator is a Business AI Operator, a category leader that unifies WhatsApp, Calls, and AI-powered chat & voice bots into one intelligent business communication platform. Unlike fragmented communication tools, MyOperator combines automation, intelligence, and workflow integration to help businesses run WhatsApp campaigns, manage calls, deploy AI chatbots, and track performance — all from a single, no-code platform. Trusted by 12,000+ brands including Amazon, Domino's, Apollo, and Razorpay, MyOperator enables faster responses, higher resolution rates, and scalable customer engagement — without fragmented tools or increased headcount.
About the Role
We are seeking a Site Reliability Engineer (SRE) with a minimum of 2 years of experience who is passionate about monitoring, observability, and ensuring system reliability. The ideal candidate will have strong expertise in Grafana, Prometheus, Opensearch, and AWS CloudWatch, with the ability to design insightful dashboards and proactively optimize system performance.
Key Responsibilities
- Design, develop, and maintain monitoring and alerting systems using Grafana, Prometheus, and AWS CloudWatch.
- Create and optimize dashboards to provide actionable insights into system and application performance.
- Collaborate with development and operations teams to ensure high availability and reliability of services.
- Proactively identify performance bottlenecks and drive improvements.
- Continuously explore and adopt new monitoring/observability tools and best practices.
Required Skills & Qualifications
- Minimum 2 years of experience in SRE, DevOps, or related roles.
- Hands-on expertise in Grafana, Prometheus, and AWS CloudWatch.
- Proven experience in dashboard creation, visualization, and alerting setup.
- Strong understanding of system monitoring, logging, and metrics collection.
- Excellent problem-solving and troubleshooting skills.
- Quick learner with a proactive attitude and adaptability to new technologies.
Good to Have (Optional)
- Experience with AWS services beyond CloudWatch.
- Familiarity with containerization (Docker, Kubernetes) and CI/CD pipelines.
- Scripting knowledge (Python, Bash, or similar).
Why Join Us
At MyOperator, you will play a key role in ensuring the reliability, scalability, and performance of systems that power AI-driven business communication for leading global brands. You’ll work in a fast-paced, innovation-driven environment where your expertise will directly impact thousands of businesses worldwide.
We are hiring a Site Reliability Engineer (SRE) to join our high-performance engineering team. In this role, you'll be responsible for driving reliability, performance, scalability, and security across cloud-native systems while bridging the gap between development and operations.
Key Responsibilities
- Design and implement scalable, resilient infrastructure on AWS
- Take ownership of the SRE function – availability, latency, performance, monitoring, incident response, and capacity planning
- Partner with product and engineering teams to improve system reliability, observability, and release velocity
- Set up, maintain, and enhance CI/CD pipelines using Jenkins, GitHub Actions, or AWS CodePipeline
- Conduct load and stress testing, identify performance bottlenecks, and implement optimization strategies
Required Skills & Qualifications
- Proven hands-on experience in cloud infrastructure design (AWS strongly preferred)
- Strong background in DevOps and SRE principles
- Proficiency with performance testing tools like JMeter, Gatling, k6, or Locust
- Deep understanding of cloud security and best practices for reliability engineering
- AWS Solution Architect Certification – Associate or Professional (preferred)
- Solid problem-solving skills and a proactive approach to systems improvement
Why Join Us?
- Work with cutting-edge technologies in a cloud-native, fast-paced environment
- Collaborate with cross-functional teams driving meaningful impact
- Hybrid work culture with flexibility and autonomy
- Open, inclusive work environment focused on innovation and excellence

My client is a leader in the Ed Tech space
Job Title: Sr. Reliability Engineer/ Engineering Manager
Location: Powai (Mumbai)/ Bangaluru/ Delhi NCR
- System maintenance and administration of Applications and Software
- Integration with different systems and apps
- Must have knowledge on Microservices
- Should be proficient in scripting language Python, Good to have knowledge on Java
- Excellent over Root Cause Analysis
- Should be a technical as well as process-oriented candidate
- Excellent Debugging and Monitoring skills
- Responsible for daily trouble ticket resolution, client interaction and customer support via email and video calls
- Responsible for the setup for the Continuous Integration build and deployment for DEV and UAT environments
- Responsible for operational support and problem resolution for application users and internal operations team
- Identifying, tracking, managing and resolving project issues effectively and efficiently
- Supporting projects during the normalization phase after Go-Live and ensuring the smooth transitions to operations
- Review and implement the processes related to IT Support and Operations before handing over to Support Services
- Strong ability to work independently on complex issues
- Collaborate efficiently with internal experts to resolve customer issues quickly
- Both Proactive & reactive in work approach
- Should be willing to work in 24/7 work environment
Ideal candidate has:
- 6-10 years
- Product Background
● Research, propose and evaluate with a 5-year vision, the architecture, design, technologies,
processes and profiles related to Telco Cloud.
● Participate in the creation of a realistic technical-strategic roadmap of the network to transform
it to Telco Cloud and be prepared for 5G.
● Using your deep technical expertise, you will provide detailed feedback to Product Management
and Engineering, as well as contribute directly to the platform code base to enhance both the
Customer experience of the service, as well as the SRE quality of life.
● The individual must be aware of trends in network infrastructure as well as within the network
engineering and OSS community. What technologies are being developed or launched?
● The individual should stay current with infrastructure trends in the telco network cloud domain.
● Be responsible for the Engineering of Lab and Production Telco Cloud environments, including
patches, upgrades, and reliability and performance improvements.
Required Minimum Qualifications: (Education and Technical Skills/Knowledge)
● Software Engineering degree, MS in Computer Science or equivalent experience
● Years of experiences as an SRE, DevOps, Development and/or Support related role
● 0-5 years of professional experience for a junior position
● At least 8 years of professional experience for a senior position
● Unix server administration and tuning : Linux / RedHat / CentOS / Ubuntu
● You have deep knowledge in Networking Layers 1-4
● Cloud / Virtualization (at least two): Helm, Docker, Kubernetes, AWS, Azure, Google Cloud,
OpenStack, OpenShift, VMware vSphere / Tanzu
● You have in-depth knowledge of cloud storage solutions on top of AWS, GCP, Azure and/or
on-prem private cloud, such as Ceph, CephFS, GlusterFS
● DevOps: Jenkins, Git, Azure DevOps, Ansible, Terraform
● Backend Knowledge Bash, Python, Go (other knowledge of Scripting Language is a plus).
● PaaS Level solutions such as Keycloak for IAM, Prometheus, Grafana, ELK, DBaaS (such as MySQL,
Cassandra)
About the Organisation:
The team at Coredge.io is a combination of experienced and young professionals alike having
many years of experience in working with Edge computing, Telecom application development
and Kubernetes. The company has continuously collaborated with the open source community,
universities and major industry players in furthering its goal of providing the industry with an
indispensable tool to offer improved services to its customers. Coredge.io has a global market
presence with its offices in US and New Delhi, India.


