
General Description:
Owns all technical aspects of software development for assigned applications
Participates in the design and development of systems & application programs
Functions as a senior member of an agile team and helps drive consistent development practices – tools, common components, and documentation
Required Skills:
In-depth experience configuring and administering EKS clusters in AWS.
In-depth experience configuring **Splunk SaaS** in AWS environments, especially in **EKS**.
In-depth understanding of OpenTelemetry and configuration of **OpenTelemetry Collectors**.
In-depth knowledge of observability concepts and strong troubleshooting experience.
Experience in implementing comprehensive monitoring and logging solutions in AWS using **CloudWatch**.
Experience in **Terraform** and Infrastructure as code.
Experience in **Helm**.
Strong scripting skills in Shell and/or Python.
Experience with large-scale distributed systems and architecture knowledge (Linux/UNIX and Windows operating systems, networking, storage) in a cloud computing or traditional IT infrastructure environment.
Must have a good understanding of cloud concepts (storage/compute/network).
Experience collaborating with several cross-functional teams to architect observability pipelines for various GCP services such as GKE, Cloud Run, and BigQuery.
Experience with Git and GitHub.
Experience with code build and deployment using GitHub Actions and Artifact Registry.
Proficient in developing and maintaining technical documentation, ADRs, and runbooks.


Job Details
- Job Title: Lead DevOps Engineer
- Industry: Consumer Internet, Technology & Travel and Tourism Platform
- Function: IT
- Experience Required: 7-10 years
- Employment Type: Full Time
- Job Location: Bengaluru
- CTC Range: Best in Industry
Criteria:
- Strong Lead DevOps / Infrastructure Engineer Profiles.
- Must have 7+ years of hands-on experience working as a DevOps / Infrastructure Engineer.
- Candidate’s current title must be Lead DevOps Engineer (or equivalent Lead role) in the current organization
- Must have minimum 2+ years of team management / technical leadership experience, including mentoring engineers, driving infrastructure decisions, or leading DevOps initiatives.
- Must have strong hands-on experience with Kubernetes (container orchestration) including deployment, scaling, and cluster management.
- Must have experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, Chef, or Puppet.
- Must have strong scripting and automation experience using Python, Go, Bash, or similar scripting languages.
- Must have working experience with distributed databases or data systems such as MongoDB, Redis, Cassandra, or Elasticsearch.
- Must have strong hands-on experience in Observability & Monitoring, CI/CD architecture, and Networking concepts in production environments.
- (Company) – Must be from B2C Product Companies only.
- (Education) – B.E/ B.Tech
Preferred
- Experience working in microservices architecture and event-driven systems.
- Exposure to cloud infrastructure, scalability, reliability, and cost optimization practices.
- (Skills) – Understanding of programming languages such as Go, Python, or Java.
- (Environment) – Experience working in high-growth startup or large-scale production environments.
Job Description
As a DevOps Engineer, you will build and operate infrastructure at scale, design and implement a variety of tools that enable product teams to build and deploy their services independently, improve observability across the board, and design for security, resiliency, availability, and stability. If the prospect of ensuring system reliability at scale and exploring cutting-edge technology to solve problems excites you, then this is your fit.
Job Responsibilities:
- Own end-to-end infrastructure from non-prod to prod environments, including self-managed DBs
- Codify our infrastructure
- Do what it takes to keep the uptime above 99.99%
- Understand the bigger picture and sail through the ambiguities
- Scale technology considering cost and observability and manage end-to-end processes
- Understand DevOps philosophy and evangelize the principles across the organization
- Strong communication and collaboration skills to break down the silos
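The 99.99% uptime target in the list above translates into a concrete downtime budget. A minimal sketch of that arithmetic follows, assuming a 30-day month and a 365-day year; the function name `allowed_downtime_minutes` is illustrative and not part of the posting.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float) -> float:
    """Return the minutes of downtime permitted over a period at a given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Assumed periods: a 30-day month and a 365-day year.
monthly = allowed_downtime_minutes(99.99, 30)
yearly = allowed_downtime_minutes(99.99, 365)
print(f"30-day budget: {monthly:.2f} min")
print(f"365-day budget: {yearly:.2f} min")
```

Under these assumptions the budget is roughly 4.32 minutes per 30 days and 52.56 minutes per year, which is why a target at this level usually implies automated recovery rather than manual intervention.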
Job Title : DevOps Engineer – Fintech (Product-Based)
Experience : 5+ Years
Location : Mumbai
Job Type : Full-Time | Product Company
Role Summary :
We are hiring a DevOps Engineer with strong product-based experience to manage infrastructure for a Fintech platform built on stateful microservices.
The role involves working across hybrid cloud + on-prem, with deep expertise in Kubernetes, Helm, GitOps, IaC, and Cloud Networking.
Mandatory Skills :
Product-based experience, deep Kubernetes (managed & self-managed), custom Helm Chart development, ArgoCD/FluxCD (GitOps), strong AWS/Azure cloud networking & security, IaC module development (Terraform/Pulumi/CloudFormation), experience with stateful microservices (DBs/queues/caches), multi-tenant deployments, HA/load balancing/SSL/TLS/cert management.
Key Responsibilities :
- Deploy and manage stateful microservices in production.
- Handle both managed & self-managed Kubernetes clusters.
- Develop and maintain custom Helm Charts.
- Implement GitOps pipelines using ArgoCD/FluxCD.
- Architect and operate secure infra on AWS/Azure (VPC, IAM, networking).
- Build reusable IaC modules using Terraform/CloudFormation/Pulumi.
- Design multi-tenant cluster deployments.
- Manage HA, load balancers, certificates, DNS, and networking.
Mandatory Skills :
- Product-based company experience.
- Strong Kubernetes (EKS/AKS/GKE + self-managed).
- Custom Helm Chart development.
- GitOps tools : ArgoCD/FluxCD.
- AWS/Azure cloud networking & security.
- IaC module development (Terraform/Pulumi/CloudFormation).
- Experience with stateful components (DBs, queues, caches).
- Understanding of multi-tenant deployments, HA, SSL/TLS, ingress, LB.
About MyOperator
MyOperator is a Business AI Operator, a category leader that unifies WhatsApp, Calls, and AI-powered chat & voice bots into one intelligent business communication platform. Unlike fragmented communication tools, MyOperator combines automation, intelligence, and workflow integration to help businesses run WhatsApp campaigns, manage calls, deploy AI chatbots, and track performance — all from a single, no-code platform. Trusted by 12,000+ brands including Amazon, Domino's, Apollo, and Razorpay, MyOperator enables faster responses, higher resolution rates, and scalable customer engagement — without fragmented tools or increased headcount.
Job Summary
We are looking for a skilled and motivated DevOps Engineer with 3+ years of hands-on experience in AWS cloud infrastructure, CI/CD automation, and Kubernetes-based deployments. The ideal candidate will have strong expertise in Infrastructure as Code, containerization, monitoring, and automation, and will play a key role in ensuring high availability, scalability, and security of production systems.
Key Responsibilities
- Design, deploy, manage, and maintain AWS cloud infrastructure, including EC2, RDS, OpenSearch, VPC, S3, ALB, API Gateway, Lambda, SNS, and SQS.
- Build, manage, and operate Kubernetes (EKS) clusters and containerized workloads.
- Containerize applications using Docker and manage deployments with Helm charts
- Develop and maintain CI/CD pipelines using Jenkins for automated build and deployment processes
- Provision and manage infrastructure using Terraform (Infrastructure as Code)
- Implement and manage monitoring, logging, and alerting solutions using Prometheus and Grafana
- Write and maintain Python scripts for automation, monitoring, and operational tasks
- Ensure high availability, scalability, performance, and cost optimization of cloud resources
- Implement and follow security best practices across AWS and Kubernetes environments
- Troubleshoot production issues, perform root cause analysis, and support incident resolution
- Collaborate closely with development and QA teams to streamline deployment and release processes
Required Skills & Qualifications
- 3+ years of hands-on experience as a DevOps Engineer or Cloud Engineer.
- Strong experience with AWS services, including:
  - EC2, RDS, OpenSearch, VPC, S3
  - Application Load Balancer (ALB), API Gateway, Lambda
  - SNS and SQS
- Hands-on experience with AWS EKS (Kubernetes)
- Strong knowledge of Docker and Helm charts
- Experience with Terraform for infrastructure provisioning and management
- Solid experience building and managing CI/CD pipelines using Jenkins
- Practical experience with Prometheus and Grafana for monitoring and alerting
- Proficiency in Python scripting for automation and operational tasks
- Good understanding of Linux systems, networking concepts, and cloud security
- Strong problem-solving and troubleshooting skills
Good to Have (Preferred Skills)
- Exposure to GitOps practices
- Experience managing multi-environment setups (Dev, QA, UAT, Production)
- Knowledge of cloud cost optimization techniques
- Understanding of Kubernetes security best practices
- Experience with log aggregation tools (e.g., ELK/OpenSearch stack)
Language Preference
- Fluency in English is mandatory.
- Fluency in Hindi is preferred.
🚀 Hiring: Azure DevOps Engineer – Immediate Joiners Only! 🚀
📍 Location: Pune (Hybrid)
💼 Experience: 5+ Years
🕒 Mode of Work: Hybrid
Are you a proactive and skilled Azure DevOps Engineer looking for your next challenge? We are hiring immediate joiners to join our dynamic team! If you are passionate about CI/CD, cloud automation, and SRE best practices, we want to hear from you.
🔹 Key Skills Required:
✅ Cloud Expertise: Proficiency in any cloud (Azure preferred)
✅ CI/CD Pipelines: Hands-on experience in designing and managing pipelines
✅ Containers & IaC: Strong knowledge of Docker, Terraform, Kubernetes
✅ Incident Management: Quick issue resolution and RCA
✅ SRE & Observability: Experience with SLI/SLO/SLA, monitoring, tracing, logging
✅ Programming: Proficiency in Python, Golang
✅ Performance Optimization: Identifying and resolving system bottlenecks
Role: Full-Time, Long-Term
Required: Docker, GCP, CI/CD
Preferred: Experience with ML pipelines
OVERVIEW
We are seeking a DevOps engineer to join as a core member of our technical team. This is a long-term position for someone who wants to own infrastructure and deployment for a production machine learning system. You will ensure our prediction pipeline runs reliably, deploys smoothly, and scales as needed.
The ideal candidate thinks about failure modes obsessively, automates everything possible, and builds systems that run without constant attention.
CORE TECHNICAL REQUIREMENTS
Docker (Required): Deep experience with containerization. Efficient Dockerfiles, layer caching, multi-stage builds, debugging container issues. Experience with Docker Compose for local development.
Google Cloud Platform (Required): Strong GCP experience: Cloud Run for serverless containers, Compute Engine for VMs, Artifact Registry for images, Cloud Storage, IAM. You can navigate the console but prefer scripting everything.
CI/CD (Required): Build and maintain deployment pipelines. GitHub Actions required. You automate testing, building, pushing, and deploying. You understand the difference between continuous integration and continuous deployment.
Linux Administration (Required): Comfortable on the command line. SSH, diagnose problems, manage services, read logs, fix things. Bash scripting is second nature.
PostgreSQL (Required): Database administration basics—backups, monitoring, connection management, basic performance tuning. Not a DBA, but comfortable keeping a production database healthy.
Infrastructure as Code (Preferred): Terraform, Pulumi, or similar. Infrastructure should be versioned, reviewed, and reproducible—not clicked together in a console.
WHAT YOU WILL OWN
Deployment Pipeline: Maintaining and improving deployment scripts and CI/CD workflows. Code moves from commit to production reliably with appropriate testing gates.
Cloud Run Services: Managing deployments for model fitting, data cleansing, and signal discovery services. Monitor health, optimize cold starts, handle scaling.
VM Infrastructure: PostgreSQL and Streamlit on GCP VMs. Instance management, updates, backups, security.
Container Registry: Managing images in GitHub Container Registry and Google Artifact Registry. Cleanup policies, versioning, access control.
Monitoring and Alerting: Building observability. Logging, metrics, health checks, alerting. Know when things break before users tell us.
Environment Management: Configuration across local and production. Secrets management. Environment parity where it matters.
WHAT SUCCESS LOOKS LIKE
Deployments are boring—no drama, no surprises. Systems recover automatically from transient failures. Engineers deploy with confidence. Infrastructure changes are versioned and reproducible. Costs are reasonable and resources scale appropriately.
ENGINEERING STANDARDS
Automation First: If you do something twice, automate it. Manual processes are bugs waiting to happen.
Documentation: Runbooks, architecture diagrams, deployment guides. The next person can understand and operate the system.
Security Mindset: Secrets never in code. Least-privilege access. You think about attack surfaces.
Reliability Focus: Design for failure. Backups are tested. Recovery procedures exist and work.
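As one illustration of the "secrets never in code" standard above, here is a minimal sketch that reads a credential from the environment and fails fast when it is missing. The helper name `require_secret` and the variable name `DB_PASSWORD` are hypothetical and do not come from the posting.

```python
import os


def require_secret(name: str) -> str:
    """Fetch a secret from the environment; raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required secret: {name}")
    return value


# Hypothetical usage: the deploy platform injects DB_PASSWORD at runtime,
# so the value never appears in the repository.
# password = require_secret("DB_PASSWORD")
```

Failing at startup keeps a misconfigured deployment from running with an empty credential and surfacing the problem later as an unrelated connection error.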
CURRENT ENVIRONMENT
GCP (Cloud Run, Compute Engine, Artifact Registry, Cloud Storage), Docker, Docker Compose, GitHub Actions, PostgreSQL 16, Bash deployment scripts with Python wrapper.
WHAT WE ARE LOOKING FOR
Ownership Mentality: You see a problem, you fix it. You do not wait for assignment.
Calm Under Pressure: When production breaks, you diagnose methodically.
Communication: You explain infrastructure decisions to non-infrastructure people. You document what you build.
Long-Term Thinking: You build systems maintained for years, not quick fixes creating tech debt.
EDUCATION
University degree in Computer Science, Engineering, or related field preferred. Equivalent demonstrated expertise also considered.
TO APPLY
Include: (1) CV/resume, (2) Brief description of infrastructure you built or maintained, (3) Links to relevant work if available, (4) Availability and timezone.
Job Responsibilities:
- Work on and deploy updates and fixes
- Provide Level 2 technical support
- Support implementation of fully automated CI/CD pipelines as per dev requirements
- Follow the escalation process through issue completion, including providing documentation after resolution
- Follow regular operations procedures and complete all assigned tasks during the shift
- Assist in root cause analysis of production issues and help write a report that includes details about the failure, the relevant log entries, and the likely root cause
- Set up CI/CD frameworks (Jenkins / Azure DevOps Server), containerization using Docker, etc.
- Implement continuous testing, code quality, and security using DevOps tooling
- Build a knowledge base by creating and updating documentation for support
Skills Required:
DevOps, Linux, AWS, Ansible, Jenkins, Git, Terraform, CI, CD, CloudFormation, TypeScript
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with at least 7 years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies (AWS preferred)
*Bread puns are encouraged but not required

Roles and Responsibilities:
• Gather and analyse cloud infrastructure requirements
• Automate system tasks and infrastructure using a scripting language (Shell/Python/Ruby preferred), with configuration management tools (Ansible/Puppet/Chef), service registry and discovery tools (Consul, Vault, etc.), infrastructure orchestration tools (Terraform, CloudFormation), and automated imaging tools (Packer)
• Support existing infrastructure, analyse problem areas and come up with solutions
• An eye for monitoring – the candidate should be able to look at complex infrastructure and figure out what to monitor and how
• Work with the Engineering team to help with infrastructure and network automation needs
• Deploy infrastructure as code and automate as much as possible
• Manage a team of DevOps engineers
Desired Profile:
• Understanding of provisioning of Bare Metal and Virtual Machines
• Working knowledge of configuration management tools such as Ansible/Chef/Puppet, and Redfish
• Experience in scripting languages such as Ruby/Python/Shell
• Working knowledge of IP networking, VPNs, DNS, load balancing, firewalling & IPS concepts
• Strong Linux/Unix administration skills
• Self-starter who can implement with minimal guidance
• Hands-on experience setting up CI/CD from scratch in Jenkins
• Experience managing K8s infrastructure
What you will do:
- Handling Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups and monitoring
- Logging, metrics and alerting management
- Creating Dockerfiles
- Performing root cause analysis for production errors
What you need to have:
- 12+ years of experience in Software Development/ QA/ Software Deployment with 5+ years of experience in managing high performing teams
- Proficiency in VMware, AWS & cloud applications development, deployment
- Good knowledge in Java, Node.js
- Experience working with RESTful APIs, JSON, etc.
- Experience with unit/functional automation is a plus
- Experience with MySQL, MongoDB, Redis, RabbitMQ
- Proficiency in Jenkins, Ansible, Terraform/Chef/Ant
- Proficiency in Linux-based operating systems
- Proficiency in container infrastructure such as Docker and Kubernetes
- Strong problem solving and analytical skills
- Good written and oral communication skills
- Sound understanding in areas of Computer Science such as algorithms, data structures, object oriented design, databases
- Proficiency in monitoring and observability
We are looking for an experienced DevOps (Development and Operations) professional to join our growing organization. In this position, you will be responsible for finding and reporting bugs in web and mobile apps and assisting senior DevOps engineers in managing infrastructure projects and processes. Keen attention to detail, problem-solving abilities, and a solid knowledge base are essential.
As a DevOps engineer, you will work in a Kubernetes-based microservices environment.
Experience in Microsoft Azure cloud and Kubernetes is preferred but not mandatory.
Ultimately, you will ensure that our products, applications and systems work correctly.
Responsibilities:
- Detect and track software defects and inconsistencies
- Apply quality engineering principles throughout the Agile product lifecycle
- Handle code deployments in all environments
- Monitor metrics and develop ways to improve
- Consult with peers for feedback during testing stages
- Build, maintain, and monitor configuration standards
- Maintain day-to-day management and administration of projects
- Manage CI and CD tools with team
- Follow all best practices and procedures as established by the company
- Provide support and documentation
Required Technical and Professional Expertise
- Minimum 2 years of DevOps experience
- Have experience in SaaS infrastructure development and Web Apps
- Experience in delivering microservices at scale; designing microservices solutions
- Proven Cloud experience/delivery of applications on Azure
- Proficient in configuration management tools such as Ansible, or any of Terraform, Puppet, Chef, Salt, etc.
- Hands-on experience in Networking/network configuration, Application performance monitoring, Container performance, and security.
- Understanding of Kubernetes, Python along with scripting languages like bash/shell
- Good to have experience in Linux internals, Linux packaging, Release Engineering (Branching, versioning, tagging), Artifact repository, Artifactory, Nexus, and CI/CD tooling (Concourse CI, Travis, Jenkins)
- Must be a proactive person
- You love collaborative environments that use agile methodologies to encourage creative design thinking and find innovative ways to develop with cutting edge technologies
- An ambitious individual who can work under their own direction towards agreed targets/goals and with a creative approach to work.
- An intuitive individual with an ability to manage change and proven time management
- Proven interpersonal skills while contributing to team effort by accomplishing related results as needed.









