Role Overview
We are looking for a hands-on DevOps Engineer who will own infrastructure, deployment, reliability, and cloud cost optimisation. You will work closely with backend, AI, and product teams to ensure the platform is secure, scalable, and always available.
This is a high-ownership role with real impact on uptime, performance, and developer velocity.
Key Responsibilities
Infrastructure & Cloud Management
- Design, deploy, and manage cloud infrastructure (GCP preferred; AWS acceptable)
- Manage Compute Engine, Cloud Run, Kubernetes (GKE), Cloud SQL, storage, and networking
- Ensure high availability, fault tolerance, and scalability
CI/CD & Deployment
- Build and maintain CI/CD pipelines for backend and AI services
- Automate deployments, rollbacks, and environment management (dev, staging, prod)
- Improve release reliability and deployment speed
Monitoring, Reliability & Security
- Set up monitoring, alerting, and logging (uptime, CPU, memory, errors, latency)
- Proactively identify and resolve performance bottlenecks and incidents
- Implement security best practices: IAM, secrets management, backups, and access controls
Cost Optimisation & Performance
- Monitor and optimise cloud costs (compute, databases, storage)
- Implement autoscaling, right-sizing, and resource optimisation
- Work with engineering teams to balance performance with cost efficiency
Required Qualifications & Skills
- 3–6 years of hands-on DevOps / Cloud Engineering experience
- Strong experience with GCP (or AWS with willingness to transition)
- Experience with Docker, Kubernetes, and containerised workloads
- Experience with CI/CD tools (GitHub Actions, GitLab CI, or similar)
- Ability to troubleshoot production issues under pressure
- Experience with AI/ML workloads and GPU-based deployments

About wwwwebnyayai
About
Similar jobs
Job Title : Senior DevOps Engineer (Only Mumbai Candidates)
Experience : 5+ Years
Location : Mumbai (On-site)
Notice Period : Immediate to 15 Days
Interview Process : 1 Internal Round + 1 Client Round
Mandatory Skills :
Multi-Cloud (AWS/GCP/Azure – any two), Kubernetes, Terraform, Helm (writing Helm Charts), CI/CD (GitLab CI/Jenkins/GitHub Actions), GitOps (ArgoCD/FluxCD), Multi-tenant deployments, Stateful microservices on Kubernetes, Enterprise Linux.
Role Overview :
We are looking for a Senior DevOps Engineer to design, build, and manage scalable cloud infrastructure and DevOps pipelines for product-based platforms.
The ideal candidate should have strong experience with Kubernetes, Terraform, Helm Charts, CI/CD, and GitOps practices.
Key Responsibilities :
- Design and manage scalable cloud infrastructure across AWS/GCP/Azure.
- Deploy and manage microservices on Kubernetes clusters.
- Build and maintain Infrastructure as Code using Terraform and Helm.
- Implement CI/CD pipelines using GitLab CI, Jenkins, or GitHub Actions.
- Implement GitOps workflows using ArgoCD or FluxCD.
- Ensure secure, scalable, and reliable DevOps architecture.
- Implement monitoring and logging using Prometheus, Grafana, or ELK.
Good to Have :
- Packer, OpenShift/Rancher/K3s, On-prem deployments, PaaS experience, scripting (Bash/Python), Terraform modules.
Key Responsibilities:
- Design, implement, and maintain scalable, secure, and cost-effective infrastructure on AWS and Azure
- Set up and manage CI/CD pipelines for smooth code integration and delivery using tools like GitHub Actions, Bitbucket Runners, AWS Code build/deploy, Azure DevOps, etc.
- Containerize applications using Docker and manage orchestration with Kubernetes, ECS, Fargate, AWS EKS, Azure AKS.
- Manage and monitor production deployments to ensure high availability and performance
- Implement and manage CDN solutions using AWS CloudFront and Azure Front Door for optimal content delivery and latency reduction
- Define and apply caching strategies at application, CDN, and reverse proxy layers for performance and scalability
- Set up and manage reverse proxies and Cloudflare WAF to ensure application security and performance
- Implement infrastructure as code (IaC) using Terraform, CloudFormation, or ARM templates
- Administer and optimize databases (RDS, PostgreSQL, MySQL, etc.) including backups, scaling, and monitoring
- Configure and maintain VPCs, subnets, routing, VPNs, and security groups for secure and isolated network setups
- Implement monitoring, logging, and alerting using tools like CloudWatch, Grafana, ELK, or Azure Monitor
- Collaborate with development and QA teams to align infrastructure with application needs
- Troubleshoot infrastructure and deployment issues efficiently and proactively
- Ensure cloud cost optimization and usage tracking
Required Skills & Experience:
- 3-4 years of hands-on experience in a DevOps
- Strong expertise with both AWS and Azure cloud platforms
- Proficient in Git, branching strategies, and pull request workflows
- Deep understanding of CI/CD concepts and experience with pipeline tools
- Proficiency in Docker, container orchestration (Kubernetes, ECS/EKS/AKS)
- Good knowledge of relational databases and experience in managing DB backups, performance, and migrations
- Experience with networking concepts including VPC, subnets, firewalls, VPNs, etc.
- Experience with Infrastructure as Code tools (Terraform preferred)
- Strong working knowledge of CDN technologies: AWS CloudFront and Azure Front Door
- Understanding of caching strategies: edge caching, browser caching, API caching, and reverse proxy-level caching
- Experience with Cloudflare WAF, reverse proxy setups, SSL termination, and rate-limiting
- Familiarity with Linux system administration, scripting (Bash, Python), and automation tools
- Working knowledge of monitoring and logging tools
- Strong troubleshooting and problem-solving skills
Good to Have (Bonus Points):
- Experience with serverless architecture (e.g., AWS Lambda, Azure Functions)
- Exposure to cost monitoring tools like CloudHealth, Azure Cost Management
- Experience with compliance/security best practices (SOC2, ISO, etc.)
- Familiarity with Service Mesh (Istio, Linkerd) and API gateways
- Knowledge of Secrets Management tools (e.g., HashiCorp Vault, AWS Secrets Manager)
Expert troubleshooting skills.
Expertise in designing highly secure cloud services and cloud infrastructure using AWS
(EC2, RDS, S3, ECS, Route53)
Experience with DevOps tools including Docker, Ansible, Terraform.
• Experience with monitoring tools such as DataDog, Splunk.
Experience building and maintaining large scale infrastructure in AWS including
experience leveraging one or more coding languages for automation.
Experience providing 24X7 on call production support.
Understanding of best practices, industry standards and repeatable, supportable
processes.
Knowledge and working experience of container-based deployments such as Docker,
Terraform, AWS ECS.
of TCP/IP, DNS, Certs & Networking Concepts.
Knowledge and working experience of the CI/CD development pipeline and experience
of the CI/CD maturity model. (Jenkins)
Knowledge and working experience
Strong core Linux OS skills, shell scripting, python scripting.
Working experience of modern engineering operations duties, including providing the
necessary tools and infrastructure to support high performance Dev and QA teams.
Database, MySQL administration skills is a plus.
Prior work in high load and high-traffic infrastructure is a plus.
Clear vision of and commitment to providing outstanding customer service.
Managing cloud-based serverless infrastructure on AWS, GCP(firebase) with IaC
(Terraform, CloudFormation etc.,)
Deploying and maintaining products, services, and network components with a focus
on security, reliability, and zero downtime
Automating and streamlining existing processes to aid the development team
Working with the development team to create ephemeral environments, simplifying
the development lifecycle
Driving forward our blockchain infrastructure by creating and managing validators for
a wide variety of new and existing blockchains
Requirements:
1-3+ years in a SRE / DevOps / DevSecOps or Infrastructure Engineering role
Strong working knowledge of Amazon Web Services (AWS) or GCP or similar cloud
ecosystem
Experience working with declarative Infrastructure-as-Code frameworks(Terraform,
CloudFormation)
Experience with containerization technologies and tools (Docker, Kubernetes), CI/CD
pipelines and Linux/Unix administration
Bonus points - if you know more about crypto, staking, defi, proof-of-stake,
validators, delegations
Benefits:
Competitive CTC on par with market along with ESOPs/Tokens
What we look for:
As a DevOps Developer, you will contribute to a thriving and growing AIGovernance Engineering team. You will work in a Kubernetes-based microservices environment to support our bleeding-edge cloud services. This will include custom solutions, as well as open source DevOps tools (build and deploy automation, monitoring and data gathering for our software delivery pipeline). You will also be contributing to our continuous improvement and continuous delivery while increasing maturity of DevOps and agile adoption practices.
Responsibilities:
- Ability to deploy software using orchestrators /scripts/Automation on Hybrid and Public clouds like AWS
- Ability to write shell/python/ or any unix scripts
- Working Knowledge on Docker & Kubernetes
- Ability to create pipelines using Jenkins or any CI/CD tool and GitOps tool like ArgoCD
- Working knowledge of Git as a source control system and defect tracking system
- Ability to debug and troubleshoot deployment issues
- Ability to use tools for faster resolution of issues
- Excellent communication and soft skills
- Passionate and ability work and deliver in a multi-team environment
- Good team player
- Flexible and quick learner
- Ability to write docker files, Kubernetes yaml files / Helm charts
- Experience with monitoring tools like Nagios, Prometheus and visualisation tools such as Grafana.
- Ability to write Ansible, terraform scripts
- Linux System experience and Administration
- Effective cross-functional leadership skills: working with engineering and operational teams to ensure systems are secure, scalable, and reliable.
- Ability to review deployment and operational environments, i.e., execute initiatives to reduce failure, troubleshoot issues across the entire infrastructure stack, expand monitoring capabilities, and manage technical operations.
The DevOps Engineer's core responsibilities include automated configuration and management
of infrastructure, continuous integration and delivery of distributed systems at scale in a Hybrid
environment.
Must-Have:
● You have 4-10 years of experience in DevOps
● You have experience in managing IT infrastructure at scale
● You have experience in automation of deployment of distributed systems and in
infrastructure provisioning at scale.
● You have in-depth hands-on experience on Linux and Linux-based systems, Linux
scripting
● You have experience in Server hardware, Networking, firewalls
● You have experience in source code management, configuration management,
continuous integration, continuous testing, continuous monitoring
● You have experience with CI/CD and related tools
* You have experience with Monitoring tools like ELK, Grafana, Prometheus
● You have experience with containerization, container orchestration, management
● Have a penchant for solving complex and interesting problems.
● Worked in startup-like environments with high levels of ownership and commitment.
● BTech, MTech or Ph.D. in Computer Science or related Technical Discipline
We are looking for a DevOps Engineer for managing the interchange of data between the server and the users. Your primary responsibility will be the development of all server-side logic, definition, and maintenance of the central database, and ensuring high performance and responsiveness to request from the frontend. You will also be responsible for integrating the front-end elements built by your co-workers into the application. Therefore, a basic understanding of frontend technologies is necessary as well.
What we are looking for
- Must have strong knowledge of Kubernetes and Helm3
- Should have previous experience in Dockerizing the applications.
- Should be able to automate manual tasks using Shell or Python
- Should have good working knowledge on AWS and GCP clouds
- Should have previous experience working on Bitbucket, Github, or any other VCS.
- Must be able to write Jenkins Pipelines and have working knowledge on GitOps and ArgoCD.
- Have hands-on experience in Proactive monitoring using tools like NewRelic, Prometheus, Grafana, Fluentbit, etc.
- Should have a good understanding of ELK Stack.
- Exposure on Jira, confluence, and Sprints.
What you will do:
- Mentor junior Devops engineers and improve the team’s bar
- Primary owner of tech best practices, tech processes, DevOps initiatives, and timelines
- Oversight of all server environments, from Dev through Production.
- Responsible for the automation and configuration management
- Provides stable environments for quality delivery
- Assist with day-to-day issue management.
- Take lead in containerising microservices
- Develop deployment strategies that allow DevOps engineers to successfully deploy code in any environment.
- Enables the automation of CI/CD
- Implement dashboard to monitors various
- 1-3 years of experience in DevOps
- Experience in setting up front end best practices
- Working in high growth startups
- Ownership and Be Proactive.
- Mentorship & upskilling mindset.
- systems and applications
what you’ll get- Health Benefits
- Innovation-driven culture
- Smart and fun team to work with
- Friends for life
- Hands on experience in AWS provisioning of AWS services like EC2, S3,EBS, AMI, VPC, ELB, RDS, Auto scaling groups, Cloud Formation.
- Good experience on Build and release process and extensively involved in the CICD using
Jenkins
- Experienced on configuration management tools like Ansible.
- Designing, implementing and supporting fully automated Jenkins CI/CD
- Extensively worked on Jenkins for continuous Integration and for end to end Automation for all Builds and Deployments.
- Proficient with Docker based container deployments to create shelf environments for dev teams and containerization of environment delivery for releases.
- Experience working on Docker hub, creating Docker images and handling multiple images primarily for middleware installations and domain configuration.
- Good knowledge in version control system in Git and GitHub.
- Good experience in build tools
- Implemented CI/CD pipeline using Jenkins, Ansible, Docker, Kubernetes ,YAML and Manifest
We (the Software Engineer team) are looking for a motivated, experienced person with a data-driven approach to join our Distribution Team in Bangalore to help design, execute and improve our test sets and infrastructure for producing high-quality Hadoop software.
A Day in the life
You will be part of a team that makes sure our releases are predictable and deliver high value to the customer. This team is responsible for automating and maintaining our test harness, and making test results reliable and repeatable.
You will:
-
work on making our distributed software stack more resilient to high-scale endurance runs and customer simulations
-
provide valuable fixes to our product development teams to the issues you’ve found during exhaustive test runs
-
work with product and field teams to make sure our customer simulations match the expectations and can provide valuable feedback to our customers
-
work with amazing people - We are a fun & smart team, including many of the top luminaries in Hadoop and related open source communities. We frequently interact with the research community, collaborate with engineers at other top companies & host cutting edge researchers for tech talks.
-
do innovative work - Cloudera pushes the frontier of big data & distributed computing, as our track record shows. We work on high-profile open source projects, interacting daily with engineers at other exciting companies, speaking at meet-ups, etc.
-
be a part of a great culture - Transparent and open meritocracy. Everybody is always thinking of better ways to do things, and coming up with ideas that make a difference. We build our culture to be the best workplace in our careers.
You have:
-
strong knowledge in at least 1 of the following languages: Java / Python / Scala / C++ / C#
-
hands-on experience with at least 1 of the following configuration management tools: Ansible, Chef, Puppet, Salt
-
confidence with Linux environments
-
ability to identify critical weak spots in distributed software systems
-
experience in developing automated test cases and test plans
-
ability to deal with distributed systems
-
solid interpersonal skills conducive to a distributed environment
-
ability to work independently on multiple tasks
-
self-driven & motivated, with a strong work ethic and a passion for problem solving
-
innovate and automate and break the code
The right person in this role has an opportunity to make a huge impact at Cloudera and add value to our future decisions. If this position has piqued your interest and you have what we described - we invite you to apply! An adventure in data awaits.
Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
- Improvement of monitoring systems
- Creation and support of development automation processes (CI / CD)
- Participation in infrastructure development
- Detection of problems in architecture and proposing of solutions for solving them
- Creation of tasks for system improvements for system scalability, performance and monitoring
- Analysis of product requirements in the aspect of devops
- Managing a team of DevOps, control of task deliveries
- Incident analysis and fixing
Technology stack
Linux, Bash, Salt/Ansible, LXC, libvirt, IPsec, VXLAN, Open vSwitch, OpenVPN, OSPF, BIRD, Cisco NX-OS, Multicast, PIM, LVM, software RAID, LUKS, PostgreSQL, nginx, haproxy, Prometheus, Grafana, Zabbix, GitLab, Capistrano
Skills and Experience
- Understanding of the distributed systems principles
- Understanding of principles for building a resistant network infrastructure
- Experience of Ubuntu Linux administration (Debian-like will be a plus)
- Strong knowledge of Bash
- Experience of working with LXC-containers
- Understanding and experience with infrastructure as a code approach
- Experience of development idempotent Ansible roles
- Experience with relational databases (PostgeSQL), ability to create simple SQL queries
- Experience with git
- Experience with monitoring and metric collect systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
Preferred experience
- Experience of working with highload zero-downtown environments
- Experience of coding on Python
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge and experience of working with network equipment Cisco
- Experience of working with Cisco NX-OS
- Knowledge of principles of multicast protocols IGMP, PIM
- Experience of setting multicast on Cisco equipment
- Experience of working with Solarflare Onload
- Experience administering Atlassian products







