
We are seeking an experienced Lead DevOps Engineer with deep expertise in Kubernetes infrastructure design and implementation. This role requires someone who can architect, build, and manage enterprise-grade Kubernetes clusters from the ground up. You’ll lead modernization initiatives, shape infrastructure strategy, and work with cutting-edge cloud-native technologies.
🚀 Key Responsibilities
Infrastructure Design & Implementation
- Architect and design enterprise-grade Kubernetes clusters across AWS, Azure, and GCP.
- Build production-ready Kubernetes infrastructure with HA, scalability, and security best practices.
- Implement Infrastructure as Code with Terraform, Helm, and GitOps workflows.
- Set up monitoring, logging, and observability for Kubernetes workloads.
- Design and execute backup and disaster recovery strategies for containerized applications.
Leadership & Team Management
- Lead a team of 3–4 DevOps engineers, providing technical mentorship.
- Drive best practices in containerization, orchestration, and cloud-native development.
- Collaborate with development teams to optimize deployment strategies.
- Conduct code reviews and maintain infrastructure quality standards.
- Build knowledge-sharing culture with documentation and training.
Operational Excellence
- Manage and scale CI/CD pipelines integrated with Kubernetes.
- Implement security policies (RBAC, network policies, container scanning).
- Optimize cluster performance and cost-efficiency.
- Automate operations to minimize manual interventions.
- Ensure 99.9% uptime for production workloads.
Strategic Planning
- Define the infrastructure roadmap aligned with business needs.
- Evaluate and adopt new cloud-native technologies.
- Perform capacity planning and cloud cost optimization.
- Drive risk assessment and mitigation strategies.
🛠 Must-Have Technical Skills
Kubernetes Expertise
- 6+ years of hands-on Kubernetes experience in production.
- Deep knowledge of Kubernetes architecture (etcd, API server, scheduler, kubelet).
- Advanced Kubernetes networking (CNI, Ingress, Service mesh).
- Strong grasp of Kubernetes storage (CSI, PVs, StorageClasses).
- Experience with Operators and Custom Resource Definitions (CRDs).
Infrastructure as Code
- Terraform (advanced proficiency).
- Helm (developing and managing complex charts).
- Config management tools (Ansible, Chef, Puppet).
- GitOps workflows (ArgoCD, Flux).
Cloud Platforms
- Hands-on experience with at least 2 of the following:
  - AWS: EKS, EC2, VPC, IAM, CloudFormation
  - Azure: AKS, VNets, ARM templates
  - GCP: GKE, Compute Engine, Deployment Manager
CI/CD & DevOps Tools
- Jenkins, GitLab CI, GitHub Actions, Azure DevOps
- Docker (advanced optimization and security practices)
- Container registries (ECR, ACR, GCR, Docker Hub)
- Strong Git workflows and branching strategies
Monitoring & Observability
- Prometheus & Grafana (metrics and dashboards)
- ELK/EFK stack (centralized logging)
- Jaeger/Zipkin (tracing)
- AlertManager (intelligent alerting)
💡 Good-to-Have Skills
- Service Mesh (Istio, Linkerd, Consul)
- Serverless (Knative, OpenFaaS, AWS Lambda)
- Running databases in Kubernetes (Postgres, MongoDB operators)
- ML pipelines (Kubeflow, MLflow)
- Security tools (Aqua, Twistlock, Falco, OPA)
- Compliance (SOC2, PCI-DSS, GDPR)
- Python/Go for automation
- Advanced Shell scripting (Bash/PowerShell)
🎓 Qualifications
- Bachelor’s in Computer Science, Engineering, or related field.
- Certifications (preferred):
  - Certified Kubernetes Administrator (CKA)
  - Certified Kubernetes Application Developer (CKAD)
  - Cloud provider certifications (AWS/Azure/GCP).
Experience
- 6–7 years of DevOps/Infrastructure engineering.
- 4+ years of Kubernetes in production.
- 2+ years in a lead role managing teams.
- Experience with large-scale distributed systems and microservices.

About CoffeeBeans
CoffeeBeans Consulting is a technology partner dedicated to driving business transformation. With deep expertise in Cloud, Data, MLOps, AI, Infrastructure services, Application modernization services, Blockchain, and Big Data, we help organizations tackle complex challenges and seize growth opportunities in today’s fast-paced digital landscape. We’re more than just a tech service provider; we’re a catalyst for meaningful change.
Candid answers by the company
CoffeeBeans Consulting, founded in 2017, is a high-end technology consulting firm that helps businesses build better products and improve delivery quality through a mix of engineering, product, and process expertise. They work across domains to deliver scalable backend systems, data engineering pipelines, and AI-driven solutions, often using modern stacks like Java, Spring Boot, Python, Spark, Snowflake, Azure, and AWS. With a strong focus on clean architecture, performance optimization, and practical problem-solving, CoffeeBeans partners with clients for both internal and external projects—driving meaningful business outcomes through tech excellence.
Similar jobs
Job Title: Data Architect - Azure DevOps
Job Location: Mumbai (Andheri East)
About the company:
MIRACLE HUB CLIENT is a predictive analytics and artificial intelligence company headquartered in Boston, US, with offices across the globe. We build prediction models and algorithms to solve high-priority business problems. Working across multiple industries, we have designed and developed breakthrough analytic products and decision-making tools by leveraging predictive analytics, AI, machine learning, and deep domain expertise.
Skill-sets Required:
- Design Enterprise Data Models
- Azure Data Specialist
- Security and Risk
- GDPR and other compliance knowledge
- Scrum/Agile
Job Role:
- Design and implement effective database solutions and models to store and retrieve company data
- Examine and identify database structural necessities by evaluating client operations, applications, and programming.
- Assess database implementation procedures to ensure they comply with internal and external regulations
- Install and organize information systems to guarantee company functionality.
- Prepare accurate database design and architecture reports for management and executive teams.
Desired Candidate Profile:
- Bachelor’s degree in computer science, computer engineering, or relevant field.
- A minimum of 3 years’ experience in a similar role.
- Strong knowledge of database structure systems and data mining.
- Excellent organizational and analytical abilities.
- Outstanding problem solver.
- IMMEDIATE JOINING (A notice period of 1 month is also acceptable)
- Excellent English communication and presentation skills, both verbal and written
- Charismatic, competitive and enthusiastic personality with negotiation skills
Compensation: No bar.
- Minimum 3+ years of experience in DevOps on the AWS platform
- Strong AWS knowledge and experience
- Experience using CI/CD automation tools (Git, Jenkins) and configuration deployment tools (Puppet/Chef/Ansible)
- Experience with IaC tools such as Terraform
- Excellent experience operating a container orchestration cluster (Kubernetes, Docker)
- Significant experience with Linux operating system environments
- Experience with infrastructure scripting solutions such as Python/Shell scripting
- Must have experience designing infrastructure automation frameworks
- Good experience setting up monitoring tools and dashboards (Grafana/Kafka)
- Excellent problem-solving, log analysis, and troubleshooting skills
- Experience setting up centralized logging for systems (EKS, EC2) and applications
- Process-oriented with great documentation skills
- Ability to work effectively within a team and with minimal supervision
environment. He/she must demonstrate a high level of ownership, integrity, and leadership skills and be flexible and adaptive with a strong desire to learn & excel.
Required Skills:
- Strong experience working with tools and platforms like Helm charts, CircleCI, Jenkins, and/or Codefresh
- Excellent knowledge of AWS offerings around Cloud and DevOps
- Strong expertise in containerization platforms like Docker and container orchestration platforms like Kubernetes & Rancher
- Should be familiar with leading Infrastructure as Code tools such as Terraform, CloudFormation, etc.
- Strong experience in Python, Shell Scripting, Ansible, and Terraform
- Good command of monitoring tools like Datadog, Zabbix, ELK, Grafana, CloudWatch, Stackdriver, Prometheus, JFrog, Nagios, etc.
- Experience with Linux/Unix systems administration.
We're Hiring: DevOps Tech Lead with 7-9 Years of Experience! 🚀
Are you a seasoned DevOps professional with a passion for cloud technologies and automation? We have an exciting opportunity for a DevOps Tech Lead to join our dynamic team at our Gurgaon office.
🏢 ZoomOps Technology Solutions Private Limited
📍 Location: Gurgaon
💼 Full-time position
🔧 Key Skills & Requirements:
✔ 7-9 years of hands-on experience in DevOps roles
✔ Proficiency in Cloud Platforms like AWS, GCP, and Azure
✔ Strong background in Solution Architecture
✔ Expertise in writing Automation Scripts using Python and Bash
✔ Ability to manage IaC and CM tools like Terraform, Ansible, Pulumi, etc.
Responsibilities:
🔹 Lead and mentor the DevOps team, driving innovation and best practices
🔹 Design and implement robust CI/CD pipelines for seamless software delivery
🔹 Architect and optimize cloud infrastructure for scalability and efficiency
🔹 Automate manual processes to enhance system reliability and performance
🔹 Collaborate with cross-functional teams to drive continuous improvement
Join us to work on exciting projects and make a significant impact in the tech space!
Apply now and take the next step in your DevOps career!
Kutumb is the first and largest communities platform for Bharat. We are growing at an exponential trajectory. More than 1 Crore users use Kutumb to connect with their community. We are backed by world-class VCs and angel investors. We are growing and looking for exceptional Infrastructure Engineers to join our Engineering team.
More on this here - https://kutumbapp.com/why-join-us.html
We’re excited if you have:
- Recent experience designing and building unified observability platforms that enable companies to use the sometimes-overwhelming amount of available data (metrics, logs, and traces) to determine quickly if their application or service is operating as desired
- Expertise in deploying and using open-source observability tools in large-scale environments, including Prometheus, Grafana, ELK (Elasticsearch + Logstash + Kibana), Jaeger, Kiali, and/or Loki
- Familiarity with open standards like OpenTelemetry, OpenTracing, and OpenMetrics
- Familiarity with Kubernetes and Istio as the architecture on which the observability platform runs, and how they integrate and scale. Additionally, the ability to contribute improvements back to the joint platform for the benefit of all teams
- Demonstrated customer engagement and collaboration skills to curate custom dashboards and views, and identify and deploy new tools, to meet their requirements
- The drive and self-motivation to understand the intricate details of a complex infrastructure environment
- Experience using CI/CD tools to automatically perform canary analysis and roll out changes after passing automated gates (think Argo & Keptn)
- Hands-on experience working with AWS
- Bonus points for knowledge of ETL pipelines and Big data architecture
- Great problem-solving skills and pride in your work
- Enjoyment in building scalable and resilient systems, with a focus on systems that are robust by design and suitably monitored
- The ability to abstract all of the above into as simple an interface as possible (like Knative) so developers don't need to know about it unless they choose to open the escape hatch
What you’ll be doing:
- Design and build automation around the chosen tools to make onboarding new services easy for developers (dashboards, alerts, traces, etc)
- Demonstrate great communication skills in working with technical and non-technical audiences
- Contribute new open-source tools and/or improvements to existing open-source tools back to the CNCF ecosystem
Tools we use:
Kops, Argo, Prometheus/Loki/Grafana, Kubernetes, AWS, MySQL/PostgreSQL, Apache Druid, Cassandra, Fluentd, Redis, OpenVPN, MongoDB, ELK
What we offer:
- High pace of learning
- Opportunity to build the product from scratch
- High autonomy and ownership
- A great and ambitious team to work with
- Opportunity to work on something that really matters
- Top of the class market salary and meaningful ESOP ownership
About BootLabs
https://www.bootlabs.in/
- We are a Boutique Tech Consulting partner, specializing in Cloud Native Solutions.
- We are obsessed with anything “CLOUD”. Our goal is to seamlessly automate the development lifecycle, and modernize infrastructure and its associated applications.
- With a product mindset, we enable start-ups and enterprises on the cloud transformation, cloud migration, end-to-end automation and managed cloud services.
- We are eager to research, discover, automate, adapt, empower and deliver quality solutions on time.
- We are passionate about customer success. With the right blend of experience and exuberant youth in our in-house team, we have significantly impacted customers.
Technical Skills:
• Expertise in any one hyperscaler (AWS/Azure/GCP), including basic services like networking, data, and workload management.
  - AWS
    Networking: VPC, VPC Peering, Transit Gateway, Route Tables, Security Groups, etc.
    Data: RDS, DynamoDB, Elasticsearch
    Workload: EC2, EKS, Lambda, etc.
  - Azure
    Data: Azure MySQL, Azure MSSQL, etc.
    Workload: AKS, Virtual Machines, Azure Functions
  - GCP
    Data: Cloud Storage, Dataflow, Cloud SQL, Firestore, Bigtable, BigQuery
    Workload: GKE, Instances, App Engine, Batch, etc.
• Experience with any one of the CI/CD tools (GitLab/GitHub/Jenkins), including runner setup, templating, and configuration.
• Kubernetes (EKS/AKS/GKE) or Ansible experience, covering basics like pods, deployments, networking, and service mesh, plus use of a package manager like Helm.
• Scripting experience (Bash/Python), automation in pipelines when required, and system services.
• Infrastructure automation (Terraform/Pulumi/CloudFormation): writing modules, setting up pipelines, and versioning the code.
Optional:
• Experience in any programming language is not required but is appreciated.
• Good experience in Git, SVN, or any other code management tool is required.
• DevSecOps tools (Qualys/SonarQube/Black Duck) for security scanning of artifacts, infrastructure, and code.
• Observability tools (open source: Prometheus, Elasticsearch, OpenTelemetry; paid: Datadog, 24/7, etc.)
As a DevOps Engineer, you are responsible for setting up and maintaining Git repositories and DevOps tools such as Jenkins, UCD, Docker, Kubernetes, JFrog Artifactory, cloud monitoring tools, and cloud security tools.
- Set up, configure, and maintain Git repos, Jenkins, UCD, etc. for multi-hosting cloud environments.
- Architect and maintain the server infrastructure in AWS. Build highly resilient infrastructure following industry best practices.
- Work on Docker images and maintain Kubernetes clusters.
- Develop and maintain automation scripts using Ansible or other available tools.
- Maintain and monitor cloud Kubernetes clusters, patching when necessary.
- Work with cloud security tools to keep applications secure.
- Participate in the software development lifecycle, specifically the infra design, execution, and debugging required to achieve successful implementation of integrated solutions within the portfolio.
Required Technical and Professional Expertise:
- Minimum 4-6 years of experience in the IT industry.
- Expertise in implementing and managing DevOps CI/CD pipelines.
- Experience with DevOps automation tools, and well versed in DevOps frameworks and Agile.
- Working knowledge of scripting using shell, Python, Terraform, Ansible, Puppet, or Chef.
- Experience with and good understanding of any cloud platform, such as AWS, Azure, or Google Cloud.
- Knowledge of Docker and Kubernetes is required.
- Proficient troubleshooting skills with proven ability to resolve complex technical issues.
- Experience working with ticketing tools.
- Knowledge of middleware technologies or databases is desirable.
- Experience with Jira is a plus.
We look forward to connecting with you. As you may take time to review this opportunity, we will wait for a reasonable time of around 3-5 days before we screen the collected applications and start lining up job discussions with the hiring manager. However, we assure you that we will attempt to maintain a reasonable time window for successfully closing this requirement. The candidates will be kept informed and updated on the feedback and application status.
- You have experience of 2-4 years in building high-performance consumer-facing mobile applications at Product companies of a decent scale.
- You can write code preferably in Golang and Python.
- You have experience with debugging production issues and writing RCAs.
- You have demonstrable stories of being on-call and how outages have been handled.
- You have experience developing products on Kubernetes and cloud providers like GCP and AWS.
- You have worked with Cloud Native (CNCF) technologies.
- You have experience automating CI/CD pipelines.
- You are an excellent collaborator & communicator. You know that start-ups are a team sport.
- You listen to others, aren’t afraid to speak your mind and always try to ask the right questions.
- You are excited by the prospect of working in a distributed team and company
Job Title: Senior Cloud Infrastructure Engineer (AWS)
Department & Team: Technology
Location: India / UK / Ukraine
Reporting To: Infrastructure Services Manager
Role Purpose:
The purpose of the role is to ensure high systems availability across a multi-cloud environment, enabling the business to continue meeting its objectives.
This role will be mostly AWS / Linux focused but will include a requirement to understand comparative solutions in Azure.
The role holder should want to remain fully hands-on while taking on Team Lead responsibilities in future.
Client’s cloud strategy is based around a dual vendor solutioning model, utilising AWS and Azure services. This enables us to access more technologies and helps mitigate risks across our infrastructure.
The Infrastructure Services Team is responsible for the delivery and support of all infrastructure used by Client twenty-four hours a day, seven days a week. The team’s primary function is to install, maintain, and implement all infrastructure-based systems, both On Premise and Cloud Hosted. The Infrastructure Services group already consists of three teams:
1. Network Services Team – Responsible for IP Network and its associated components
2. Platform Services Team – Responsible for Server and Storage systems
3. Database Services Team – Responsible for all Databases
This role will report directly into the Infrastructure Services Manager and will have responsibility for the day to day running of the multi-cloud environment, as well as playing a key part in designing best practise solutions. It will enable the Client business to achieve its stated objectives by playing a key role in the Infrastructure Services Team to achieve world class benchmarks of customer service and support.
Responsibilities:
Operations
- Deliver end-to-end technical and user support across all platforms (on-premise, Azure, AWS).
- Carry out day-to-day, fully hands-on OS management responsibilities (Windows and Linux operating systems).
- Ensure robust server patching schedules are in place and meticulously followed to help reduce security-related incidents.
- Contribute to continuous improvement efforts around cost optimisation, security enhancement, performance optimisation, operational efficiency and innovation.
- Take an ownership role in delivering technical projects, ensuring best-practice methods are followed.
- Design and deliver solutions around the concept of “Planning for Failure”. Ensure all solutions are deployed to withstand system / AZ failure.
- Work closely with Cloud Architects / Infrastructure Services Manager to identify and eliminate “waste” across cloud platforms.
- Assist several internal DevOps teams with the day-to-day running of pipeline management and drive standardisation where possible.
- Ensure all Client data in all forms is backed up in a cost-efficient way.
- Use the appropriate monitoring tools to ensure all cloud / on-premise services are continuously monitored.
- Drive utilisation of the most efficient methods of resource deployment (Terraform, CloudFormation, Bootstrap).
- Drive the adoption, across the business, of serverless / open source / cloud native technologies where applicable.
- Ensure system documentation remains up to date and is designed according to AWS/Azure best-practice templates.
- Participate in detailed architectural discussions, calling on internal/external subject matter experts as needed, to ensure solutions are designed for successful deployment.
- Take part in regular discussions with business executives to translate their needs into technical and operational plans.
- Engage with vendors regularly to verify solutions and troubleshoot issues.
- Design and deliver technology workshops to other departments in the business.
- Take initiative to improve service delivery.
- Ensure that Client delivers a service that resonates with customers’ expectations, which sets Client apart from its competitors.
- Help design necessary infrastructure and processes to support the recovery of critical technology and systems in line with contingency plans for the business.
- Continually assess working practices and review these with a view to improving quality and reducing costs.
- Champion the case for new technology and ensure new technologies are investigated and proposals put forward regarding suitability and benefit.
- Motivate and inspire the rest of the infrastructure team and undertake necessary steps to raise competence and capability as required.
- Help develop a culture of ownership and quality throughout the Infrastructure Services team.
Skills & Experience:
- AWS Certified Solutions Architect – Professional - REQUIRED
- Microsoft Azure Fundamentals AZ-900 - REQUIRED AS MINIMUM AZURE CERT
- Red Hat Certified Engineer (RHCE) - REQUIRED
- Must be able to demonstrate working knowledge of designing, implementing and maintaining best-practice AWS solutions (and, to a lesser extent, Azure).
- Proven examples of ownership of large AWS project implementations in Enterprise settings.
- Experience managing the monitoring of infrastructure / applications using tools including CloudWatch, SolarWinds, New Relic, etc.
- Must have practical working knowledge of driving cost optimisation, security enhancement and performance optimisation.
- Solid understanding and experience of transitioning IaaS solutions to serverless technology.
- Must have working production knowledge of deploying infrastructure as code using Terraform.
- Need to be able to demonstrate security best practice when designing solutions in AWS.
- Working knowledge of optimising network traffic performance and delivering high availability while keeping a check on costs.
- Working experience of ‘On Premise to Cloud’ migrations.
- Experience of Data Centre technology infrastructure development and management.
- Must have experience working in a DevOps environment.
- Good working knowledge of WAN connectivity and how this interacts with the various entry point options into AWS, Azure, etc.
- Working knowledge of Server and Storage Devices.
- Working knowledge of MySQL and SQL Server / Cloud native databases (RDS / Aurora).
- Experience of Carrier Grade Networking - On Prem and Cloud.
- Experience in virtualisation technologies.
- Experience in ITIL and Project management.
- Providing senior support to the Service Delivery team.
- Good understanding of new and emerging technologies.
- Excellent presentation skills to both an internal and external audience.
- The ability to share your specific expertise with the rest of the Technology group.
- Experience with MVNO or Network Operations background from within the Telecoms industry. (Optional)
- Working knowledge of one or more European languages. (Optional)
Behavioural Fit:
- Professional appearance and manner
- High personal drive; results oriented; makes things happen; “can do attitude”
- Can work and adapt within a highly dynamic and growing environment
- Team Player; effective at building close working relationships with others
- Effectively manages diversity within the workplace
- Strong focus on service delivery and the needs and satisfaction of internal clients
- Able to see issues from a global, regional and corporate perspective
- Able to effectively plan and manage large projects
- Excellent communication skills and interpersonal skills at all levels
- Strong analytical, presentation and training skills
- Innovative and creative
- Demonstrates technical leadership
- Visionary and strategic view of technology enablers (creative and innovative)
- High verbal and written communication ability, able to influence effectively at all levels
- Possesses technical expertise and knowledge to lead by example and input into technical debates
- Depth and breadth of experience in infrastructure technologies
- Enterprise mentality and global mindset
- Sense of humour
Role Key Performance Indicators:
- Design and deliver repeatable, best-in-class cloud solutions.
- Proactively monitor service quality and take action to scale operational services, in line with business growth.
- Generate operating efficiencies, to be agreed with Infrastructure Services Manager.
- Establish a “best in sector” level of operational service delivery and insight.
- Help create an effective team.
• Bachelor's or Master's degree in Computer Science or Software Engineering from a reputed university.
• 5-8 years of experience in building scalable, secure, and compliant systems.
• More than 2 years of experience working with GCP deployments serving millions of daily visitors
• 5+ years of hosting experience in a large, heavy-traffic environment
• 5+ years of production application support experience in a high-uptime environment
• Software development and monitoring knowledge with automated builds
• Technology:
  o Cloud: AWS or Google Cloud
  o Source Control: GitLab, Bitbucket, or GitHub
  o Container Concepts: Docker, Microservices
  o Continuous Integration: Jenkins, Bamboo
  o Infrastructure Automation: Puppet, Chef, or Ansible
  o Deployment Automation: Jenkins, VSTS, or Octopus Deploy
  o Orchestration: Kubernetes, Mesos, Swarm
  o Automation: Node.js or Python
  o Linux environment network administration, DNS, firewall, and security management
• Ability to adapt to the startup culture, handle multiple competing priorities, meet deadlines, and troubleshoot problems.












