
We are seeking an experienced Lead DevOps Engineer with deep expertise in Kubernetes infrastructure design and implementation. This role requires someone who can architect, build, and manage enterprise-grade Kubernetes clusters from the ground up. You’ll lead modernization initiatives, shape infrastructure strategy, and work with cutting-edge cloud-native technologies.
🚀 Key Responsibilities
Infrastructure Design & Implementation
- Architect and design enterprise-grade Kubernetes clusters across AWS, Azure, and GCP.
- Build production-ready Kubernetes infrastructure with HA, scalability, and security best practices.
- Implement Infrastructure as Code with Terraform, Helm, and GitOps workflows.
- Set up monitoring, logging, and observability for Kubernetes workloads.
- Design and execute backup and disaster recovery strategies for containerized applications.
Leadership & Team Management
- Lead a team of 3–4 DevOps engineers, providing technical mentorship.
- Drive best practices in containerization, orchestration, and cloud-native development.
- Collaborate with development teams to optimize deployment strategies.
- Conduct code reviews and maintain infrastructure quality standards.
- Build a knowledge-sharing culture through documentation and training.
Operational Excellence
- Manage and scale CI/CD pipelines integrated with Kubernetes.
- Implement security policies (RBAC, network policies, container scanning).
- Optimize cluster performance and cost-efficiency.
- Automate operations to minimize manual interventions.
- Ensure 99.9% uptime for production workloads.
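
As a rough illustration of what the 99.9% uptime target implies, the sketch below converts an availability percentage into a downtime budget. The helper name `downtime_budget_minutes` and the 30-day and 365-day windows are assumptions chosen for illustration; the posting does not say how uptime is measured.

```python
# Hypothetical helper (not from the posting): converts an availability
# target into the downtime it allows over a measurement window.
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct: float, days: int) -> float:
    """Return the minutes of downtime allowed over `days` at the given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * days * MINUTES_PER_DAY


if __name__ == "__main__":
    # Assumed windows: a 30-day month and a 365-day year.
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        minutes = downtime_budget_minutes(99.9, days)
        print(f"99.9% over a {label}: {minutes:.1f} min")
```

Under these assumptions, 99.9% leaves about 43 minutes of downtime in a 30-day month, which is why automation and alerting feature so heavily in the responsibilities above.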
Strategic Planning
- Define the infrastructure roadmap aligned with business needs.
- Evaluate and adopt new cloud-native technologies.
- Perform capacity planning and cloud cost optimization.
- Drive risk assessment and mitigation strategies.
🛠 Must-Have Technical Skills
Kubernetes Expertise
- 6+ years of hands-on Kubernetes experience in production.
- Deep knowledge of Kubernetes architecture (etcd, API server, scheduler, kubelet).
- Advanced Kubernetes networking (CNI, Ingress, Service mesh).
- Strong grasp of Kubernetes storage (CSI, PVs, StorageClasses).
- Experience with Operators and Custom Resource Definitions (CRDs).
Infrastructure as Code
- Terraform (advanced proficiency).
- Helm (developing and managing complex charts).
- Config management tools (Ansible, Chef, Puppet).
- GitOps workflows (ArgoCD, Flux).
Cloud Platforms
- Hands-on experience with at least 2 of the following:
  - AWS: EKS, EC2, VPC, IAM, CloudFormation
  - Azure: AKS, VNets, ARM templates
  - GCP: GKE, Compute Engine, Deployment Manager
CI/CD & DevOps Tools
- Jenkins, GitLab CI, GitHub Actions, Azure DevOps
- Docker (advanced optimization and security practices)
- Container registries (ECR, ACR, GCR, Docker Hub)
- Strong Git workflows and branching strategies
Monitoring & Observability
- Prometheus & Grafana (metrics and dashboards)
- ELK/EFK stack (centralized logging)
- Jaeger/Zipkin (tracing)
- AlertManager (intelligent alerting)
💡 Good-to-Have Skills
- Service Mesh (Istio, Linkerd, Consul)
- Serverless (Knative, OpenFaaS, AWS Lambda)
- Running databases in Kubernetes (Postgres, MongoDB operators)
- ML pipelines (Kubeflow, MLflow)
- Security tools (Aqua, Twistlock, Falco, OPA)
- Compliance (SOC2, PCI-DSS, GDPR)
- Python/Go for automation
- Advanced Shell scripting (Bash/PowerShell)
🎓 Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Certifications (preferred):
  - Certified Kubernetes Administrator (CKA)
  - Certified Kubernetes Application Developer (CKAD)
  - Cloud provider certifications (AWS/Azure/GCP)
Experience
- 6–7 years of DevOps/Infrastructure engineering.
- 4+ years of Kubernetes in production.
- 2+ years in a lead role managing teams.
- Experience with large-scale distributed systems and microservices.

About CoffeeBeans
CoffeeBeans Consulting is a technology partner dedicated to driving business transformation. With deep expertise in Cloud, Data, MLOps, AI, Infrastructure services, Application modernization services, Blockchain, and Big Data, we help organizations tackle complex challenges and seize growth opportunities in today’s fast-paced digital landscape. We’re more than just a tech service provider; we’re a catalyst for meaningful change.
Candid answers by the company
CoffeeBeans Consulting, founded in 2017, is a high-end technology consulting firm that helps businesses build better products and improve delivery quality through a mix of engineering, product, and process expertise. They work across domains to deliver scalable backend systems, data engineering pipelines, and AI-driven solutions, often using modern stacks like Java, Spring Boot, Python, Spark, Snowflake, Azure, and AWS. With a strong focus on clean architecture, performance optimization, and practical problem-solving, CoffeeBeans partners with clients for both internal and external projects—driving meaningful business outcomes through tech excellence.
Similar jobs
About the role:
We are seeking an experienced DevOps Engineer with deep expertise in Jenkins, Docker, Ansible, and Kubernetes to architect and maintain secure, scalable infrastructure and CI/CD pipelines. This role emphasizes security-first DevOps practices, on-premises Kubernetes operations, and integration with data engineering workflows.
🛠 Required Skills & Experience
Technical Expertise
- Jenkins (Expert): Advanced pipeline development, DSL scripting, security integration, troubleshooting
- Docker (Expert): Secure multi-stage builds, vulnerability management, optimisation for Java/Scala/Python
- Ansible (Expert): Complex playbook development, configuration management, automation at scale
- Kubernetes (Expert - Primary Focus): On-premises cluster operations, security hardening, networking, storage management
- SonarQube/Code Quality (Strong): Integration, quality gate enforcement, threshold management
- DevSecOps (Strong): Security scanning, compliance automation, vulnerability remediation, workload governance
- Spark ETL/ETA (Moderate): Understanding of distributed data processing, job configuration, runtime behavior
Core Competencies
- Deep understanding of DevSecOps principles and security-first automation
- Strong troubleshooting and problem-solving abilities across complex distributed systems
- Experience with infrastructure-as-code and GitOps methodologies
- Knowledge of compliance frameworks and security standards
- Ability to mentor teams and drive best practice adoption
🎓 Qualifications
- 6–10 years of hands-on DevOps experience
- Proven track record with Jenkins, Docker, Kubernetes, and Ansible in production environments
- Experience managing on-premises Kubernetes clusters (bare-metal preferred)
- Strong background in security hardening and compliance automation
- Familiarity with data engineering platforms and big data technologies
- Excellent communication and collaboration skills
🚀 Key Responsibilities
1. CI/CD Pipeline Architecture & Security
- Design, implement, and maintain enterprise-grade CI/CD pipelines in Jenkins with embedded security controls:
  - Build greenfield pipelines and enhance/stabilize existing pipeline infrastructure
  - Diagnose and resolve build, test, and deployment failures across multi-service environments
  - Integrate security gates, compliance checks, and automated quality controls at every pipeline stage
- Manage and optimize SonarQube and static code analysis tooling:
  - Enforce code quality and security scanning standards across all services
  - Maintain organizational coding standards, vulnerability thresholds, and remediation workflows
  - Automate quality gates as integral components of CI/CD processes
- Engineer optimized Docker images for Java, Scala, and Python applications:
  - Implement multi-stage builds, layer optimization, and minimal base images
  - Conduct image vulnerability scanning and enforce compliance policies
  - Apply containerization best practices for security and performance
- Develop comprehensive Ansible automation:
  - Create modular, reusable, and secure playbooks for configuration management
  - Automate environment provisioning and application lifecycle operations
  - Maintain infrastructure-as-code standards and version control
2. Kubernetes Platform Operations & Security
- Lead complete lifecycle management of on-premises/bare-metal Kubernetes clusters:
  - Cluster provisioning, version upgrades, node maintenance, and capacity planning
  - Configure and manage networking (CNI), persistent storage solutions, and ingress controllers
  - Troubleshoot workload performance, resource constraints, and reliability issues
- Implement and enforce Kubernetes security best practices:
  - Design and manage RBAC policies, service account isolation, and least-privilege access models
  - Apply Pod Security Standards, network policies, secrets encryption, and certificate lifecycle management
  - Conduct cluster hardening, security audits, monitoring, and policy governance
- Provide technical leadership to development teams:
  - Guide secure deployment patterns and containerized application best practices
  - Establish workload governance frameworks for distributed systems
  - Drive adoption of security-first mindsets across engineering teams
3. Data Engineering Support
- Collaborate with data engineering teams on Spark-based workloads:
  - Support deployment and operational tuning of Spark ETL/ETA jobs
  - Understand cluster integration, job orchestration, and performance optimization
  - Debug and troubleshoot Spark workflow issues in production environments
Job Description:
• Drive end-to-end automation from GitHub/GitLab/Bitbucket to deployment, observability, and enabling SRE activities
• Guide operations support (setup, configuration, management, troubleshooting) of digital platforms and applications
• Solid understanding of DevSecOps workflows that support CI, CS, CD, CM, and CT
• Deploy, configure, and manage SaaS and PaaS cloud platforms and applications
• Provide Level 1 (OS, patching) and Level 2 (app server instance troubleshooting) support
• DevOps programming: writing scripts and building operations/server instance/app/DB monitoring tools
• Set up and manage continuous build and dev project management environments: Jenkins X/GitHub Actions/Tekton, Git, Jira
• Designing secure networks, systems, and application architectures
• Collaborating with cross-functional teams to ensure secure product development
• Disaster recovery, network forensics analysis, and pen-testing solutions
• Planning, researching, and developing security policies, standards, and procedures
• Awareness training of the workforce on information security standards, policies, and best practices
• Installation and use of firewalls, data encryption, and other security products and procedures
• Maturity in understanding compliance, policy, and cloud governance, and the ability to identify and execute automation
• At Wesco, we discuss more about solutions than problems. We celebrate innovation and creativity.
About Company:
The company is a global leader in secure payments and trusted transactions. They are at the forefront of the digital revolution that is shaping new ways of paying, living, doing business and building relationships that pass on trust along the entire payments value chain, enabling sustainable economic growth. Their innovative solutions, rooted in a rock-solid technological base, are environmentally friendly, widely accessible and support social transformation.
Role Overview
Senior Engineer with a strong background and experience in cloud-related technologies and architectures. Can design target cloud architectures to transform existing architectures together with the in-house team. Can configure and build cloud architectures hands-on and guide others.
Key Knowledge
- 3-5+ years of experience in AWS, GCP, or Azure technologies
- Is likely certified on one or more of the major cloud platforms
- Strong hands-on experience with technologies such as Terraform, K8S, Docker, and container orchestration
- Ability to guide and lead internal agile teams on cloud technology
- Background in the financial services industry or similar critical operational experience
DESIRED SKILLS AND EXPERIENCE
Strong analytical and problem-solving skills
Ability to work independently, learn quickly and be proactive
3-5 years overall and at least 1-2 years of hands-on experience in designing and managing DevOps Cloud infrastructure
Experience must include a combination of:
o Experience working with configuration management tools – Ansible, Chef, Puppet, SaltStack (expertise in at least one tool is a must)
o Ability to write and maintain code in at least one scripting language (Python preferred)
o Practical knowledge of shell scripting
o Cloud knowledge – AWS, VMware vSphere
o Good understanding and familiarity with Linux
o Networking knowledge – Firewalls, VPNs, Load Balancers
o Web/Application servers, Nginx, JVM environments
o Virtualization and containers - Xen, KVM, Qemu, Docker, Kubernetes, etc.
o Familiarity with logging systems - Logstash, Elasticsearch, Kibana
o Git, Jenkins, Jira
Task:
- Run our software products in different international environments (on-premises and cloud providers)
- Support the developers while debugging issues
- Analyse and monitor software during runtime to find bugs and performance issues and to plan growth of the system
- Integrate new technologies to support our products while growing in the market
- Develop Continuous Integration and Continuous Deployment pipelines
- Maintain our on-premises servers and applications, including operating system upgrades, software upgrades, and the introduction of new database versions
- Automate tasks to reduce human error and parallelize work
We wish:
- Basic OS knowledge (Debian, CentOS, SUSE Linux Enterprise)
- Webserver administration and optimization (Apache, Traefik)
- Database administration and optimization (MySQL/MariaDB, Oracle, Elasticsearch)
- JVM administration and optimization
- Application server administration and optimization (ServiceMix, Karaf, GlassFish, Spring Boot)
- Scripting experience (Perl, Python, PHP, Java)
- Monitoring experience (Icinga/Nagios, AppDynamics, Prometheus, Grafana)
- Knowledge of container management (Docker/containerd, DC/OS, Kubernetes)
- Experience with automatic deployment processes (Ansible, GitLab CI, Helm)
- Define and optimize processes for system maintenance, continuous integration, and continuous delivery
- Excellent communication skills and proficiency in English are required
- Leadership skills with a team-motivating approach
- Good team player
We Offer:
- Freedom to realise your own ideas, plus individual career and development opportunities.
- A motivating work environment, a flat hierarchy, numerous memorable company events, and a fun, flexible workplace.
- Professional challenges and career development opportunities.
Your contact for this position is Janki Raval.
Would you like to become part of this highly innovative, dynamic, and exciting world?
We look forward to receiving your resume.
Our client is a call management solutions company that helps small to mid-sized businesses use its virtual call center to manage customer calls and queries. It is an AI and cloud-based call operating facility that is affordable as well as feature-optimized. Its advanced features, such as call recording, IVR, toll-free numbers, and call tracking, are based on automation and enhance call-handling quality and process for each client according to their requirements. They serve over 6,000 business clients, including large accounts like Flipkart and Uber.
- Being involved in Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database Management, Backups, and Monitoring.
- Creating and managing CI/CD pipelines for microservice architectures.
- Creating and managing application configuration.
- Researching and planning architectures and tools for smooth deployments.
- Logging, metrics, and alerting management.
What you need to have:
- Proficient with the Linux command line and troubleshooting.
- Proficient in designing CI/CD pipelines using Jenkins. Experience in deployment using Ansible.
- Experience in microservices architecture deployment. Hands-on experience with Docker, Kubernetes, and EKS.
- Knowledge of infrastructure management tools (Infrastructure as Code) such as Terraform, AWS CloudFormation, etc.
- Proficient in AWS services: deploying, monitoring, and troubleshooting applications in AWS.
- Configuration management tools like Ansible/Chef/Puppet.
- Proficient in deploying applications behind load balancers and proxy servers such as Nginx and Apache.
- Proficient in Bash scripting; Python scripting is an advantage.
- Experience with logging, monitoring, and alerting tools such as ELK (Elasticsearch, Logstash, Kibana) and Nagios. Graylog, Splunk, Prometheus, and Grafana are a plus.
- Proficient in Configuration Management.
- Works independently without supervision.
- Works on continuous improvement of the products through innovation and learning, with a knack for benchmarking and optimization.
- Experience in deploying highly complex, distributed transaction processing systems.
- Stays abreast of new innovations and the latest technology trends and explores ways of leveraging these to improve the product in alignment with the business.
- As a component owner, where the component has impact across multiple platforms (5-10-member team), works with customers to obtain their requirements and deliver the end-to-end project.
Required Experience, Skills, and Qualifications
- 5+ years of experience as a DevOps Engineer. Experience with the Golang cycle is a plus
- At least one End to End CI/CD Implementation experience
- Excellent problem-solving and debugging skills in the DevOps area
- Good understanding of containerization (Docker/Kubernetes)
- Hands-on build/package tool experience
- Experience with AWS services: Glue, Athena, Lambda, EC2, RDS, EKS/ECS, ALB, VPC, SSM, Route 53
- Experience with setting up CI/CD pipeline for Glue jobs, Athena, Lambda functions
- Experience architecting interaction with services and application deployments on AWS
- Experience with Groovy and writing Jenkinsfile
- Experience with repository management, code scanning/linting, secure scanning tools
- Experience with deployments and application configuration on Kubernetes
- Experience with microservice orchestration tools (e.g. Kubernetes, OpenShift, HashiCorp Nomad)
- Experience with time-series and document databases (e.g. Elasticsearch, InfluxDB, Prometheus)
- Experience with message buses (e.g. Apache Kafka, NATS)
- Experience with key-value stores and service discovery mechanisms (e.g. Redis, HashiCorp Consul, etc)
- 5+ years hands-on experience with designing, deploying and managing core AWS services and infrastructure
- Proficiency in scripting using Bash, Python, Ruby, Groovy, or similar languages
- Experience in source control management, specifically with Git
- Hands-on experience in Unix/Linux and bash scripting
- Experience building and managing Helm-based build and release CI/CD pipelines for Kubernetes platforms (EKS, OpenShift, GKE)
- Strong experience with orchestration and config management tools such as Terraform, Ansible, or CloudFormation
- Ability to debug and analyze issues using tools like AppDynamics, New Relic, and Sumo Logic
- Knowledge of Agile Methodologies and principles
- Good writing and documentation skills
- Strong collaborator with the ability to work well with core teammates and our colleagues across STS
DevOps Engineer
Skills
- Building a scalable and highly available infrastructure for data science
- Knows data science project workflows
- Hands-on with deployment patterns for online/offline predictions (server/serverless)
- Experience with either Terraform or Kubernetes
- Experience with ML deployment frameworks like Kubeflow, MLflow, SageMaker
- Working knowledge of Jenkins or a similar tool
Responsibilities
- Own all the ML cloud infrastructure (AWS)
- Help build out an entire CI/CD ecosystem with auto-scaling
- Work with a testing engineer to design testing methodologies for ML APIs
- Research and implement new technologies
- Help with cost optimization of infrastructure
- Knowledge sharing
Nice to Have
- Develop APIs for machine learning
- Can write Python servers for ML systems with API frameworks
- Understanding of task queue frameworks like Celery










