Job Title: Lead DevOps Engineer
Experience Required: 8+ years in DevOps or related fields
Employment Type: Full-time
About the Role:
We are seeking a highly skilled and experienced Lead DevOps Engineer. This role will focus on driving the design, implementation, and optimization of our CI/CD pipelines, cloud infrastructure, and operational processes. As a Lead DevOps Engineer, you will play a pivotal role in enhancing the scalability, reliability, and security of our systems while mentoring a team of DevOps engineers to achieve operational excellence.
Key Responsibilities:
Infrastructure Management: Architect, deploy, and maintain scalable, secure, and resilient cloud infrastructure (e.g., AWS, Azure, or GCP).
CI/CD Pipelines: Design and optimize CI/CD pipelines, to improve development velocity and deployment quality.
Automation: Automate repetitive tasks and workflows, such as provisioning cloud resources, configuring servers, managing deployments, and implementing infrastructure as code (IaC) using tools like Terraform, CloudFormation, or Ansible.
Monitoring & Logging: Implement robust monitoring, alerting, and logging systems for enterprise and cloud-native environments using tools like Prometheus, Grafana, ELK Stack, NewRelic or Datadog.
Security: Ensure the infrastructure adheres to security best practices, including vulnerability assessments and incident response processes.
Collaboration: Work closely with development, QA, and IT teams to align DevOps strategies with project goals.
Mentorship: Lead, mentor, and train a team of DevOps engineers to foster growth and technical expertise.
Incident Management: Oversee production system reliability, including root cause analysis and performance tuning.
Required Skills & Qualifications:
Technical Expertise:
Strong proficiency in cloud platforms like AWS, Azure, or GCP.
Advanced knowledge of containerization technologies (e.g., Docker, Kubernetes).
Expertise in IaC tools such as Terraform, CloudFormation, or Pulumi.
Hands-on experience with CI/CD tools, particularly Bitbucket Pipelines, Jenkins, GitLab CI/CD, Github Actions or CircleCI.
Proficiency in scripting languages (e.g., Python, Bash, PowerShell).
Soft Skills:
Excellent communication and leadership skills.
Strong analytical and problem-solving abilities.
Proven ability to manage and lead a team effectively.
Experience:
8+ years of experience in DevOps or Site Reliability Engineering (SRE).
3+ years in a leadership or team lead role, with proven experience managing distributed teams, mentoring team members, and driving cross-functional collaboration.
Strong understanding of microservices, APIs, and serverless architectures.
Nice to Have:
Certifications like AWS Certified Solutions Architect, Kubernetes Administrator, or similar.
Experience with GitOps tools such as ArgoCD or Flux.
Knowledge of compliance standards (e.g., GDPR, SOC 2, ISO 27001).
Perks & Benefits:
Competitive salary and performance bonuses.
Comprehensive health insurance for you and your family.
Professional development opportunities and certifications, including sponsored certifications and access to training programs to help you grow your skills and expertise.
Flexible working hours and remote work options.
Collaborative and inclusive work culture.
Join us to build and scale world-class systems that empower innovation and deliver exceptional user experiences.

Similar jobs
Requirements
- 3+ years work experience writing clean production code
- Well versed with maintaining infrastructure as code (Terraform, Cloudformation etc). High proficiency with Terraform / Terragrunt is absolutely critical
- Experience of setting CI/CD pipelines from scratch
- Experience with AWS(EC2, ECS, RDS, Elastic Cache etc), AWS lambda, Kubernetes, Docker, ServiceMesh
- Experience with ETL pipelines, Bigdata infra
- Understanding of common security issues
Roles / Responsibilities:
- Write terraform modules for deploying different component of infrastructure in AWS like Kubernetes, RDS, Prometheus, Grafana, Static Website
- Configure networking, autoscaling. continuous deployment, security and multiple environments
- Make sure the infrastructure is SOC2, ISO 27001 and HIPAA compliant
- Automate all the steps to provide a seamless experience to developers.
About us
Classplus is India's largest B2B ed-tech start-up, enabling 1 Lac+ educators and content creators to create their digital identity with their own branded apps. Starting in 2018, we have grown more than 10x in the last year, into India's fastest-growing video learning platform.
Over the years, marquee investors like Tiger Global, Surge, GSV Ventures, Blume, Falcon, Capital, RTP Global, and Chimera Ventures have supported our vision. Thanks to our awesome and dedicated team, we achieved a major milestone in March this year when we secured a “Series-D” funding.
Now as we go global, we are super excited to have new folks on board who can take the rocketship higher🚀. Do you think you have what it takes to help us achieve this? Find Out Below!
What will you do?
· Define the overall process, which includes building a team for DevOps activities and ensuring that infrastructure changes are reviewed from an architecture and security perspective
· Create standardized tooling and templates for development teams to create CI/CD pipelines
· Ensure infrastructure is created and maintained using terraform
· Work with various stakeholders to design and implement infrastructure changes to support new feature sets in various product lines.
· Maintain transparency and clear visibility of costs associated with various product verticals, environments and work with stakeholders to plan for optimization and implementation
· Spearhead continuous experimenting and innovating initiatives to optimize the infrastructure in terms of uptime, availability, latency and costs
You should apply, if you
1. Are a seasoned Veteran: Have managed infrastructure at scale running web apps, microservices, and data pipelines using tools and languages like JavaScript(NodeJS), Go, Python, Java, Erlang, Elixir, C++ or Ruby (experience in any one of them is enough)
2. Are a Mr. Perfectionist: You have a strong bias for automation and taking the time to think about the right way to solve a problem versus quick fixes or band-aids.
3. Bring your A-Game: Have hands-on experience and ability to design/implement infrastructure with GCP services like Compute, Database, Storage, Load Balancers, API Gateway, Service Mesh, Firewalls, Message Brokers, Monitoring, Logging and experience in setting up backups, patching and DR planning
4. Are up with the times: Have expertise in one or more cloud platforms (Amazon WebServices or Google Cloud Platform or Microsoft Azure), and have experience in creating and managing infrastructure completely through Terraform kind of tool
5. Have it all on your fingertips: Have experience building CI/CD pipeline using Jenkins, Docker for applications majorly running on Kubernetes. Hands-on experience in managing and troubleshooting applications running on K8s
6. Have nailed the data storage game: Good knowledge of Relational and NoSQL databases (MySQL,Mongo, BigQuery, Cassandra…)
7. Bring that extra zing: Have the ability to program/script is and strong fundamentals in Linux and Networking.
8. Know your toys: Have a good understanding of Microservices architecture, Big Data technologies and experience with highly available distributed systems, scaling data store technologies, and creating multi-tenant and self hosted environments, that’s a plus
Being Part of the Clan
At Classplus, you’re not an “employee” but a part of our “Clan”. So, you can forget about being bound by the clock as long as you’re crushing it workwise😎. Add to that some passionate people working with and around you, and what you get is the perfect work vibe you’ve been looking for!
It doesn’t matter how long your journey has been or your position in the hierarchy (we don’t do Sirs and Ma’ams); you’ll be heard, appreciated, and rewarded. One can say, we have a special place in our hearts for the Doers! ✊🏼❤️
Are you a go-getter with the chops to nail what you do? Then this is the place for you.
• Hands-on experience in Azure.
• Build and maintain CI/CD tools and pipelines.
• Designing and managing highly scalable, reliable, and fault-tolerant infrastructure & networking that forms the backbone of distributed systems at RARA Now.
• Continuously improve code quality, product execution, and customer delight.
• Communicate, collaborate and work effectively across distributed teams in a global environment.
• Operate to strengthen teams across their product with their knowledge base
• Contribute to improving team relatedness, and help build a culture of camaraderie.
• Continuously refactor applications to ensure high-quality design
• Pair with team members on functional and non-functional requirements and spread design philosophy and goals across the team
• Excellent bash, and scripting fundamentals and hands-on with scripting in programming languages such as Python, Ruby, Golang, etc.
• Good understanding of distributed system fundamentals and ability to troubleshoot issues in a larger distributed infrastructure
• Working knowledge of the TCP/IP stack, internet routing, and load balancing
• Basic understanding of cluster orchestrators and schedulers (Kubernetes)
• Deep knowledge of Linux as a production environment, and container technologies. e.g., Docker, Infrastructure as Code such as Terraform, and K8s administration at large scale.
• Have worked on production distributed systems and have an understanding of microservices architecture, RESTful services, and CI/CD.

Platform Services Engineer
DevSecOps Engineer
- Strong Systems Experience- Linux, networking, cloud, APIs
- Scripting language Programming - Shell, Python
- Strong Debugging Capability
- AWS Platform -IAM, Network,EC2, Lambda, S3, CloudWatch
- Knowledge on Terraform, Packer, Ansible, Jenkins
- Observability - Prometheus, InfluxDB, Dynatrace,
- Grafana, Splunk • DevSecOps-CI/CD - Jenkins
- Microservices
- Security & Access Management
- Container Orchestration a plus - Kubernetes, Docker etc.
- Big Data Platforms knowledge EMR, Databricks. Cloudera a plus
• Expertise in any one hyper-scale (AWS/AZURE/GCP), including basic services like networking, data and workload management.
o AWS
Networking: VPC, VPC Peering, Transit Gateway, RouteTables, SecurityGroups, etc.
Data: RDS, DynamoDB, ElasticSearch
Workload: EC2, EKS, Lambda, etc.
o Azure
Networking: VNET, VNET Peering,
Data: Azure MySQL, Azure MSSQL, etc.
Workload: AKS, VirtualMachines, AzureFunctions
o GCP
Networking: VPC, VPC Peering, Firewall, Flowlogs, Routes, Static and External IP Addresses
Data: Cloud Storage, DataFlow, Cloud SQL, Firestore, BigTable, BigQuery
Workload: GKE, Instances, App Engine, Batch, etc.
• Experience in any one of the CI/CD tools (Gitlab/Github/Jenkins) including runner setup, templating and configuration.
• Kubernetes experience or Ansible Experience (EKS/AKS/GKE), basics like pod, deployment, networking, service mesh. Used any package manager like helm.
• Scripting experience (Bash/python), automation in pipelines when required, system service.
• Infrastructure automation (Terraform/pulumi/cloudformation), write modules, setup pipeline and version the code.
Optional
• Experience in any programming language is not required but is appreciated.
• Good experience in GIT, SVN or any other code management tool is required.
• DevSecops tools like (Qualys/SonarQube/BlackDuck) for security scanning of artifacts, infrastructure and code.
• Observability tools (Opensource: Prometheus, Elasticsearch, OpenTelemetry; Paid: Datadog, 24/7, etc)
Azure DeVops
On premises to Azure Migration
Docker, Kubernetes
Terraform CI CD pipeline
9+ Location
Budget – BG, Hyderabad, Remote , Hybrid –
Budget – up to 30 LPA
Roles and Responsibilities
- Primary stakeholder collaborating with Dir Engineering on software/infrastructure architecture, monitoring/alerting framework and all other architectural level technical issues
- Design and manage implementation of Silvermine’s high performance, scalable, extensible and resilient microservices application stack based of existing, partially migrated monolithic application and for new product development. Includes:
- Utilizing either ECS Fargate (no EC2 clusters) or EKS as the orchestration framework – to be tested up to a minimum of 100k concurrent users
- Exploring, designing and implementing use of on demand compute (Lambda) where appropriate
- Scalable and redundant data architecture supporting microservices design principles
- A scalable reverse proxy layer to isolate microservices from managing network connections
- Utilizing CDN capabilities to offload origin load via an intelligent caching strategy
- Leveraging best in breed AWS service offerings to enable team to focus on application stack instead of application scaffolding while minimizing operational complexity and cost
- Monitoring and optimizing of stack for
- Security and monitoring
- Leverage AWS and 3rd party services to monitor the application stack and data; secure them from DDOS attacks and security breaches; and alert the team in the vent of an incident
- Using APM and logging tools:
- Monitor application stack and infrastructure component performance
- Proactively detect, triage and mitigate stack performance issues
- Alert upon exception events
- Provide triaging tools for debugging and Root Cause Analysis.
- Enhance the CI/CD pipeline to support automated testing, a resilient deployment model (e.g., blue-green, canary) and 100% rollback support (including the data layer)
- Development a comprehensive, supportable, repeatable IAC implementation using CloudFormation or Terraform
- Take a leadership role and exhibit expertise in the development of standards, architectural governance, design patterns, best practices and optimization of existing architecture.
- Partner with teams and leaders to provide strategic consultation for business process design/optimization, creating strategic technology road maps, performing rapid prototyping and implementing technical solutions to accelerate the fulfillment of the business strategic vision.
- Staying up to date on emerging technologies (AI, Automation, Cloud etc.) and trends with a clear focus on productivity, ease of use and fit-for-purpose, by researching, testing, and evaluating.
- Providing POCs and product implementation guidelines.
- Applying imagination and innovation by creating, inventing, and implementing new or better approaches, alternatives and breakthrough ideas that are valued by customers within the function.
- Assessing current state of solutions, defining future state needs, identifying gaps and recommending new technology solutions and strategic business execution improvements.
- Overseeing and facilitating the evaluation and selection technology, product standards and the design of standard configurations/implementation patterns.
- Partnering with other architects and solution owners to create standards and set strategies for the enterprise.
- Communicating directly with business colleagues on applying digital workplace technologies to solve identified business challenges.
Skills Required:
- Good mentorship skills to coach and guide the team on AWS DevOps.
- Jenkins, Python, Pipeline as Code, Cloud Formation Templates and Terraform.
- Experience with Dockers, Containers, Lambda and Fargate is a must
- Experience with CI/CD and Release management
- Strong proficiency in PowerShell scripting
- Demonstrable expertise in Java
- Familiarity with REST APIs
Qualifications:
- Minimum of 5 years of relevant experience in Devops.
- Bachelors or Masters in Computer Science or equivalent degree.
- AWS Certifications is added advantage
As an Infrastructure Engineer at Navi, you will be building a resilient infrastructure platform, using modern Infrastructure engineering practices.
You will be responsible for the availability, scaling, security, performance and monitoring of the navi Cloud platform. You’ll be joining a team that follows best practices in infrastructure as code
Your Key Responsibilities
- Build out the Infrastructure components like API Gateway, Service Mesh, Service Discovery, container orchestration platform like kubernetes.
- Developing reusable Infrastructure code and testing frameworks
- Build meaningful abstractions to hide the complexities of provisioning modern infrastructure components
- Design a scalable Centralized Logging and Metrics platform
- Drive solutions to reduce Mean Time To Recovery(MTTR), enable High Availability.
What to Bring
- Good to have experience in managing large scale cloud infrastructure, preferable AWS and Kubernetes
- Experience in developing applications using programming languages like Java, Python and Go
- Experience in handling logs and metrics at a high scale.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
DevOps Engineer responsibilities include deploying product updates, identifying production issues, and implementing integrations that meet customer needs. If you have a solid background in working with cloud technologies, set up efficient deployment processes, and are motivated to work with diverse and talented teams, we’d like to meet you.
Ultimately, you will execute and automate operational processes fast, accurately, and securely.
Skills and Experience
-
4+ years of experience in building infrastructure experience with Cloud Providers ( AWS, Azure, GCP)
-
Experience in deploying containerized applications build on NodeJS/PHP/Python to kubernetes cluster.
-
Experience in monitoring production workload with relevant metrics and dashboards.
-
Experience in writing automation scripts using Shell, Python, Terraform, etc.
-
Experience in following security practices while setting up the infrastructure.
-
Self-motivated, able, and willing to help where help is needed
-
Able to build relationships, be culturally sensitive, have goal alignment, have learning agility
Roles and Responsibilities
-
Manage various resources across different cloud providers. (Azure, AWS, and GCP)
-
Monitor and optimize infrastructure cost.
-
Manage various kubernetes clusters with appropriate monitoring and alerting setup.
-
Build CI/CD pipelines to orchestrate provisioning and deployment of various services into kubernetes infrastructure.
-
Work closely with the development team on upcoming features to determine the correct infrastructure and related tools.
-
Assist the support team with escalated customer issues.
-
Develop, improve, and thoroughly document operational practices and procedures.
-
Responsible for setting up good security practices across various clouds.

