
Site Reliability Engineer
Experience - 4 - 8 Years
Location - Bangalore (Hybrid)
We are seeking a highly skilled Site Reliability Engineer (SRE) to design, build, and operate scalable, reliable, and secure cloud-native platforms. The ideal candidate will have strong experience with Kubernetes ecosystems, cloud infrastructure, automation, observability, and GitOps practices.
Key Responsibilities
- Manage and optimize Kubernetes-based platforms, including Cilium, Istio, Ingress Controllers, and related ecosystem components.
- Design, deploy, and maintain infrastructure on Google Cloud Platform (GCP).
- Automate infrastructure provisioning and lifecycle management using Terraform.
- Implement and manage GitOps workflows using ArgoCD and GitLab.
- Deploy and maintain Helm charts for Kubernetes applications.
- Manage secrets, service discovery, and distributed systems using Vault and Consul.
- Build and maintain monitoring, logging, and observability platforms using Prometheus Operator and the Grafana Stack (Grafana, Mimir, Loki, Alloy, Tempo, and Pyroscope).
- Collaborate with development teams to improve platform reliability, performance, scalability, and operational excellence.
- Develop CI/CD pipelines and automation to support modern cloud-native deployments.
Required Skills
- Strong hands-on experience with Kubernetes (K8s) and cloud-native technologies.
- Experience with GCP, Terraform, Helm, and ArgoCD.
- Knowledge of Service Mesh technologies, particularly Istio and Cilium.
- Experience with Vault, Consul, and infrastructure security best practices.
- Strong expertise in observability tools including Prometheus and the Grafana ecosystem.
- Proficiency with GitOps, GitLab, CI/CD pipelines, and automation.
- Good understanding of Linux systems, networking, and troubleshooting in distributed environments.
Preferred Qualifications
- Experience operating large-scale production environments.
- Knowledge of SRE principles, incident management, capacity planning, and reliability engineering.
- Relevant cloud-native certifications (CKA, GCP, Terraform, etc.) are a plus.

About Improving
Improving is a leading IT professional services firm committed to helping companies achieve lasting success through modern technology. With core expertise in AI, Data, and Applications, we specialize in transforming legacy systems, building cloud-native platforms, and delivering intelligent, future-ready solutions for today’s complex business needs. Improving’s leaders are equally committed to fostering a great place to work that is inclusive and purpose-centered, empowering Improvers to bring their whole selves to work. Our team is known for its collaborative approach and long-term partnerships that prioritize measurable outcomes. By combining technical excellence with strategic insight, Improving enables all stakeholders to grow, adapt, and lead in an ever-evolving digital landscape.
Work mode- WFO 5 days
Location: Hyderabad (Onsite)
Experience- 4-6 yrs
- K8s Hands-on experience
- Linux Troubleshooting Skills
- Experience with on-prem servers and their management
- Helm
- Docker
- Ingress and Ingress Controllers
- Networking Basics
- Proficient Communication
Must-Have Skills:
- Hands-on experience with air-gapped Kubernetes clusters, ideally in regulated industries (finance, healthcare, etc.).
- Strong expertise in CI/CD pipelines, programmable infrastructure, and automation.
- Proficiency in Linux troubleshooting, observability (Prometheus, Grafana, ELK), and multi-region disaster recovery.
- Security & compliance knowledge for regulated industries.
- Preferred: Experience with GKE, RKE, Rook-Ceph, and certifications like CKA, CKAD.
Who You Are
- A Kubernetes expert who thrives on scalability, automation, and security.
- Passionate about optimizing infrastructure, CI/CD, and high-availability systems.
- Comfortable troubleshooting Linux, improving observability, and ensuring disaster recovery readiness.
- A problem solver who simplifies complexity and drives cloud-native adoption.
What You’ll Do
- Architect & automate Kubernetes solutions for air-gapped and multi-region clusters.
- Optimize CI/CD pipelines & cloud-native deployments.
- Work with open-source projects, selecting the right tools for the job.
- Educate & guide teams on modern cloud-native infrastructure best practices.
- Solve real-world scaling, security, and infrastructure automation challenges.
Why Join Us?
- Work on high-impact Kubernetes projects in regulated industries.
- Solve real-world automation & infrastructure challenges with cutting-edge tools.
- Grow in a team that values learning, open-source contributions, and innovation.
Company Overview:
Planview has one mission: to build the future of connected work with market-leading portfolio management and work management solutions. Planview is a recognized innovator and industry leader. Our solutions enable organizations to connect the business from ideas to impact, empowering companies to accelerate the achievement of what matters most. They span every class of work, resource, and organization to address the varying needs of diverse and distributed teams, departments, and enterprises.
As a Sr CloudOps Engineer II, you will oversee teams of Engineers and be a champion for configuration management, technologies in the cloud, and continuous improvement. You will work closely with global leaders to ensure that our applications, infrastructure, and processes are scalable, secure, and supportable. By leveraging your production experience and development skills you will work hand in hand with Engineers (Dev, DevOps, DBOps) to design and implement solutions that improve delivery of value to customers, reduce costs, and eliminate toil.
Responsibilities (What you will do):
- Guide the professional development of Engineers and support the teams to accomplish business goals
- Work closely with leaders in Israel to align on priorities and architect, deliver, and manage our products
- Build systems that are secure, scalable, and self-healing.
- Manage and improve deployment pipelines.
- Triage and remediate production issues.
- Participate in on-call rotations for escalations.
Qualifications (What you will bring):
- Bachelor's degree in CS or equivalent experience in a related field.
- 2+ years managing Engineering teams.
- 8+ years of experience as a site reliability or platform engineer, preferably in a fast-scaling environment
- 5+ years administering Linux and Windows environments.
- 3+ years programming / scripting experience (e.g., Python, JavaScript, PowerShell)
- Strong technical knowledge of operating systems (Linux and Windows), virtualization, storage systems, networking, and firewall implementations
- Maintaining production environments on premises (90%) and in the cloud (10%) (e.g., AWS, Google Cloud, Azure)
- Solid understanding of networking principles and how they apply to data flow and security.
- Automating deployments of cloud based available services (e.g., AWS EC2 / RDS, Docker, Kubernetes)
- Experience managing CI/CD infrastructure, with strong proficiency in platforms like Bitbucket and Jenkins to streamline deployment pipelines and ensure efficient software delivery.
- Management of resources using Infrastructure as Code tools (e.g., CloudFormation, Terraform, Chef)
- Knowledge of observability tools such as LogicMonitor, New Relic, Prometheus, and Coralogix, as well as their implementation.
- Worked within Agile and Lean software development teams.
- Experience working in globally distributed teams.
- Ability to look at the big picture and manage risks.
Experience - 2+ Years
Requirements:
● Should have at least 2 years of DevOps experience
● Should have experience with Kubernetes
● Should have experience with Terraform/Helm
● Should have experience in building scalable server-side systems
● Should have experience in cloud infrastructure and designing databases
● Having experience with NodeJS/TypeScript/AWS is a bonus
● Having experience with WebRTC is a bonus
Job Description:
• Drive end-to-end automation from GitHub/GitLab/Bitbucket to deployment, observability, and enabling SRE activities
• Guide operations support (setup, configuration, management, troubleshooting) of digital platforms and applications
• Solid understanding of DevSecOps workflows that support CI, CS, CD, CM, CT.
• Deploy, configure, and manage SaaS and PaaS cloud platforms and applications
• Provide Level 1 (OS, patching) and Level 2 (app server instance troubleshooting) support
• DevOps programming: writing scripts, building operations/server instance/app/DB monitoring tools
• Set up / manage continuous build and dev project management environment: Jenkins X/GitHub Actions/Tekton, Git, Jira
• Designing secure networks, systems, and application architectures
• Collaborating with cross-functional teams to ensure secure product development
• Disaster recovery, network forensics analysis, and pen-testing solutions
• Planning, researching, and developing security policies, standards, and procedures
• Awareness training of the workforce on information security standards, policies, and best practices
• Installation and use of firewalls, data encryption and other security products and procedures
• Maturity in understanding compliance, policy and cloud governance and ability to identify and execute automation.
• At Wesco, we discuss more about solutions than problems. We celebrate innovation and creativity.
- 3+ years of relevant experience
- 2+ years of experience with AWS (EC2, ECS, RDS, ElastiCache, etc.)
- Well versed in maintaining infrastructure as code (Terraform, CloudFormation, etc.)
- Experience in setting CI/CD pipelines from scratch
- Knowledge of setting up and securing networks (VPN, Intranet, VPC, Peering, etc)
- Understanding of common security issues
About us
Classplus is India's largest B2B ed-tech start-up, enabling 1 Lac+ educators and content creators to create their digital identity with their own branded apps. Starting in 2018, we have grown more than 10x in the last year, into India's fastest-growing video learning platform.
Over the years, marquee investors like Tiger Global, Surge, GSV Ventures, Blume, Falcon, Capital, RTP Global, and Chimera Ventures have supported our vision. Thanks to our awesome and dedicated team, we achieved a major milestone in March this year when we secured a “Series-D” funding.
Now as we go global, we are super excited to have new folks on board who can take the rocketship higher🚀. Do you think you have what it takes to help us achieve this? Find Out Below!
What will you do?
· Define the overall process, which includes building a team for DevOps activities and ensuring that infrastructure changes are reviewed from an architecture and security perspective
· Create standardized tooling and templates for development teams to create CI/CD pipelines
· Ensure infrastructure is created and maintained using Terraform
· Work with various stakeholders to design and implement infrastructure changes to support new feature sets in various product lines.
· Maintain transparency and clear visibility of costs associated with various product verticals, environments and work with stakeholders to plan for optimization and implementation
· Spearhead continuous experimentation and innovation initiatives to optimize the infrastructure in terms of uptime, availability, latency and costs
You should apply, if you
1. Are a seasoned Veteran: Have managed infrastructure at scale running web apps, microservices, and data pipelines using tools and languages like JavaScript(NodeJS), Go, Python, Java, Erlang, Elixir, C++ or Ruby (experience in any one of them is enough)
2. Are a Mr. Perfectionist: You have a strong bias for automation and taking the time to think about the right way to solve a problem versus quick fixes or band-aids.
3. Bring your A-Game: Have hands-on experience and ability to design/implement infrastructure with GCP services like Compute, Database, Storage, Load Balancers, API Gateway, Service Mesh, Firewalls, Message Brokers, Monitoring, Logging and experience in setting up backups, patching and DR planning
4. Are up with the times: Have expertise in one or more cloud platforms (Amazon Web Services, Google Cloud Platform, or Microsoft Azure), and have experience creating and managing infrastructure entirely through a tool such as Terraform
5. Have it all at your fingertips: Have experience building CI/CD pipelines using Jenkins and Docker for applications running mainly on Kubernetes. Hands-on experience in managing and troubleshooting applications running on K8s
6. Have nailed the data storage game: Good knowledge of relational and NoSQL databases (MySQL, Mongo, BigQuery, Cassandra…)
7. Bring that extra zing: Have the ability to program/script and strong fundamentals in Linux and networking.
8. Know your toys: Have a good understanding of microservices architecture and Big Data technologies. Experience with highly available distributed systems, scaling data store technologies, and creating multi-tenant and self-hosted environments is a plus
Being Part of the Clan
At Classplus, you’re not an “employee” but a part of our “Clan”. So, you can forget about being bound by the clock as long as you’re crushing it workwise😎. Add to that some passionate people working with and around you, and what you get is the perfect work vibe you’ve been looking for!
It doesn’t matter how long your journey has been or your position in the hierarchy (we don’t do Sirs and Ma’ams); you’ll be heard, appreciated, and rewarded. One can say, we have a special place in our hearts for the Doers! ✊🏼❤️
Are you a go-getter with the chops to nail what you do? Then this is the place for you.
- Seeking an individual with around 5+ years of experience.
- Must have skills - Jenkins, Groovy, Ansible, Shell Scripting, Python, Linux Admin
- Deep knowledge of Terraform and AWS to automate and provision EC2, EBS, and SQL Server; cost optimization; CI/CD pipelines using Jenkins. Serverless automation is a plus.
- Excellent writing and communication skills in English. Enjoy writing crisp and understandable documentation
- Comfortable programming in one or more scripting languages
- Enjoys tinkering with tooling. Finds easier ways to handle systems by doing some research. Strong awareness of build vs. buy trade-offs.
Experience: 8-10yrs
Notice Period: max 15 days
*Must-haves*
1. Knowledge about Database/NoSQL DB hosting fundamentals (RDS multi-AZ, DynamoDB, MongoDB, and such)
2. Knowledge of different storage platforms on AWS (EBS, EFS, FSx) - mounting persistent volumes with Docker Containers
3. In-depth knowledge of Security principles on AWS (WAF, DDoS, Security Groups, NACL's, IAM groups, and SSO)
4. Knowledge of CI/CD platforms is required (Jenkins, GitHub Actions, etc.) - Migration of AWS CodePipeline pipelines to GitHub Actions
5. Knowledge of a wide variety of AWS services (SNS, SES, SQS, Athena, Kinesis, S3, ECS, EKS, etc.) is required
6. Knowledge of an Infrastructure as Code tool is required. We use CloudFormation (Terraform is a plus); ideally, we would like to migrate from CloudFormation to Terraform
7. Setting up CloudWatch alarms and SMS/email/Slack alerts.
8. Some knowledge of configuring a monitoring tool such as Prometheus, Dynatrace, etc. (We currently use Datadog and CloudWatch)
9. Experience with any CDN provider configurations (Cloudflare, Fastly, or CloudFront)
10. Experience with either Python or Go scripting language.
11. Experience with Git branching strategy
12. Container hosting knowledge on both Windows and Linux
The below list is *Nice to Have*
1. Integration experience with Code Quality tools (SonarQube, NetSparker, etc) with CI/CD
2. Kubernetes
3. CDNs other than CloudFront (Cloudflare, Fastly, etc.)
4. Collaboration with multiple teams
5. GitOps
Cloud native technologies - Kubernetes (EKS, GKE, AKS), AWS ECS, Helm, CircleCI, Harness, Serverless platforms (AWS Fargate etc.)
Infrastructure as Code tools - Terraform, CloudFormation, Ansible
Scripting - Python, Bash
Desired Skills & Experience:
Projects/Internships with coding experience in any of JavaScript, Python, Golang, Java, etc.
Hands-on scripting and software development fluency in any programming language (Python, Go, Node, Ruby).
Basic understanding of Computer Science fundamentals - Networking, Web Architecture etc.
Infrastructure automation experience with knowledge of at least a few of these tools: Chef, Puppet, Ansible, CloudFormation, Terraform, Packer, Jenkins etc.
Bonus points if you have contributed to open source projects or participated in competitive coding platforms like HackerEarth, Codeforces, SPOJ, etc.
You’re willing to learn various new technologies and concepts. The “cloud-native” field of software is evolving fast and you’ll need to quickly learn new technologies as required.
Communication: You like discussing a plan upfront, welcome collaboration, and are an excellent verbal and written communicator.
B.E/B.Tech/M.Tech or equivalent experience.









