
Why this role exists
Our infrastructure footprint is growing faster than our headcount, and we believe most of that
gap should be closed by automation and AI agents — not by hiring more humans to do toil. We
need someone early in their career who treats manual work as a bug, ships scripts and agents
instead of tickets, and wants to grow into deeper ownership over the next two years.
You will not be the most senior person on the team. You will be the one who multiplies the team.
What you'll own
In your first month
• Take ownership of one slice of our CI/CD pipeline and make it measurably
faster, more reliable, or cheaper. We expect a number on a dashboard to move.
• Build at least three internal automations that replace manual ops toil —
using AI agents (Claude Code, agentic CLIs, scripted LLM workflows) as your force
multiplier.
• Be the first responder for a defined set of alerts. Write the runbooks. Drive
the alert volume down.
• Support senior engineers on AI/ML infrastructure (GPU nodes, inference
services, model deployment) — observe, document, and gradually take on contained
changes under review.
By 3 months you should be
• The go-to person for at least two production systems.
• Shipping routine infrastructure changes without needing senior review.
• Treating "manual" as a code smell.
Required (we will reject without these)
• 0–3 years hands-on experience with one major cloud (AWS, GCP, or
Azure — one is fine, depth beats breadth).
• Fluent in Linux command line, bash, and at least one scripting language
(Python or Go preferred).
• Have shipped something to production that real users hit. A side project
counts; a graded coursework lab does not.
• Comfortable with Docker — you can explain what an image vs. a
container is and why it matters.
• Working knowledge of networking fundamentals: DNS, HTTP/HTTPS,
TLS, ports, basic subnets — enough to debug "it works on my machine."
• Git fluency: branches, merges, rebases, conflict resolution.
• CI/CD pipelines — you have authored or substantially modified pipelines
in GitHub Actions, GitLab CI, ArgoCD, Jenkins, or similar. Not just "I clicked Re-run."
• Kubernetes basics — kubectl for real work, can read pod logs,
understand deployments and services, can debug a CrashLoopBackOff without
panicking. You do not need to have run a cluster; you do need to have lived inside one.
• Active user of AI coding agents (Claude Code, Cursor, Copilot, agentic
CLIs, etc.). You should be able to walk us through specific tasks where they made you
faster, and specific tasks where they failed you and how you noticed. "I have tried it" is
not enough.
Bonus (real plus, not required)
• Infrastructure as Code: Terraform, Pulumi, or Ansible.
• Observability: Prometheus/Grafana, Datadog, OpenTelemetry, any APM.
• Have built or extended an LLM-based agent — a custom MCP server, a
scripted multi-step workflow, an internal tool that calls models in a loop. Anything beyond
chat-with-Claude.
• Exposure to GPU workloads, model serving (vLLM, Triton, TGI, etc.), or
ML pipelines.
What we don't care about
• Whether your degree is in CS — or whether you have a degree at all.
• Brand-name companies on your resume.
• Certifications. They are fine. They do not substitute for having shipped.
How we work
• We default to automation. If you do something manually twice, the third
time you script it or hand it to an agent.
• AI agents are part of the workflow, not a novelty. Expect interview
questions about exactly how you use them — and where you have caught them being
wrong.
• Small, reversible changes beat big-bang rollouts.
• Postmortems are blameless and written down.
• We push back on each other. If you only execute, you will be unhappy
here.
How to apply
Send:
• Your resume.
• A short note (≤200 words) describing one infra or automation problem you
solved, and how AI agents factored in — or did not, and why. We read these. Generic
notes get rejected.
Internal note — delete before posting externally
• Comp band, location policy, team name, and reporting line marked
[CONFIRM] need to be filled in before this goes external.
• The Required list is intentionally tight: CI/CD and Kubernetes basics
promoted from bonus. Expect this to filter ~80% of typical junior DevOps applicants. The
remaining pool will skew toward people who have actually shipped infra at a startup, not
bootcamp grads or pure cloud-cert holders.
• IaC, observability, agent-building, and GPU/ML serving stay as bonus.
Promoting any of these to required at 0–3 yrs collapses the pool to near-zero or forces
hiring senior people at junior comp. If you want IaC required, re-level this to mid (3–5
yrs) and raise the band.
• Screening implication: the resume screen should explicitly check for
CI/CD pipeline authorship and any K8s-touching production work. If neither is on the
resume, reject at screen. Do not waste interview slots.
• Pipeline watch: if fewer than ~15 qualified resumes after 2 weeks of
active sourcing, the first thing to relax is the AI-agent-fluency bar (move to bonus and
screen for it in interview instead). Do not relax the "shipped to production" requirement
— that is the load-bearing filter.

About Zocket
Required Skills
● Experience: Minimum of 5 years of professional experience in a DevOps Engineer
role
● Cloud Proficiency: Proven experience with at least one major cloud provider (AWS,
Azure, or GCP).
● Scripting & Programming: Strong scripting skills in languages such as Bash, Python,
or Go.
● IaC Tools: Hands-on experience with Terraform.
● Container Technology: Expertise in Docker and Kubernetes.
● CI/CD Tools: Proficient with CI/CD platforms like Jenkins, GitLab CI, or Travis CI.
● Configuration Management: Experience with configuration management tools like
Ansible, Chef, or Puppet.
● Version Control: Strong knowledge of Git and branching strategies.
● Problem-Solving: Excellent problem-solving abilities and a commitment to automation
and continuous improvement.
Location: Bangalore preferred / Hybrid as applicable
Experience: 3+ years
Education: B.E/B.Tech in Computer Science, Engineering or a related technical discipline
Salary: Above market standards, flexible for the right candidate
Career growth: Long-term opportunity with potential to lead DevOps architecture and cloud platform operations
About FrontM
FrontM builds software platforms for frontline workforces operating in remote and low-connectivity environments, with a strong focus on the maritime industry. The platform supports communication, collaboration, healthcare, learning, welfare and operational workflows across mobile, web, kiosk and connected device environments.
The platform runs across cloud infrastructure, constrained networks and specialised customer environments, requiring reliable DevOps practices, strong observability, secure architecture and careful operational discipline.
Role Summary
As a Senior DevOps Engineer, you will take ownership of FrontM’s AWS cloud infrastructure, CI/CD pipelines, platform reliability and technical operations. You will work closely with the VP of Delivery, CTO and CEO to maintain secure, scalable and high-availability infrastructure for FrontM’s production systems.
This role requires strong hands-on DevOps experience, broad AWS knowledge, Kubernetes experience and the ability to troubleshoot complex networking and production issues across multi-domain SaaS environments.
Key Responsibilities
Cloud Infrastructure & DevOps Architecture (≈45%)
· Own, maintain and improve AWS cloud infrastructure for FrontM platforms
· Create and maintain Terraform scripts for infrastructure deployment and management
· Manage Kubernetes workloads deployed within AWS EKS
· Support multi-zone AWS infrastructure design for availability, resilience and scale
· Maintain AWS services including Route 53, EC2, API Gateway, VPC, VPN, AWS Cognito, ElastiCache, DynamoDB and Lambda
· Contribute to DevOps architecture planning in line with FrontM’s platform roadmap
CI/CD, Operations & Platform Reliability (≈35%)
· Build, maintain and improve CI/CD pipelines for backend and platform services
· Oversee technical operations with hands-on administration, monitoring and release support
· Ensure continuous server uptime, stability, performance and maintainability
· Debug, respond to and restore system outages in production and staging environments
· Improve observability across infrastructure and applications, including migration from Elastic stack to logz.io
· Support backend stability, scale and performance across Node.js, Java and related services
Security, Networking & Production Support (≈20%)
· Maintain AWS security configurations, access controls and monitoring practices
· Support complex networking requirements across multi-domain SaaS implementations
· Troubleshoot network, infrastructure and access issues with internal teams and customer-side users
· Work with backend teams to support API integrations and infrastructure abstractions for complex requirements
· Document operational procedures, incident findings and technical support steps clearly
Required Technical Skills
Cloud Infrastructure & AWS
· Strong hands-on experience with AWS infrastructure and cloud operations
· Experience with Route 53, EC2, API Gateway, VPC, VPN, AWS Cognito, ElastiCache, DynamoDB and Lambda
· Experience with AWS security setup, monitoring and multi-zone infrastructure
· Ability to manage infrastructure using Terraform
Kubernetes, CI/CD & Observability
· Strong experience with Kubernetes, preferably AWS EKS
· Extensive CI/CD and DevOps experience
· Experience with infrastructure observability and application monitoring tools
· Ability to diagnose production bottlenecks, server failures and performance issues
Backend, Networking & SaaS Operations
· Experience supporting Node.js, Java and backend system procedures for stability and scale
· Good understanding of APIs, integrations and backend service dependencies
· Experience with complex networking and multi-domain SaaS implementations
· Ability to troubleshoot technical issues with non-technical end users
Nice to Have
· Experience with MongoDB clusters in MongoDB Atlas
Personal Attributes
· Strong ownership mindset for uptime, reliability and production stability
· Practical problem-solving approach with the ability to act quickly during incidents
· Clear written and spoken communication in English
· Ability to work independently and coordinate with senior management when required
· Comfortable working in fast-moving engineering teams
· Attention to detail in security, monitoring, documentation and operational processes
Why join FrontM?
Long-Term Career Growth
Opportunity to work on cloud infrastructure used by global maritime and remote workforce customers, with scope to grow into DevOps architecture and platform leadership roles.
Engineering Challenges That Matter
Work on infrastructure that supports applications used in remote, low-bandwidth and operationally demanding environments.
Broad Technical Ownership
Take responsibility across cloud infrastructure, Kubernetes, CI/CD, observability, networking, security and production reliability.
Apply now
Join a team focused on building reliable software infrastructure for real-world use cases and contribute to systems used across the global maritime workforce.
The DevOps Engineer will play a critical role in operationalizing artificial intelligence across Bell Techlogix client environments. This role focuses on building and supporting cloud infrastructure, CI/CD pipelines, and automation frameworks that power AI and machine learning workloads. The ideal candidate has experience supporting AI platforms such as Azure AI, Azure Machine Learning, Azure OpenAI, and ServiceNow or conversational AI platforms, and understands the operational requirements of production AI systems, including reliability, scalability, and security.
Key Responsibilities
•Design, build, and operate cloud infrastructure and platform services that support AI and machine learning workloads in production, SLA-driven managed services environments
•Implement CI/CD and MLOps pipelines to enable automated training, testing, deployment, and rollback of AI and ML models
•Develop and maintain Infrastructure as Code to provision AI-ready environments consistently across dev/test/prod
•Support AI platform operations including monitoring model health, pipeline execution, compute utilization, and data dependencies
•Partner with Machine Learning Engineers and Data Engineers to standardize deployment patterns for AI services and LLM-based solutions
•Enable secure and scalable AI integrations using APIs, messaging, and event-driven architectures
•Implement observability solutions for AI platforms, including logging, metrics, alerting, and drift detection integrations
•Troubleshoot AI platform incidents, perform root cause analysis, and implement remediation to improve reliability and automation coverage
•Apply security best practices for AI environments including secrets management, identity and access controls, network isolation, and policy enforcement
•Support AI-driven automation use cases across platforms such as Microsoft Copilot, ServiceNow, and conversational AI tools
•Collaborate with service desk, security, and architecture teams to continuously improve AI service delivery and operational maturity
Required Qualifications
•Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
•5+ years of experience in DevOps, cloud engineering, or platform operations, with exposure to AI or data workloads
•Hands-on experience with Microsoft Azure, including compute, networking, storage, and monitoring services
•Experience building CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools
•Working knowledge of Infrastructure as Code (Terraform and/or Bicep/ARM)
•Scripting experience using PowerShell and/or Python
•Experience supporting production platforms with incident management, change control, and root cause analysis
•Understanding of cloud security fundamentals and enterprise governance requirements
Preferred Qualifications
•Experience with Azure Machine Learning, Azure AI Services, Azure OpenAI, or MLOps frameworks
•Exposure to containerization and orchestration technologies (Docker, Kubernetes, AKS)
•Experience supporting data pipelines or feature stores used by machine learning systems
•Familiarity with ServiceNow, AI-driven ITSM workflows, or automation platforms
•Experience with observability tools
•Knowledge of Responsible AI, data governance, and compliance considerations for AI systems
•Relevant certifications (Microsoft Azure Administrator, Azure DevOps Engineer, Azure AI Engineer)
About us
Classplus is India's largest B2B ed-tech start-up, enabling 1 Lac+ educators and content creators to create their digital identity with their own branded apps. Starting in 2018, we have grown more than 10x in the last year, into India's fastest-growing video learning platform.
Over the years, marquee investors like Tiger Global, Surge, GSV Ventures, Blume, Falcon Capital, RTP Global, and Chimera Ventures have supported our vision. Thanks to our awesome and dedicated team, we achieved a major milestone in March this year when we secured “Series-D” funding.
Now as we go global, we are super excited to have new folks on board who can take the rocketship higher🚀. Do you think you have what it takes to help us achieve this? Find Out Below!
What will you do?
· Define the overall process, which includes building a team for DevOps activities and ensuring that infrastructure changes are reviewed from an architecture and security perspective
· Create standardized tooling and templates for development teams to create CI/CD pipelines
· Ensure infrastructure is created and maintained using Terraform
· Work with various stakeholders to design and implement infrastructure changes to support new feature sets in various product lines.
· Maintain transparency and clear visibility of costs associated with various product verticals, environments and work with stakeholders to plan for optimization and implementation
· Spearhead continuous experimenting and innovating initiatives to optimize the infrastructure in terms of uptime, availability, latency and costs
You should apply, if you
1. Are a seasoned Veteran: Have managed infrastructure at scale running web apps, microservices, and data pipelines using tools and languages like JavaScript (Node.js), Go, Python, Java, Erlang, Elixir, C++ or Ruby (experience in any one of them is enough)
2. Are a Mr. Perfectionist: You have a strong bias for automation and taking the time to think about the right way to solve a problem versus quick fixes or band-aids.
3. Bring your A-Game: Have hands-on experience and ability to design/implement infrastructure with GCP services like Compute, Database, Storage, Load Balancers, API Gateway, Service Mesh, Firewalls, Message Brokers, Monitoring, Logging and experience in setting up backups, patching and DR planning
4. Are up with the times: Have expertise in one or more cloud platforms (Amazon Web Services, Google Cloud Platform or Microsoft Azure), and have experience in creating and managing infrastructure entirely through a tool such as Terraform
5. Have it all on your fingertips: Have experience building CI/CD pipelines using Jenkins and Docker for applications running mainly on Kubernetes. Hands-on experience in managing and troubleshooting applications running on K8s
6. Have nailed the data storage game: Good knowledge of relational and NoSQL databases (MySQL, Mongo, BigQuery, Cassandra…)
7. Bring that extra zing: Have the ability to program/script, and strong fundamentals in Linux and networking.
8. Know your toys: Have a good understanding of microservices architecture and Big Data technologies. Experience with highly available distributed systems, scaling data store technologies, and creating multi-tenant and self-hosted environments is a plus
Being Part of the Clan
At Classplus, you’re not an “employee” but a part of our “Clan”. So, you can forget about being bound by the clock as long as you’re crushing it workwise😎. Add to that some passionate people working with and around you, and what you get is the perfect work vibe you’ve been looking for!
It doesn’t matter how long your journey has been or your position in the hierarchy (we don’t do Sirs and Ma’ams); you’ll be heard, appreciated, and rewarded. One can say, we have a special place in our hearts for the Doers! ✊🏼❤️
Are you a go-getter with the chops to nail what you do? Then this is the place for you.
Skills: Strong experience in Ansible, Cloud, Linux, Python or Shell or Bash scripting
Experience: 3 - 6 Years
Location: Bangalore
Good to have cloud skills - Docker / Kubernetes
Scripting skills - Any of Shell / Perl / Bash / Python
Good to have Terraform
Position Level: Senior Engineer
Company Overview:
AskSid.ai is a fast-growing, 4-year-old start-up based in Bangalore, cofounded by
two ex-Mindtree employees, each with 20+ years of experience. We were rated the No. 1
emerging SaaS company in India and won the NASSCOM EMERGE 50 - League of 10
award in 2019. We were also rated the most innovative AI company in India for 2020 by
CII and Accenture Ventures. As a growing company, we are looking for passionate
engineers who aspire to build world-class technology products of internet scale.
Job purpose:
Set up, optimize, and maintain Kubernetes clusters on Microsoft Azure Cloud.
Responsibilities
● Set up, maintain, optimize, and secure various Kubernetes clusters on MS Azure
Cloud
● Set up and maintain containers, container availability, auto-scaling, storage
management, DNS and proxy configuration, as well as firewalls, app gateways, and load
balancers on MS Azure Cloud.
● Build and manage backup, restore, and DR activities
Knowledge and skills
Education and Experience
- Engineering degree in computer science
- 3-5 years of experience in setup and management of Kubernetes infrastructure
- Expert-level analytical and problem-solving skills
- Ability to communicate clearly in English
- Microsoft Azure
- Kubernetes, AKS services as well as custom clusters on bare metal infrastructure
- Linux internals & services
- Docker, Docker Registry
- NGINX, Load Balancing, Firewall, Security, PKI
- Shell & Awk Script, Azure Templates & scripting, Python Scripting
- Knowledge of NoSQL Databases
- Experience working in advanced iterative methodologies such as Agile and SAFe.
- Experience with containers and orchestration (Docker, Kubernetes).
- Comfortable working in complex and demanding environments with high degree of change.
- Ability to view system perspective and to perform thorough investigations.
- Experience in frequent delivery to production.
- Microservice-based architecture (Jenkins, Docker, CI/CD, ELK)
- Experience with modern software components (Mongo, Elasticsearch, Kafka).
- Expertise in software development methodologies.
- Understanding of protocols/technologies like HTTP, SSL, LDAP, SSH, SAML, etc.
- Possession of a deep knowledge of development workflows with Git.
- Experience with MySQL or another relational database Environment.
- Automation testing (component, integration and end2end)
Job Location: Jaipur
Experience Required: Minimum 3 years
About the role:
As a DevOps Engineer for Punchh, you will be working with our developers, SRE, and DevOps teams implementing our next-generation infrastructure. We are looking for a self-motivated, responsible team player who loves designing systems that scale. Punchh provides a rich engineering environment where you can be creative, learn new technologies and solve engineering problems, all while delivering business objectives. The DevOps culture here is one of immense trust and responsibility. You will be given the opportunity to make an impact, as there are no silos here.
Responsibilities:
- Deliver SLA and business objectives through whole-lifecycle design of services, from inception to implementation.
- Ensure availability, performance, security, and scalability of AWS production systems
- Scale our systems and services through continuous integration, infrastructure as code, and gradual refactoring in an agile environment.
- Maintain services once a project is live by monitoring and measuring availability, latency, and overall system and application health.
- Write and maintain software that runs the infrastructure that powers the Loyalty and Data platform for some of the world’s largest brands.
- Be on call 24x7, in shifts, for Level 2 and higher escalations
- Respond to incidents and write blameless RCAs/postmortems
- Implement and practice proper security controls and processes
- Provide recommendations for architecture and process improvements.
- Define and deploy systems for metrics, logging, and monitoring on the platform.
Must have:
- Minimum 3 Years of Experience in DevOps.
- BS degree in Computer Science, Mathematics, Engineering, or equivalent practical experience.
- Strong inter-personal skills.
- Must have experience in CI/CD tooling such as Jenkins, CircleCI, TravisCI
- Must have experience in Docker, Kubernetes, Amazon ECS or Mesos
- Experience in code development in at least one high-level programming language from this list: Python, Ruby, Go, Groovy
- Proficient in shell scripting, and most importantly, know when to stop scripting and start developing.
- Experience in creating highly automated infrastructures with tools such as Terraform, CloudFormation or Ansible.
- In-depth knowledge of the Linux operating system and administration.
- Production experience with a major cloud provider such as AWS.
- Knowledge of web server technologies such as Nginx or Apache.
- Knowledge of Redis, Memcache, or one of the many in-memory data stores.
- Experience with various load balancing technologies such as Amazon ALB/ELB, HA Proxy, F5.
- Comfortable with large-scale, highly-available distributed systems.
Good to have:
- Understanding of Web Standards (REST, SOAP APIs, OWASP, HTTP, TLS)
- Production experience with Hashicorp products such as Vault or Consul
- Expertise in designing, analyzing and troubleshooting large-scale distributed systems.
- Experience in a PCI environment
- Experience with Big Data distributions from Cloudera, MapR, or Hortonworks
- Experience maintaining and scaling database applications
- Knowledge of fundamental systems engineering principles such as CAP Theorem, Concurrency Control, etc.
- Understanding of network fundamentals: OSI, TCP/IP, topologies, etc.
- Understanding of infrastructure auditing, and the ability to help the organization control infrastructure costs.
- Experience in Kafka, RabbitMQ or any messaging bus.
Requirements:
- Must have a good understanding of Python and Shell scripting with industry-standard coding conventions
- Must possess good coding and debugging skills
- Experience in Design & Development of test framework
- Experience in Automation testing
- Good to have experience in Jenkins framework tool
- Good to have exposure to Continuous Integration process
- Experience in Linux and Windows OS
- Desirable to have Build & Release Process knowledge
- Experience in Automating Manual test cases
- Experienced in automating OS / FW related tasks
- Understanding of BIOS / FW QA is a strong plus
- OpenCV experience is a plus
- Good to have platform exposure
- Must have good Communication skills
- Good leadership and collaboration capabilities, as the individual will have to work with multiple teams, single-handedly maintain the automation framework and enable the manual validation team
DevOps Engineer
Skills
- Building a scalable and highly available infrastructure for data science
- Knows data science project workflows
- Hands-on with deployment patterns for online/offline predictions (server/serverless)
- Experience with either Terraform or Kubernetes
- Experience with ML deployment frameworks like Kubeflow, MLflow, SageMaker
- Working knowledge of Jenkins or a similar tool
Responsibilities
- Own all the ML cloud infrastructure (AWS)
- Help build out an entire CI/CD ecosystem with auto-scaling
- Work with a testing engineer to design testing methodologies for ML APIs
- Research and implement new technologies
- Help with cost optimization of infrastructure
- Knowledge sharing
Nice to Have
- Develop APIs for machine learning
- Can write Python servers for ML systems with API frameworks
- Understanding of task queue frameworks like Celery










