
- 2+ years work experience in a DevOps or similar role
- Knowledge of OO programming and concepts (Java, C++, C#, Python)
- A drive towards automating repetitive tasks (e.g., scripting via Bash, Python, etc)
- Fluency in one or more scripting languages such as Python or Ruby.
- Familiarity with Microservice-based architectures
- Practical experience with Docker containerization and clustering (Kubernetes/ECS)
- In-depth, hands-on experience with Linux, networking, server, and cloud architectures.
- Experience with CI/CD tools such as Azure DevOps, AWS CloudFormation, Lambda functions, Jenkins, and Ansible
- Experience with AWS, Azure, or another cloud PaaS provider.
- Solid understanding of configuration, deployment, management, and maintenance of large cloud-hosted systems, including auto-scaling, monitoring, performance tuning, troubleshooting, and disaster recovery
- Proficiency with source control, continuous integration, and testing pipelines
- Effective communication skills
Job Responsibilities:
- Deploy and maintain critical applications on cloud-native microservices architecture.
- Implement automation, effective monitoring, and infrastructure-as-code.
- Deploy and maintain CI/CD pipelines across multiple environments.
- Streamline the software development lifecycle by identifying pain points and productivity barriers and determining ways to resolve them.
- Analyze how customers are using the platform and help drive continuous improvement.
- Support and work alongside a cross-functional engineering team on the latest technologies.
- Iterate on best practices to increase the quality & velocity of deployments.
- Sustain and improve the process of knowledge sharing throughout the engineering team
- Identify and prioritize technical debt that risks instability or creates wasteful operational toil.
- Own daily operational goals with the team.

About Zeus Learning
Zeus Learning is a learning technology solutions provider that focuses on the North American and European educational markets, working with several organizations, from the largest publishers to small nonprofits, over the last 19 years. At Zeus Learning, we leverage leading-edge technology and inclusive design to transform the way learning experiences are created.
Our team of 300+ understands the rigorous demands of the changing educational landscape. We believe that a solution is out there and that by keeping focused on the people we’re designing for and asking the right questions, we’ll get there together. Our line of products and services includes Learning Management Systems, Assessment Management & TEI Systems, Software Skills Simulation Systems, Virtual Classrooms, Learning Portals and Websites, Mobile Apps, as well as interactive content such as learning games and explorations.
Required Skills: Advanced AWS Infrastructure Expertise, CI/CD Pipeline Automation, Monitoring, Observability & Incident Management, Security, Networking & Risk Management, Infrastructure as Code & Scripting
Criteria:
- 5+ years of DevOps/SRE experience in cloud-native, product-based companies (B2C scale preferred)
- Strong hands-on AWS expertise across core and advanced services (EC2, ECS/EKS, Lambda, S3, CloudFront, RDS, VPC, IAM, ELB/ALB, Route53)
- Proven experience designing high-availability, fault-tolerant cloud architectures for large-scale traffic
- Strong experience building & maintaining CI/CD pipelines (Jenkins mandatory; GitHub Actions/GitLab CI a plus)
- Prior experience running production-grade microservices deployments and automated rollout strategies (Blue/Green, Canary)
- Hands-on experience with monitoring & observability tools (Grafana, Prometheus, ELK, CloudWatch, New Relic, etc.)
- Solid hands-on experience with MongoDB in production, including performance tuning, indexing & replication
- Strong scripting skills (Bash, Shell, Python) for automation
- Hands-on experience with IaC (Terraform, CloudFormation, or Ansible)
- Deep understanding of networking fundamentals (VPC, subnets, routing, NAT, security groups)
- Strong experience in incident management, root cause analysis & production firefighting
Description
Role Overview
Company is seeking an experienced Senior DevOps Engineer to design, build, and optimize cloud infrastructure on AWS, automate CI/CD pipelines, implement monitoring and security frameworks, and proactively identify scalability challenges. This role requires someone who has hands-on experience running infrastructure at B2C product scale, ideally in media/OTT or high-traffic applications.
Key Responsibilities
1. Cloud Infrastructure — AWS (Primary Focus)
- Architect, deploy, and manage scalable infrastructure using AWS services such as EC2, ECS/EKS, Lambda, S3, CloudFront, RDS, ELB/ALB, VPC, IAM, Route53, etc.
- Optimize cloud cost, resource utilization, and performance across environments.
- Design high-availability, fault-tolerant systems for streaming workloads.
2. CI/CD Automation
- Build and maintain CI/CD pipelines using Jenkins, GitHub Actions, or GitLab CI.
- Automate deployments for microservices, mobile apps, and backend APIs.
- Implement blue/green and canary deployments for seamless production rollouts.
3. Observability & Monitoring
- Implement logging, metrics, and alerting using tools like Grafana, Prometheus, ELK, CloudWatch, New Relic, etc.
- Perform proactive performance analysis to minimize downtime and bottlenecks.
- Set up dashboards for real-time visibility into system health and user traffic spikes.
4. Security, Compliance & Risk Highlighting
- Conduct frequent risk assessments and identify vulnerabilities in:
  - Cloud architecture
  - Access policies (IAM)
  - Secrets & key management
  - Data flows & network exposure
- Implement security best practices including VPC isolation, WAF rules, firewall policies, and SSL/TLS management.
5. Scalability & Reliability Engineering
- Analyze traffic patterns for OTT-specific load variations (weekends, new releases, peak hours).
- Identify scalability gaps and propose solutions across:
  - Microservices
  - Caching layers
  - CDN distribution (CloudFront)
  - Database workloads
- Perform capacity planning and load testing to ensure readiness for 10x traffic growth.
6. Database & Storage Support
- Administer and optimize MongoDB for high-read/low-latency use cases.
- Design backup, recovery, and data replication strategies.
- Work closely with backend teams to tune query performance and indexing.
7. Automation & Infrastructure as Code
- Implement IaC using Terraform, CloudFormation, or Ansible.
- Automate repetitive infrastructure tasks to ensure consistency across environments.
Required Skills & Experience
Technical Must-Haves
- 5+ years of DevOps/SRE experience in cloud-native, product-based companies.
- Strong hands-on experience with AWS (core and advanced services).
- Expertise in Jenkins CI/CD pipelines.
- Solid background working with MongoDB in production environments.
- Good understanding of networking: VPCs, subnets, security groups, NAT, routing.
- Strong scripting experience (Bash, Python, Shell).
- Experience handling risk identification, root cause analysis, and incident management.
Nice to Have
- Experience with OTT, video streaming, media, or any content-heavy product environments.
- Familiarity with containers (Docker), orchestration (Kubernetes/EKS), and service mesh.
- Understanding of CDN, caching, and streaming pipelines.
Personality & Mindset
- Strong sense of ownership and urgency—DevOps is mission critical at OTT scale.
- Proactive problem solver with ability to think about long-term scalability.
- Comfortable working with cross-functional engineering teams.
Why Join company?
- Build and operate infrastructure powering millions of monthly users.
- Opportunity to shape DevOps culture and cloud architecture from the ground up.
- High-impact role in a fast-scaling Indian OTT product.
JOB DETAILS:
- Job Title: Senior DevOps Engineer 1
- Industry: Ride-hailing
- Experience: 4-6 years
- Working Days: 5 days/week
- Work Mode: ONSITE
- Job Location: Bangalore
- CTC Range: Best in Industry
Required Skills: Cloud & Infrastructure Operations, Kubernetes & Container Orchestration, Monitoring, Reliability & Observability, Proficiency with Terraform, Ansible etc., Strong problem-solving skills with scripting (Python/Go/Shell)
Criteria:
1. Candidate must be from a product-based company or a scalable app-based startup, with experience handling large-scale production traffic.
2. Candidate must have strong Linux expertise with hands-on production troubleshooting and working knowledge of databases and middleware (Mongo, Redis, Cassandra, Elasticsearch, Kafka).
3. Candidate must have solid experience with Kubernetes.
4. Candidate should have strong knowledge of configuration management tools like Ansible, Terraform, and Chef/Puppet. Prometheus and Grafana are an added advantage.
5. Candidate must be an individual contributor with strong ownership.
6. Candidate must have hands-on experience with database migrations and observability tools such as Prometheus and Grafana.
7. Candidate must have working knowledge of Go/Python and Java.
8. Candidate should have working experience with the AWS cloud platform.
9. Candidate should have a minimum tenure of 1.5 years per organization, and a clear reason for relocation.
Description
Job Summary:
As a DevOps Engineer at company, you will be working on building and operating infrastructure at scale, designing and implementing a variety of tools to enable product teams to build and deploy their services independently, improving observability across the board, and designing for security, resiliency, availability, and stability. If the prospect of ensuring system reliability at scale and exploring cutting-edge technology to solve problems excites you, then this is your fit.
Job Responsibilities:
- Own end-to-end infrastructure right from non-prod to prod environment including self-managed DBs.
- Understanding the needs of stakeholders and conveying this to developers.
- Working on ways to automate and improve development and release processes.
- Identifying technical problems and developing software updates and ‘fixes’.
- Working with software developers to ensure that development follows established processes and works as intended.
- Do what it takes to keep the uptime above 99.99%.
- Understand DevOps philosophy and evangelize the principles across the organization.
- Strong communication and collaboration skills to break down the silos
Job Requirements:
- B.Tech. / B.E. degree in Computer Science or equivalent software engineering degree/experience.
- Minimum 4 yrs of experience working as a DevOps/Infrastructure Consultant.
- Strong background in operating systems like Linux.
- Understands the container orchestration tool Kubernetes.
- Proficient knowledge of configuration management tools like Ansible, Terraform, and Chef/Puppet. Prometheus and Grafana are an added advantage.
- Problem-solving attitude, and ability to write scripts using any scripting language.
- Understanding of programming languages like Go/Python and Java.
- Basic understanding of databases and middleware like Mongo/Redis/Cassandra/Elasticsearch/Kafka.
- Should be able to take ownership of tasks, and must be responsible.
- Good communication skills
Job Responsibilities:
- Managing and maintaining the efficient functioning of containerized applications and systems within an organization
- Design, implement, and manage scalable Kubernetes clusters in cloud or on-premise environments
- Develop and maintain CI/CD pipelines to automate infrastructure and application deployments, and track all automation processes
- Implement workload automation using configuration management tools, as well as infrastructure as code (IaC) approaches for resource provisioning
- Monitor, troubleshoot, and optimize the performance of Kubernetes clusters and underlying cloud infrastructure
- Ensure high availability, security, and scalability of infrastructure through automation and best practices
- Establish and enforce cloud security standards, policies, and procedures
- Work with agile technologies
Primary Requirements:
- Kubernetes: Proven experience in managing Kubernetes clusters (min. 2-3 years)
- Linux/Unix: Proficiency in administering complex Linux infrastructures and services
- Infrastructure as Code: Hands-on experience with CM tools like Ansible, as well as the knowledge of resource provisioning with Terraform or other Cloud-based utilities
- CI/CD Pipelines: Expertise in building and monitoring complex CI/CD pipelines to manage the build, test, packaging, containerization and release processes of software
- Scripting & Automation: Strong scripting and process automation skills in Bash, Python
- Monitoring Tools: Experience with monitoring and logging tools (Prometheus, Grafana)
- Version Control: Proficient with Git and familiar with GitOps workflows.
- Security: Strong understanding of security best practices in cloud and containerized environments.
Skills/Traits that would be an advantage:
- Kubernetes administration experience, including installation, configuration, and troubleshooting
- Kubernetes development experience
- Strong analytical and problem-solving skills
- Excellent communication and interpersonal skills
- Ability to work independently and as part of a team
You will be responsible for:
- Managing all DevOps and infrastructure for Sizzle
- We have both cloud and on-premise servers
- Work closely with all AI and backend engineers on processing requirements and managing both development and production requirements
- Optimize the pipeline to ensure ultra fast processing
- Work closely with management team on infrastructure upgrades
You should have the following qualities:
- 3+ years of experience in DevOps, and CI/CD
- Deep experience in: GitLab, GitOps, Ansible, Docker, Grafana, Prometheus
- Strong background in Linux system administration
- Deep expertise with AI/ML pipeline processing, especially with GPU processing. This doesn’t need to include model training, data gathering, etc. We’re looking more for experience in model deployment and inference tasks at scale
- Deep expertise in Python including multiprocessing / multithreaded applications
- Performance profiling including memory, CPU, GPU profiling
- Error handling and building robust scripts that will be expected to run for weeks to months at a time
- Deploying to production servers and monitoring and maintaining the scripts
- DB integration including pymongo and sqlalchemy (we have MongoDB and PostgreSQL databases on our backend)
- Expertise in Docker-based virtualization, including creating and maintaining custom Docker images, deploying Docker images on cloud and on-premise services, and monitoring production Docker images with robust error handling
- Expertise in AWS infrastructure, networking, availability
Optional but beneficial to have:
- Experience with running Nvidia GPU / CUDA-based tasks
- Experience with image processing in Python (e.g., OpenCV, Pillow)
- Experience with PostgreSQL and MongoDB (Or SQL familiarity)
- Excited about working in a fast-changing startup environment
- Willingness to learn rapidly on the job, try different things, and deliver results
- Bachelor's or Master's degree in computer science or related field
- Ideally a gamer or someone interested in watching gaming content online
Skills:
DevOps, Ansible, CI/CD, GitLab, GitOps, Docker, Python, AWS, GCP, Grafana, Prometheus, SQLAlchemy, Linux / Ubuntu system administration
Seniority: We are looking for a mid to senior level engineer
Salary: Will be commensurate with experience.
Who Should Apply:
If you have the right experience, regardless of your seniority, please apply.
Work Experience: 3 years to 6 years
● Improve CI/CD tooling using GitLab.
● Implement and improve monitoring and alerting.
● Build and maintain highly available systems.
● Implement the CI pipeline.
● Implement and maintain monitoring stacks.
● Lead and guide the team in identifying and implementing new technologies.
● Implement and own the CI.
● Manage CD tooling.
● Implement and maintain monitoring and alerting.
● Build and maintain highly available production systems.
Skills
● Experience with configuration management and orchestration tools such as Kubernetes, Ansible, or similar.
● Managing production infrastructure with Terraform, CloudFormation, etc.
● Strong Linux, system administration background.
● Ability to present and communicate the architecture in a visual form.
● Strong knowledge of AWS, Azure, GCP.
- Understanding customer requirements and project KPIs
- Implementing various development, testing, automation tools, and IT infrastructure
- Planning the team structure, activities, and involvement in project management activities.
- Managing stakeholders and external interfaces
- Setting up tools and required infrastructure
- Defining and setting development, test, release, update, and support processes for DevOps operation
- Have the technical skill to review, verify, and validate the software code developed in the project.
- Troubleshooting and fixing code bugs
- Monitoring the processes during the entire lifecycle for adherence, and updating or creating new processes to improve them and minimize waste
- Encouraging and building automated processes wherever possible
- Identifying and deploying cybersecurity measures by continuously performing vulnerability assessment and risk management
- Incident management and root cause analysis
- Coordination and communication within the team and with customers
- Selecting and deploying appropriate CI/CD tools
- Strive for continuous improvement and build continuous integration, continuous delivery, and continuous deployment pipelines (CI/CD pipelines)
- Mentoring and guiding the team members
- Monitoring and measuring customer experience and KPIs
- Managing periodic reporting on the progress to the management and the customer
Job description
The role requires you to design development pipelines from the ground up, create Dockerfiles, and design and operate highly available systems in AWS Cloud environments. It also involves Configuration Management, Web Services Architectures, DevOps Implementation, Database management, Backups, and Monitoring.
Key responsibility area
- Ensure reliable operation of CI/CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creation of Bash/Python scripts for automation
- Performing root cause analysis for production errors.
Requirement
- Proficient in the Linux command line and troubleshooting.
- Proficient in AWS Services. Deployment, Monitoring and troubleshooting applications in AWS.
- Hands-on experience with CI tooling preferably with Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as Code) such as Terraform, AWS CloudFormation, etc.
- Proficient in deployment of applications behind load balancers and proxy servers such as Nginx and Apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with logging, monitoring, and alerting tools like ELK (Elasticsearch, Logstash, Kibana), Nagios, Graylog, Splunk, Prometheus, and Grafana is a plus.
Must Have:
Linux, CI/CD (Jenkins), AWS, Scripting (Bash, Shell, Python, Go), Nginx, Docker.
Good to have
Configuration Management (Ansible or similar tool), Logging tool (ELK or similar), Monitoring tool (Nagios or similar), IaC (Terraform, CloudFormation).
This person MUST have:
- B.E Computer Science or equivalent
- 2+ Years of hands-on experience troubleshooting and setting up Linux environments, with the ability to write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ Years of hands-on experience setting up CI/CD from SCRATCH in Jenkins & GitLab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Min 3 years of experience as SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2.
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We don't believe in locking in people with large notice periods. You will stay here because you love the company. We have only a 15-day notice period.
As a DevOps Engineer Consultant, you will be responsible for Continuous Integration, Continuous Development, and Continuous Delivery, with a strong understanding of a business-driven software integration and delivery approach. You will report to the Technical Lead.
Responsibilities & Duties
• Ideate and create CI and CD processes and their documentation.
• Ideate and create Code Maintenance processes using VisualSVN/Jenkins.
• Design and implement new learning tools or knowledge
Job requirements:
• Should be able to research and design a Code Maintenance process from scratch.
• Should be able to research and design a Continuous Integration process from scratch.
• Should be able to research and design a Continuous Development process from scratch.
• Should be able to research and design a Continuous Delivery process from scratch.
• Should have worked with InstallShield for creating installers.
• In-depth understanding of the principles and best practices of Software Configuration Management (SCM) in Agile, Scrum, and Waterfall methodologies.
• Experienced in Windows and Linux environments. Good knowledge and understanding of database and application server administration in a global production environment.
• Should have a good understanding and knowledge of Windows and Linux server deployment.
• Should have a good understanding and knowledge of application hosting on Windows IIS.
• Experienced in VisualSVN, GitLab CI, and Jenkins for CI and for end-to-end automation of all builds and CD, mostly with products developed using .NET technology.
• Experienced in working with version control systems like Git and source code management client tools like Git Bash, GitHub, and GitLab.
• Experience in using Maven/Ant/Bamboo as build tools for building deployable artifacts.
• Knowledge of using protocols such as FTP, SFTP, SSH, HTTP, HTTPS, and Connect:Direct.
• Experienced in deploying database changes to Oracle, Db2, MSSQL, and MySQL databases.
• Work experience supporting multiple platforms such as Windows, UNIX, Linux, and Ubuntu.
• Managed multiple environments for both production and non-production where primary objectives included automation, build-out, integration, and cost control.
• Expertise in troubleshooting problems that arise during builds, deployments, and production support.
• Good understanding of creating and managing the various development and build platforms and deployment strategies.
• Excellent knowledge of Application Lifecycle Management, Change & Release Management, and ITIL processes.
• Exposed to all aspects of the software development life cycle (SDLC) such as analysis, planning, development, testing, implementation, and post-production analysis of projects.
• Good interaction with developers, managers, and team members to coordinate job tasks, and strong commitment to work.
• Documented daily meetings, build reports, release notes, and many other day-to-day documents and status reports.
• Excellent communication, interpersonal, intuitive, analytical, and leadership skills, with the ability to work efficiently both independently and in a team.
• Enjoy working on all types of planned and unplanned issues/tasks.
• Implementing GitLab CI, GitLab, Docker, Maven, etc.
• Should have knowledge of Docker containers, which can be utilised in the deployment process.
• Good interpersonal skills and a team-working attitude; takes initiative and is very proactive in solving problems and providing the best solutions.
• Integrating various version control tools, build tools, and deployment methodologies (scripting) into Jenkins (or any other tool) to create end-to-end orchestrated build cycles.
• Troubleshoot build issues and performance, and generate metrics on the master's performance along with job usage.
• Design and develop build and packaging tools for continuous integration builds and reporting. Automate the build and release cycles.
• Coordinate all build and release activities, ensure release processes are well documented, and manage source control repositories, including branching and tagging.
• Maintain the product release process, including generating and delivering release packages, and generate various metrics for tracking issues against releases and the means of tracking compatibility among products.
• Maintained and managed cloud & test environments and automation for QA, Product Management, and Product Support.
As a DevOps Engineer, you'll be part of the team building the stage for our Software Engineers to work on, helping to enhance our product performance and reliability.
Responsibilities:
- Build & operate infrastructure to support the website, backend cluster, and ML projects in the organization.
- Helping teams become more autonomous and allowing the Operation team to focus on improving the infrastructure and optimizing processes.
- Delivering system management tooling to the engineering teams.
- Working on your own applications which will be used internally.
- Contributing to open source projects that we are using (or that we may start).
- Be an advocate for engineering best practices in and out of the company.
- Organizing tech talks and participating in meetups and representing Box8 at industry events.
- Sharing pager duty for the rare instances of something serious happening.
- Collaborate with other developers to understand & setup tooling needed for Continuous Integration/Delivery/Deployment (CI/CD) practices.
Requirements:
- 1+ years of industry experience.
- Scale existing back-end systems to handle ever-increasing amounts of traffic and new product requirements.
- Ruby On Rails or Python and Bash/Shell skills.
- Experience managing complex systems at scale.
- Experience with Docker, rkt or similar container engine.
- Experience with Kubernetes or similar clustering solutions.
- Experience with tools such as Ansible or Chef.
- Understanding of the importance of smart metrics and alerting.
- Hands-on experience with cloud infrastructure provisioning, deployment, and monitoring (we are on AWS and use ECS, ELB, EC2, ElastiCache, Elasticsearch, S3, CloudWatch).
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Knowledge of data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience in working on Linux-based servers.
- Managing large scale production grade infrastructure on AWS Cloud.
- Good knowledge of scripting languages like Ruby, Python, or Bash.
- Experience in creating a deployment pipeline from scratch.
- Expertise in any of the CI tools, preferably Jenkins.
- Good knowledge of Docker containers and their usage.
- Using infra/app monitoring tools like CloudWatch/New Relic/Sensu.
Good to have:
- Knowledge of Ruby on Rails-based applications and their deployment methodologies.
- Experience working on Container Orchestration tools like Kubernetes/ECS/Mesos.
- Extra points for experience with front-end development, New Relic, GCP, Kafka, and Elasticsearch.








