Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
- Improvement of monitoring systems
- Creation and support of development automation processes (CI/CD)
- Participation in infrastructure development
- Detection of architectural problems and proposing solutions to them
- Creation of tasks to improve system scalability, performance and monitoring
- Analysis of product requirements from a DevOps perspective
- Managing a team of DevOps engineers and overseeing task delivery
- Incident analysis and fixing
Technology stack
Linux, Bash, Salt/Ansible, LXC, libvirt, IPsec, VXLAN, Open vSwitch, OpenVPN, OSPF, BIRD, Cisco NX-OS, Multicast, PIM, LVM, software RAID, LUKS, PostgreSQL, nginx, haproxy, Prometheus, Grafana, Zabbix, GitLab, Capistrano
Skills and Experience
- Understanding of distributed systems principles
- Understanding of the principles for building a resilient network infrastructure
- Experience with Ubuntu Linux administration (other Debian-like distributions are a plus)
- Strong knowledge of Bash
- Experience working with LXC containers
- Understanding of and experience with the infrastructure-as-code approach
- Experience developing idempotent Ansible roles
- Experience with relational databases (PostgreSQL), ability to write simple SQL queries
- Experience with Git
- Experience with monitoring and metrics collection systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
Preferred experience
- Experience working with high-load, zero-downtime environments
- Experience coding in Python
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge of and experience working with Cisco network equipment
- Experience of working with Cisco NX-OS
- Knowledge of the principles of multicast protocols (IGMP, PIM)
- Experience configuring multicast on Cisco equipment
- Experience of working with Solarflare Onload
- Experience administering Atlassian products

Similar jobs
Work mode: WFO, 5 days
Location: Hyderabad (Onsite)
Experience: 7+ years
- Hands-on Kubernetes (K8s) experience
- Linux troubleshooting skills
- Experience with on-prem servers and their management
- Helm
- Docker
- Ingress and Ingress Controllers
- Networking Basics
- Proficient communication skills
Must-Have Skills:
- Hands-on experience with air-gapped Kubernetes clusters, ideally in regulated industries (finance, healthcare, etc.).
- Strong expertise in CI/CD pipelines, programmable infrastructure, and automation.
- Proficiency in Linux troubleshooting, observability (Prometheus, Grafana, ELK), and multi-region disaster recovery.
- Security & compliance knowledge for regulated industries.
- Preferred: Experience with GKE, RKE, Rook-Ceph, and certifications like CKA, CKAD.
Who You Are
- A Kubernetes expert who thrives on scalability, automation, and security.
- Passionate about optimizing infrastructure, CI/CD, and high-availability systems.
- Comfortable troubleshooting Linux, improving observability, and ensuring disaster recovery readiness.
- A problem solver who simplifies complexity and drives cloud-native adoption.
What You’ll Do
- Architect & automate Kubernetes solutions for air-gapped and multi-region clusters.
- Optimize CI/CD pipelines & cloud-native deployments.
- Work with open-source projects, selecting the right tools for the job.
- Educate & guide teams on modern cloud-native infrastructure best practices.
- Solve real-world scaling, security, and infrastructure automation challenges.
Why Join Us?
- Work on high-impact Kubernetes projects in regulated industries.
- Solve real-world automation & infrastructure challenges with cutting-edge tools.
- Grow in a team that values learning, open-source contributions, and innovation.
Job Title: Senior DevOps Engineer
Experience: 8+ Years
Joining: Immediate Joiner
Location: Bangalore (Onsite/Hybrid – as applicable)
Job Description:
We are looking for a highly experienced Senior DevOps Engineer with 8+ years of hands-on experience to join our team immediately. The ideal candidate will be responsible for designing, implementing, and managing scalable, secure, and highly available infrastructure.
Key Responsibilities:
- Design, build, and maintain CI/CD pipelines for application deployment
- Manage cloud infrastructure (AWS/Azure/GCP) and optimize cost and performance
- Automate infrastructure using Infrastructure as Code (Terraform/CloudFormation)
- Manage containerized applications using Docker and Kubernetes
- Monitor system performance, availability, and security
- Collaborate closely with development, QA, and security teams
- Troubleshoot production issues and perform root cause analysis
- Ensure high availability, disaster recovery, and backup strategies
Required Skills:
- 8+ years of experience in DevOps / Site Reliability Engineering
- Strong expertise in Linux/Unix administration
- Hands-on experience with AWS / Azure / GCP
- CI/CD tools: Jenkins, GitLab CI, GitHub Actions, Azure DevOps
- Containers & orchestration: Docker, Kubernetes
- Infrastructure as Code: Terraform, CloudFormation, Ansible
- Monitoring tools: Prometheus, Grafana, ELK, CloudWatch
- Strong scripting skills (Bash, Python)
- Experience with security best practices and compliance
Good to Have:
- Experience with microservices architecture
- Knowledge of DevSecOps practices
- Cloud certifications (AWS/Azure/GCP)
- Experience in high-traffic production environments
Why Join Us:
- Opportunity to work on scalable, enterprise-grade systems
- Collaborative and growth-oriented work environment
- Competitive compensation and benefits
- Immediate joiners preferred.
- Minimum 3 years of experience in DevOps on the AWS platform
- Strong AWS knowledge and experience
- Experience in using CI/CD automation tools (Git, Jenkins) and configuration deployment tools (Puppet/Chef/Ansible)
- Experience with IaC tools such as Terraform
- Excellent experience in operating a container orchestration cluster (Kubernetes, Docker)
- Significant experience with Linux operating system environments
- Experience with infrastructure scripting solutions such as Python/Shell scripting
- Must have experience in designing an infrastructure automation framework
- Good experience in setting up monitoring tools and dashboards (Grafana/Kafka)
- Excellent problem-solving, log analysis and troubleshooting skills
- Experience in setting up centralized logging for systems (EKS, EC2) and applications
- Process-oriented with great documentation skills
- Ability to work effectively within a team and with minimal supervision
Senior Software Engineer I - DevOps Engineer
Exceptional software engineering is challenging. Amplifying it to ensure that multiple teams can concurrently create and manage a vast, intricate product escalates the complexity. As a Senior Software Engineer within the Release Engineering team at Sumo Logic, your task will be to develop and sustain automated tooling for the release processes of all our services. You will contribute significantly to establishing automated delivery pipelines, empowering autonomous teams to create independently deployable services. Your role is integral to our overarching strategy of enhancing software delivery and progressing Sumo Logic’s internal Platform-as-a-Service.
What you will do:
• Own the Delivery pipeline and release automation framework for all Sumo services
• Educate and collaborate with teams during both design and development phases to ensure best practices.
• Mentor a team of Engineers (Junior to Senior) and improve software development processes.
• Evaluate, test, and provide technology and design recommendations to executives.
• Write detailed design documents and documentation on system design and implementation.
• Ensure the engineering teams are set up to deliver quality software quickly and reliably.
• Enhance and maintain infrastructure and tooling for development, testing and debugging
What you already have
• B.S. or M.S. in Computer Science or a related discipline
• Ability to influence: Understand people’s values and motivations and influence them towards making good architectural choices.
• Collaborative working style: You can work with other engineers to come up with good decisions.
• Bias towards action: You need to make things happen. It is essential you don’t become an inhibitor of progress, but an enabler.
• Flexibility: You are willing to learn and change. Admit past approaches might not be the right ones now.
Technical skills:
- 4+ years of experience in the design, development, and use of release automation tooling, DevOps, CI/CD, etc.
- 2+ years of experience in software development in Java/Scala/Golang or similar
- 3+ years of experience with software delivery technologies like Jenkins, including experience writing and developing CI/CD pipelines and knowledge of build tools like make/gradle/npm, etc.
- Experience with cloud technologies, such as AWS/Azure/GCP
- Experience with Infrastructure-as-Code and tools such as Terraform
- Experience with scripting languages such as Groovy, Python, Bash etc.
- Knowledge of monitoring tools such as Prometheus/Grafana or similar tools
- Understanding of GitOps and ArgoCD concepts/workflows
- Understanding of security and compliance aspects of DevSecOps
About Us
Sumo Logic, Inc. empowers the people who power modern, digital business. Sumo Logic enables customers to deliver reliable and secure cloud-native applications through its Sumo Logic SaaS Analytics Log Platform, which helps practitioners and developers ensure application reliability, secure and protect against modern security threats, and gain insights into their cloud infrastructures. Customers worldwide rely on Sumo Logic to get powerful real-time analytics and insights across observability and security solutions for their cloud-native applications. For more information, visit www.sumologic.com.
Sumo Logic Privacy Policy. Employees will be responsible for complying with applicable federal privacy laws and regulations, as well as organizational policies related to data protection.
Senior DevOps Engineer
Experience: Minimum 5 years of relevant experience
Key Responsibilities:
• Hands-on experience with AWS tools, CI/CD pipelines, and Red Hat Linux
• Strong expertise in DevOps practices and principles
• Experience with infrastructure automation and configuration management
• Excellent problem-solving skills and attention to detail
Nice to Have:
• Red Hat certification
The role requires you to design development pipelines from the ground up, create Dockerfiles, and design and operate highly available systems in AWS Cloud environments. It also involves configuration management, web services architectures, DevOps implementation, database management, backups, and monitoring.
Key responsibility area
- Ensure reliable operation of CI/CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creation of Bash/Python scripts for automation
- Performing root cause analysis for production errors.
Requirements
- 2 years of experience as a team lead.
- Good command of Kubernetes.
- Proficient in the Linux command line and troubleshooting.
- Proficient in AWS services: deploying, monitoring and troubleshooting applications in AWS.
- Hands-on experience with CI tooling, preferably Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as Code) such as Terraform, AWS CloudFormation, etc.
- Proficient in deployment of applications behind load balancers and proxy servers such as nginx and Apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with logging, monitoring, and alerting tools like ELK (Elasticsearch, Logstash, Kibana) and Nagios. Graylog, Splunk, Prometheus, and Grafana are a plus.
Must-Have:
Linux, CI/CD (Jenkins), AWS, scripting (Bash, shell, Python, Go), Nginx, Docker.
Good to have
Configuration management (Ansible or a similar tool), logging tool (ELK or similar), monitoring tool (Nagios or similar), IaC (Terraform, CloudFormation).
● Building and managing multiple application environments on AWS using automation tools like Terraform or CloudFormation, etc.
● Deploy applications with zero downtime via automation with configuration management tools such as Ansible.
● Setting up Infrastructure monitoring tools such as Prometheus, Grafana
● Setting up centralised logging using tools such as ELK.
● Containerisation of applications/microservices.
● Ensure 99.9% application availability with highly available infrastructure.
● Monitoring performance of applications and databases.
● Ensuring that systems are safe and secure against cyber security threats.
● Working with software developers to ensure that release cycle and deployment processes are followed.
● Evaluating existing applications and platforms, giving recommendations for enhancing performance via gap analysis, identifying the most practical alternative solutions and assisting with modifications.
Skills -
● Strong knowledge of AWS Managed Services such as EC2, RDS, ECS, ECR, S3, CloudFront, SES, Redshift, ElastiCache, AMQP etc.
● Experience in handling production workloads.
● Experience with Nginx web server.
● Experience with NoSQL and SQL databases such as MongoDB, PostgreSQL etc.
● Experience with containerisation of applications/microservices using Docker.
● Understanding of system administration in Linux environments.
● Strong knowledge of Infrastructure as Code tools such as Terraform, CloudFormation etc.
● Strong knowledge of configuration management tools such as Ansible, Chef etc.
● Familiarity with tools such as GitLab, Jenkins, Vercel, JIRA etc.
● Proficiency in scripting languages including Bash, Python etc.
● Full understanding of software development lifecycle best practices and agile methodology
● Strong communication and documentation skills.
● An ability to drive to goals and milestones while valuing and maintaining a strong attention to detail
● Excellent judgment, analytical thinking, and problem-solving skills
● Self-motivated individual who possesses excellent time management and organizational skills
Engineering group to plan ongoing feature development, product maintenance.
• Familiar with Virtualization, Containers (Kubernetes), Core Networking, Cloud Native Development, Platform as a Service (Cloud Foundry), Infrastructure as a Service, Distributed Systems etc.
• Implementing tools and processes for deployment, monitoring, alerting, automation, scalability, and ensuring maximum availability of server infrastructure
• Should be able to manage distributed big data systems such as Hadoop, Storm, MongoDB, Elasticsearch and Cassandra etc.
• Troubleshooting multiple deployment servers, software installation, managing licensing etc.
• Plan, coordinate, and implement network security measures in order to protect data, software, and hardware.
• Monitor the performance of computer systems and networks, and coordinate computer network access and use.
• Design, configure and test computer hardware, networking software, and operating system software.
• Recommend changes to improve systems and network configurations, and determine hardware or software requirements related to such changes.











