
Required Skills and Experience
- 4+ years of relevant experience with DevOps tools such as Jenkins, Ansible, and Chef
- 4+ years of experience in continuous integration/deployment and in developing software tools with Python and shell scripts
- Building and running Docker images and deployment on Amazon ECS
- Working with AWS services (EC2, S3, ELB, VPC, RDS, CloudWatch, ECS, ECR, EKS)
- Knowledge and experience working with container technologies such as Docker and Amazon ECS, EKS, Kubernetes
- Experience with source code and configuration management tools such as Git, Bitbucket, and Maven
- Ability to work with and support Linux environments (Ubuntu, Amazon Linux, CentOS)
- Knowledge and experience in cloud orchestration tools such as AWS CloudFormation and Terraform
- Experience with implementing "infrastructure as code", "pipeline as code", and "security as code" to enable continuous integration and delivery
- Understanding of IAM, RBAC, NACLs, and KMS
- Good communication skills
Good to have:
- Strong understanding of security concepts and methodologies, and the ability to apply them (e.g., SSH, public key encryption, access credentials, certificates).
- Knowledge of database administration such as MongoDB.
- Knowledge of maintaining and using tools such as Jira, Bitbucket, Confluence.
- Work with Leads and Architects in designing and implementing technical infrastructure, platforms, and tools that support modern best practices and improve the efficiency of our development teams through automation, CI/CD pipelines, ease of access, and performance.
- Establish and promote DevOps thinking, guidelines, best practices, and standards.
- Contribute to architectural discussions, Agile software development process improvement, and DevOps best practices.

Similar jobs
Job Description
Experience: 5 - 9 years
Location: Bangalore/Pune/Hyderabad
Work Mode: Hybrid (3 Days WFO)
Senior Cloud Infrastructure Engineer for Data Platform
The ideal candidate will play a critical role in designing, implementing, and maintaining cloud infrastructure and CI/CD pipelines to support scalable, secure, and efficient data and analytics solutions. This role requires a strong understanding of cloud-native technologies, DevOps best practices, and hands-on experience with Azure and Databricks.
Key Responsibilities:
Cloud Infrastructure Design & Management
Architect, deploy, and manage scalable and secure cloud infrastructure on Microsoft Azure.
Implement best practices for Azure Resource Management, including resource groups, virtual networks, and storage accounts.
Optimize cloud costs and ensure high availability and disaster recovery for critical systems.
Databricks Platform Management
Set up, configure, and maintain Databricks workspaces for data engineering, machine learning, and analytics workloads.
Automate cluster management, job scheduling, and monitoring within Databricks.
Collaborate with data teams to optimize Databricks performance and ensure seamless integration with Azure services.
CI/CD Pipeline Development
Design and implement CI/CD pipelines for deploying infrastructure, applications, and data workflows using tools like Azure DevOps, GitHub Actions, or similar.
Automate testing, deployment, and monitoring processes to ensure rapid and reliable delivery of updates.
Monitoring & Incident Management
Implement monitoring and alerting solutions using tools like Dynatrace, Azure Monitor, Log Analytics, and Databricks metrics.
Troubleshoot and resolve infrastructure and application issues, ensuring minimal downtime.
Security & Compliance
Enforce security best practices, including identity and access management (IAM), encryption, and network security.
Ensure compliance with organizational and regulatory standards for data protection and cloud operations.
Collaboration & Documentation
Work closely with cross-functional teams, including data engineers, software developers, and business stakeholders, to align infrastructure with business needs.
Maintain comprehensive documentation for infrastructure, processes, and configurations.
Required Qualifications
Education: Bachelor’s degree in Computer Science, Engineering, or a related field.
Must Have Experience:
6+ years of experience in DevOps or Cloud Engineering roles.
Proven expertise in Microsoft Azure services, including Azure Data Lake, Azure Databricks, Azure Data Factory (ADF), Azure Functions, Azure Kubernetes Service (AKS), and Azure Active Directory.
Hands-on experience with Databricks for data engineering and analytics.
Technical Skills:
Proficiency in Infrastructure as Code (IaC) tools like Terraform, ARM templates, or Bicep.
Strong scripting skills in Python or Bash.
Experience with containerization and orchestration tools like Docker and Kubernetes.
Familiarity with version control systems (e.g., Git) and CI/CD tools (e.g., Azure DevOps, GitHub Actions).
Soft Skills:
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Job Title: Senior DevOps Engineer (Full-time)
Location: Mumbai, Onsite
Experience Required: 5+ Years
Required Qualifications
● Experience:
○ 5+ years of hands-on experience as a DevOps Engineer or similar role, with proven expertise in building and customizing Helm charts from scratch (not just using pre-existing ones).
○ Demonstrated ability to design and whiteboard DevOps pipelines, including CI/CD workflows for microservices applications.
○ Experience packaging and deploying applications with stateful dependencies (e.g., databases, persistent storage) in varied environments: on-prem (air-gapped and non-air-gapped), single-tenant cloud, multi-tenant cloud, and developer trials.
○ Proficiency in managing deployments in Kubernetes clusters, including offline installations, upgrades via Helm, and adaptations for client restrictions (e.g., no additional tools or VMs).
○ Track record of handling client interactions, such as asking probing questions about infrastructure (e.g., OS versions, storage solutions, network restrictions) and explaining technical concepts clearly.
● Technical Skills:
○ Strong knowledge of Helm syntax and functionalities (e.g., Go templating, hooks, subcharts, dependency management).
○ Expertise in containerization with Docker, including image management (save/load, registries like Harbor or ECR).
○ Familiarity with CI/CD tools such as Jenkins, ArgoCD, GitHub Actions, and GitOps for automated and manual deployments.
○ Understanding of storage solutions for on-prem and cloud, including object/file storage (e.g., MinIO, Ceph, NFS, cloud-native like S3/EBS).
○ In-depth knowledge of Kubernetes concepts: StatefulSets, PersistentVolumes, namespaces, HPA, liveness/readiness probes, network policies, and RBAC.
○ Solid grasp of cloud networking: VPCs (definition, boundaries, virtualization via SDN, differences from private clouds), bare metal vs. virtual machines (advantages like resource efficiency, flexibility, and scalability).
○ Ability to work in air-gapped environments, preparing offline artifacts and ensuring self-contained deployment.
Note: This is a contractual role for a period of 3-6 months.
Responsibilities:
● Set up and maintain CI/CD pipelines across services and environments
● Monitor system health and set up alerts/logs for performance & errors
● Work closely with backend/frontend teams to improve deployment velocity
● Manage cloud environments (staging, production) with cost and reliability in mind
● Ensure secure access, role policies, and audit logging
● Contribute to internal tooling, CLI automation, and dev workflow improvements
Must-Haves:
● 2–3 years of hands-on experience in DevOps, SRE, or Platform Engineering
● Experience with Docker, CI/CD (especially GitHub Actions), and cloud providers (AWS/GCP)
● Proficiency in writing scripts (Bash, Python) for automation
● Good understanding of system monitoring, logs, and alerting
● Strong debugging skills, ownership mindset, and clear documentation habits
● Experience with infrastructure monitoring tools such as Grafana dashboards
Senior Software Engineer I - DevOps Engineer
Exceptional software engineering is challenging. Amplifying it to ensure that multiple teams can concurrently create and manage a vast, intricate product escalates the complexity. As a Senior Software Engineer within the Release Engineering team at Sumo Logic, your task will be to develop and sustain automated tooling for the release processes of all our services. You will contribute significantly to establishing automated delivery pipelines, empowering autonomous teams to create independently deployable services. Your role is integral to our overarching strategy of enhancing software delivery and progressing Sumo Logic’s internal Platform-as-a-Service.
What you will do:
• Own the Delivery pipeline and release automation framework for all Sumo services
• Educate and collaborate with teams during both design and development phases to ensure best practices.
• Mentor a team of Engineers (Junior to Senior) and improve software development processes.
• Evaluate, test, and provide technology and design recommendations to executives.
• Write detailed design documents and documentation on system design and implementation.
• Ensure the engineering teams are set up to deliver quality software quickly and reliably.
• Enhance and maintain infrastructure and tooling for development, testing, and debugging.
What you already have
• B.S. or M.S. in Computer Science or a related discipline
• Ability to influence: Understand people’s values and motivations and influence them towards making good architectural choices.
• Collaborative working style: You can work with other engineers to come up with good decisions.
• Bias towards action: You need to make things happen. It is essential you don’t become an inhibitor of progress, but an enabler.
• Flexibility: You are willing to learn and change. Admit past approaches might not be the right ones now.
Technical skills:
- 4+ years of experience in the design, development, and use of release automation tooling, DevOps, CI/CD, etc.
- 2+ years of experience in software development in Java/Scala/Golang or similar
- 3+ years of experience with software delivery technologies like Jenkins, including writing and developing CI/CD pipelines, and knowledge of build tools like make, Gradle, and npm
- Experience with cloud technologies, such as AWS/Azure/GCP
- Experience with Infrastructure-as-Code and tools such as Terraform
- Experience with scripting languages such as Groovy, Python, Bash etc.
- Knowledge of monitoring tools such as Prometheus/Grafana or similar tools
- Understanding of GitOps and ArgoCD concepts/workflows
- Understanding of security and compliance aspects of DevSecOps
About Us
Sumo Logic, Inc. empowers the people who power modern, digital business. Sumo Logic enables customers to deliver reliable and secure cloud-native applications through its Sumo Logic SaaS Analytics Log Platform, which helps practitioners and developers ensure application reliability, secure and protect against modern security threats, and gain insights into their cloud infrastructures. Customers worldwide rely on Sumo Logic to get powerful real-time analytics and insights across observability and security solutions for their cloud-native applications. For more information, visit www.sumologic.com.
Sumo Logic Privacy Policy. Employees will be responsible for complying with applicable federal privacy laws and regulations, as well as organizational policies related to data protection.
● Auditing, monitoring, and improving existing infrastructure components of a highly available, scaled product in the cloud on Ubuntu servers
● Running daily maintenance tasks and improving them through automation where possible
● Deploying new components, servers, and other infrastructure when needed
● Coming up with innovative ways to automate tasks
● Working with telecom carriers to obtain rates and destinations and updating them regularly in the system
● Working with Docker containers, Tinc, iptables, HAProxy, etcd, MySQL, MongoDB, CouchDB, and Ansible
You would bring the following skills to our team:
● Expertise with Docker containers and their networking, Tinc, iptables, HAProxy, etcd, and Ansible
● Extensive experience with setup, maintenance, monitoring, backup, and replication of MySQL
● Expertise with Ubuntu servers, including OS-level and server-level networking
● Good experience working with MongoDB and CouchDB
● Good with networking tools
● Experience with open-source server monitoring solutions like Nagios and Zabbix
● Experience working on highly scaled, distributed applications running on data center Ubuntu VPS instances
● Innovative, out-of-the-box thinker with multitasking skills who works efficiently in a small team
● Working knowledge of a scripting language such as Bash, Node, or Python
● Experience with calling platforms like FreeSWITCH, OpenSIPS, or Kamailio and basic knowledge of the SIP protocol would be an advantage
Electrum is looking for an experienced and proficient DevOps Engineer. This role will provide you with an opportunity to explore what’s possible in a collaborative and innovative work environment. If your goal is to work with a team of talented professionals that is keenly focused on solving complex business problems and supporting product innovation with technology, you might be our new DevOps Engineer. In this position, you will be involved in building out systems for our rapidly expanding team, enabling the whole engineering group to operate more effectively and iterate at top speed in an open, collaborative environment. The ideal candidate will have a solid background in software engineering and extensive experience in deploying product updates, identifying production issues, and implementing integrations. The ideal candidate has proven capabilities and experience in risk-taking, is willing to take up challenges, and is a strong believer in efficiency and innovation with exceptional communication and documentation skills.
YOU WILL:
- Plan for future infrastructure as well as maintain & optimize the existing infrastructure.
- Conceptualize, architect, and build:
- 1. Automated deployment pipelines in a CI/CD environment like Jenkins;
- 2. Infrastructure using Docker, Kubernetes, and other serverless platforms;
- 3. Secured network utilizing VPCs with inputs from the security team.
- Work with developers & QA team to institute a policy of Continuous Integration with Automated testing.
- Architect, build and manage dashboards to provide visibility into delivery, production application functional, and performance status.
- Work with developers to institute systems, policies, and workflows which allow for a rollback of deployments.
- Triage releases of applications/hotfixes to the production environment on a daily basis.
- Interface with developers and triage SQL queries that need to be executed in production environments.
- Maintain 24/7 on-call rotation to respond and support troubleshooting of issues in production.
- Assist developers and other teams' on-call engineers with postmortems, follow-ups, and reviews of issues affecting production availability.
- Scale the Electrum platform to handle millions of requests concurrently.
- Reduce Mean Time To Recovery (MTTR) and enable High Availability and Disaster Recovery.
PREREQUISITES:
- Bachelor’s degree in engineering, computer science, or related field, or equivalent work experience.
- Minimum of six years of hands-on experience in software development and DevOps, specifically managing AWS infrastructure such as EC2, RDS, ElastiCache, S3, IAM, CloudTrail, and other services provided by AWS.
- At least 2 years of experience in building and owning serverless infrastructure.
- At least 2 years of scripting experience in Python (preferred) and Shell, and experience with web application deployment systems and continuous integration tools (Ansible).
- Experience building a multi-region highly available auto-scaling infrastructure that optimizes performance and cost.
- Experience in automating the provisioning of AWS infrastructure as well as automation of routine maintenance tasks.
- Must have prior experience automating deployments to production and lower environments.
- Worked on providing solutions for major automation with scripts or infrastructure.
- Experience with APM tools such as DataDog and log management tools.
- Experience in designing and implementing system architecture processes, and in establishing and enforcing network security policy (AWS VPC, Security Groups) and ACLs.
- Experience establishing and enforcing:
- 1. System monitoring tools and standards
- 2. Risk Assessment policies and standards
- 3. Escalation policies and standards
- Excellent DevOps engineering, team management, and collaboration skills.
- Advanced knowledge of programming languages such as Python and writing code and scripts.
- Experience with or knowledge of Application Performance Monitoring (APM) and prior experience as an open-source contributor are preferred.
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with at least 7 years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies (AWS cloud provider preferred)
*Bread puns are encouraged but not required
- Recommend a migration and consolidation strategy for DevOps tools
- Design and implement an Agile work management approach
- Make a quality strategy
- Design a secure development process
- Create a tool integration strategy
- Working on the scalability, maintainability, and reliability of the company's products.
- Working with clients to solve their day-to-day challenges, moving manual processes to automation.
- Keeping systems reliable and gauging the effort it takes to get there.
- Juxtaposing tools and technologies to understand when to choose X over Y.
- Understanding Infrastructure as Code and applying software design principles to it.
- Automating tedious work using your favourite scripting languages.
- Taking code from the local system to production by implementing Continuous Integration and Delivery principles.
What you need to have:
- Experience with at least one programming language such as Go, Python, Java, or Ruby.
- Work experience with public cloud providers like AWS, GCP or Azure.
- Understanding of Linux systems and Containers
- Meticulous in creating and following runbooks and checklists
- Microservices experience and use of orchestration tools like Kubernetes/Nomad.
- Understanding of Computer Networking fundamentals like TCP, UDP.
- Strong bash scripting skills.
As part of the engineering team, you would be expected to have deep technology expertise with a passion for building highly scalable products. This is a unique opportunity where you can impact the lives of people across 150+ countries!
Responsibilities
• Collaborate in large-scale systems design discussions.
• Deploy and maintain in-house/customer systems, ensuring high availability, performance, and optimal cost.
• Automate build pipelines and ensure the right architecture for CI/CD.
• Work with engineering leaders to ensure cloud security.
• Develop standard operating procedures for various facets of infrastructure services (CI/CD, Git branching, SAST, quality gates, auto scaling).
• Perform and automate regular backups of servers and databases. Ensure rollback and restore capabilities are real-time and zero-downtime.
• Lead the entire DevOps charter for ONE Championship. Mentor other DevOps engineers. Ensure industry standards are followed.
Requirements
• Overall 5+ years of experience as a DevOps Engineer/Site Reliability Engineer
• B.E/B.Tech in CS or an equivalent stream from a reputed institute
• Experience in Azure is a must. AWS experience is a plus
• Experience in Kubernetes, Docker, and containers
• Proficiency in developing and deploying fully automated environments using Puppet/Ansible and Terraform
• Experience with monitoring tools like Nagios/Icinga, Prometheus, Alertmanager, and New Relic
• Good knowledge of source code control (Git)
• Expertise in Continuous Integration and Continuous Deployment setup using Azure Pipelines or Jenkins
• Strong experience in programming languages. Python is preferred
• Experience in scripting and unit testing
• Basic knowledge of SQL & NoSQL databases
• Strong Linux fundamentals
• Experience with SonarQube, Locust, and BrowserStack is a plus










