
Role & Responsibilities
- You take end-to-end ownership of infrastructure: you design, scale, and operate it. This goes beyond execution. Here's what that looks like day to day:
- Own the design, architecture, and reliability of Locus's cloud infrastructure across AWS, Azure, GCP, and Aliyun, supporting multi-region, global deployments.
- Lead the evolution of our CI/CD ecosystem: optimize and refactor our Jenkins-as-Code setup for scalability, performance, and developer efficiency.
- Drive the Infrastructure as Code (IaC) journey end-to-end: migrate existing cloud resources, alarms, and configurations fully into code with strong versioning, review, and rollback practices.
- Partner with engineering teams to identify and resolve performance, scalability, and reliability bottlenecks, including deep dives into memory, CPU, networking, and storage constraints.
- Define and implement monitoring, alerting, and incident response best practices to improve MTTR, system observability, and operational readiness.
- Lead initiatives around cost optimization, security hardening, and capacity planning to keep infrastructure efficient and compliant as the platform scales.
- Act as a technical mentor for junior DevOps engineers and raise the overall DevOps maturity across teams.
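As a rough illustration of the MTTR metric mentioned above, the sketch below averages recovery time over a list of incidents. The function name and the incident timestamps are hypothetical, and it assumes each incident is recorded as a (detected, resolved) pair; real incident tooling will differ.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (resolved - detected) over all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident log: (detected, resolved) pairs.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 45)),  # 45 minutes
    (datetime(2024, 1, 9, 2, 30), datetime(2024, 1, 9, 3, 45)),   # 75 minutes
]

print(mean_time_to_recovery(incidents))  # 1:00:00
```

Tracking this figure per quarter is one simple way to check whether alerting and runbook improvements are actually shortening recovery.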
Ideal Candidate
- Strong Senior DevOps Engineer Profile
- Mandatory (Experience 1): Must have 5+ years in DevOps / SRE / Infrastructure roles with hands-on experience (clear scale signals like traffic, uptime, latency, infra size should be mentioned)
- Mandatory (Experience 2): Must have B2B SaaS company experience with multi-tenant architecture OR multiple production stacks (multi-env / multi-client systems)
- Mandatory (Tech Skills 1 - Cloud & Infra): AWS (VPC, EKS, EC2, RDS, networking), Kubernetes (EKS) at scale, designing high-availability, multi-region systems
- Mandatory (Tech Skills 2 - Automation & IaC): Terraform (must-have), Helm / GitOps, Strong scripting (Python / Go / Bash)
- Mandatory (Tech Skills 3 - CI/CD & Release): Scalable CI/CD pipelines (GitHub Actions / Jenkins), Zero/low downtime deployments
- Mandatory (Tech Skills 4 - Reliability & Observability): SRE principles (SLOs, SLIs, error budgets), Monitoring tools (Prometheus, Grafana, Datadog), Alerting, on-call, incident management
- Mandatory (Education): BTech in Computer Science or related fields
- Mandatory (Company): Strong B2B SaaS product companies only (well scaled)
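For readers less familiar with the SLO and error budget terms in the list above, here is a minimal sketch of the arithmetic. It assumes an availability SLO measured over a rolling 30-day window; the function names and the sample figures are illustrative only.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Allowed downtime in minutes for an availability SLO over the window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1 - slo_target) * window_minutes


def budget_remaining(slo_target, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


# A 99.9% SLO over 30 days allows about 43.2 minutes of downtime.
print(round(error_budget_minutes(0.999), 1))    # 43.2
# After 10.8 minutes of downtime, 75% of the budget is left.
print(round(budget_remaining(0.999, 10.8), 2))  # 0.75
```

Teams commonly use the remaining fraction to decide whether to keep shipping features or pause for reliability work.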

Key responsibilities
• Design, build, and maintain robust CI/CD pipelines using Azure DevOps Services (Azure Pipelines) and Git-based workflows.
• Implement and manage infrastructure as code (IaC) using ARM templates, Bicep, and/or Terraform for repeatable environment provisioning.
• Containerize applications (Docker) and manage container orchestration platforms such as AKS (Azure Kubernetes Service).
• Automate build, test, release, and rollback processes; integrate automated testing and quality gates into pipelines.
• Monitor and improve platform reliability and observability using logging and monitoring tools (e.g., Azure Monitor, Application Insights, Prometheus, Grafana).
• Drive platform security and compliance through pipeline controls, secrets management (Key Vault / Vault), and secure configuration practices.
• Implement cost-optimization and governance for Azure resources (tags, policies, budgets).
• Troubleshoot build/release failures, production incidents, and performance bottlenecks; perform root-cause analysis and implement permanent fixes.
• Mentor developers in Git workflows, pipeline authoring, best practices for IaC, and cloud-native design.
• Maintain clear documentation: runbooks, deployment playbooks, architecture diagrams, and pipeline templates.
Required skills & experience
• 5+ years hands-on experience working with Azure and cloud-native application delivery.
• Deep experience with Azure DevOps (Repos, Pipelines, Artifacts, Boards).
• Strong IaC skills with Terraform, ARM templates, or Bicep.
• Solid experience with CI/CD design and YAML pipeline authoring.
• Practical knowledge of containerization (Docker) and Kubernetes — preferably AKS.
• Scripting skills: PowerShell, Bash, and/or Python for automation.
• Experience with Git workflows (branching strategies, PRs, code reviews).
• Familiarity with configuration management and secrets management (Azure Key Vault, HashiCorp Vault).
• Understanding of networking, identity (Azure AD), and security fundamentals in Azure.
• Strong troubleshooting, debugging, and incident response skills.
• Good collaboration and communication skills; ability to work across teams.
Certification
AZ-400 (Microsoft Certified: DevOps Engineer Expert), AZ-104, AZ-305, or Terraform Associate.
- Development or technical support experience, preferably in DevOps.
- Looking for an engineer to be part of GitHub Actions support. Experience with CI/CD tools like Bamboo, Harness, Ansible, and Salt scripting.
- Hands-on expertise with GitHub Actions and CI/CD tools like Bamboo and Harness, CI/CD pipeline stages, build tools, SonarQube, Artifactory, NuGet, ProGet, Veracode, LaunchDarkly, GitHub/Bitbucket repos, and monitoring tools.
- Handling xMatters, Techlines, and incidents.
- Strong scripting skills (PowerShell, Python, Bash/shell scripting) for implementing automation scripts and tools to streamline administrative tasks and improve efficiency.
- An Atlassian Tools Administrator is responsible for managing and maintaining Atlassian products such as Jira, Confluence, Bitbucket, and Bamboo.
- Expertise in Bitbucket and GitHub for version control and collaboration at a global level.
- Good experience with Linux/Windows system activities and databases.
- Aware of SLA and error concepts and their implementation; provide support and participate in incident management and Jira stories. Continuously monitor system performance and availability, and respond to incidents promptly to minimize downtime.
- Well-versed in observability tools such as Splunk for monitoring, alerting, and logging, to identify and address potential issues, especially in infrastructure.
- Expert at troubleshooting production issues and bugs, and at identifying and resolving issues in production environments.
- Experience in providing 24x5 support.
- GitHub Actions
- Atlassian Tools (Bamboo, Bitbucket, Jira, Confluence)
- Build Tools (Maven, Gradle, MSBuild, Node.js)
- SonarQube, Veracode
- Nexus, JFrog, NuGet, ProGet
- Harness
- Salt Services, Ansible
- PowerShell, Shell scripting
- Splunk
- Linux, Windows
- 4+ years of experience in IT and infrastructure
- 2+ years of experience in Azure DevOps
- Experience with Azure DevOps as both a CI/CD tool and an Agile framework
- Practical experience building and maintaining automated operational infrastructure
- Experience in building React or Angular applications; .NET is a must.
- Practical experience using version control systems with Azure Repos
- Experience developing and maintaining scripts using PowerShell and ARM templates/Terraform for Infrastructure as Code.
- Experience in Linux shell scripting (Ubuntu) is a must
- Hands-on experience with release automation, configuration, and debugging.
- Should have good knowledge of branching and merging
- Integration of static code analysis tools such as SonarQube and Snyk is a must.
As an MLOps Engineer at QuantumBlack, you will:
Develop and deploy technology that enables data scientists and data engineers to build, productionize, and deploy machine learning models following best practices. Work to set the standards for SWE and DevOps practices within multi-disciplinary delivery teams.
Choose and use the right cloud services, DevOps tooling, and ML tooling for the team to be able to produce high-quality code that allows your team to release to production.
Build modern, scalable, and secure CI/CD pipelines to automate development and deployment workflows used by data scientists (ML pipelines) and data engineers (data pipelines).
Shape and support next-generation technology that enables scaling ML products and platforms. Bring expertise in cloud to enable ML use case development, including MLOps.
Our Tech Stack
We leverage AWS, Google Cloud, Azure, Databricks, Docker, Kubernetes, Argo, Airflow, Kedro, Python, Terraform, GitHub Actions, MLflow, Node.js, React, and TypeScript, amongst others, in our projects.
Key Skills:
• Excellent hands-on expert knowledge of cloud platform infrastructure and administration (Azure/AWS/GCP) with strong knowledge of cloud services integration and cloud security
• Expertise setting up CI/CD processes, building and maintaining secure DevOps pipelines with at least 2 major DevOps stacks (e.g., Azure DevOps, GitLab, Argo)
• Experience with modern development methods and tooling: containers (e.g., Docker) and container orchestration (K8s), CI/CD tools (e.g., CircleCI, Jenkins, GitHub Actions, Azure DevOps), version control (Git, GitHub, GitLab), orchestration/DAG tools (e.g., Argo, Airflow, Kubeflow)
• Hands-on coding skills in Python 3 (e.g., APIs), including automated testing frameworks and libraries (e.g., pytest), Infrastructure as Code (e.g., Terraform), and Kubernetes artifacts (e.g., deployments, operators, Helm charts)
• Experience setting up at least one contemporary MLOps tool (e.g., experiment tracking, model governance, packaging, deployment, feature store)
• Practical knowledge delivering and maintaining production software such as APIs and cloud infrastructure
• Knowledge of SQL (intermediate level or more preferred) and familiarity working with at least one common RDBMS (MySQL, Postgres, SQL Server, Oracle)
• At least 4 years of hands-on experience with cloud infrastructure on GCP
• Hands-on experience with Kubernetes is mandatory
• Exposure to configuration management and orchestration tools at scale (e.g. Terraform, Ansible, Packer)
• Knowledge of and hands-on experience with DevOps tools (e.g. Jenkins, Groovy, and Gradle)
• Knowledge of and hands-on experience with various platforms (e.g. GitLab, CircleCI, and Spinnaker)
• Familiarity with monitoring and alerting tools (e.g. CloudWatch, ELK stack, Prometheus)
• Proven ability to work independently or as an integral member of a team
Preferable Skills:
• Familiarity with standard IT security practices such as encryption, credentials, and key management.
• Proven experience with various coding languages (Java, Python) to support DevOps operations and cloud transformation
• Familiarity and knowledge of the web standards (e.g. REST APIs, web security mechanisms)
• Hands-on experience with GCP
• Experience in performance tuning, services outage management and troubleshooting.
Attributes:
• Good verbal and written communication skills
• Exceptional leadership, time management, and organizational skills
• Ability to operate independently and make decisions with little direct supervision
Requirements:
- Bachelor’s or Master’s degree in Computer Science, Engineering, Software Engineering, or a relevant field.
- Strong experience with Windows/Linux-based infrastructures and Linux/Unix administration.
- Knowledge of Jira, Bitbucket, Jenkins, Xray, Ansible, Windows, and .NET as core skills.
- Strong experience with databases such as SQL, MS SQL, MySQL, and NoSQL.
- Knowledge of scripting languages such as shell scripting (Bash), Python, PHP, and Groovy.
- Experience with Agile project management and workflow tools such as Jira and Workfront.
- Experience with open-source technologies and cloud services.
- Experience in working with Puppet or Chef for automation and configuration.
- Strong communication skills and ability to explain protocols and processes to the team and management.
- Experience in a DevOps Engineer role (or similar role).
- Experience in software development and infrastructure development is a plus.
Job Specifications:
- Building and maintaining tools, solutions and micro services associated with deployment and our operations platform, ensuring that all meet our customer service standards and reduce errors.
- Actively troubleshoot any issues that arise during testing and production, catching and solving issues before launch.
- Test our system integrity, implemented designs, application developments, and other processes related to infrastructure, making improvements as needed.
- Deploy product updates as required while implementing integrations when they arise.
- Automate our operational processes as needed, with accuracy and in compliance with our security requirements.
- Specify, document, and develop new product features, and write automation scripts. Manage code deployments, fixes, updates, and related processes.
- Work with open-source technologies as needed.
- Work with CI and CD tools, and source control such as Git and SVN.
- Lead the team through development and operations.
- Offer technical support where needed, developing software for our back-end systems.

Must Have Skills
- AWS Solutions Architect and/or DevOps certification, Professional preferred
- BS level technical degree or equivalent experience; Computer Science or Engineering background preferred
- Hands-on technical expertise on Amazon Web Services (AWS) to include but not limited to EC2, VPC, IAM, Security groups, ELB/NLB/ALB, Internet gateway, S3, EBS and EFS.
- Experience in migration-deployment of applications to Cloud, re-engineering of application for cloud, setting up OS and app environments in virtualized cloud
- DevOps automation, CI/CD, infrastructure/services provisioning, application deployment and configuration
- DevOps toolsets to include Ansible, Jenkins, XL Deploy, and XL Release
- Deployment and configuration of Java/WildFly, Spring Boot, JavaScript/Node.js, and Ruby applications and middleware
- Scripting in shell (Bash) and extensive Python
- Agile software development
- Excellent written and verbal communication, presentation, and collaboration skills
- Team leadership skills
- Linux (RHEL) administration/engineering
Experience:
- Deploying, configuring, and supporting large scale monolithic and microservices based SaaS applications
- Working as both an infrastructure and application migration specialist
- Identifying and documenting application requirements for network, F5, IAM, and security groups
- Implementing DevOps practices such as infrastructure as code, continuous integration, and automated deployment
- Work with Technology leadership to understand business goals and requirements
- Experience with continuous integration tools
- Experience with configuration management platforms
- Writing and diagnosing issues with complex shell scripts
- Strong practical application development experience on Linux and Windows-based systems

2. Has done infrastructure coding and configuration using CloudFormation/Terraform and understands it very clearly
3. Deep understanding of microservice design and awareness of centralized caching (Redis) and centralized configuration (Consul/ZooKeeper)
4. Hands-on experience of working on containers and their orchestration using Kubernetes
5. Hands-on experience of Linux and Windows operating systems
6. Worked on NoSQL databases like Cassandra, Aerospike, Mongo, or Couchbase, and on central logging, monitoring, and caching using stacks like ELK (Elastic) on the cloud, Prometheus, etc.
7. Has good knowledge of Network Security, Security Architecture and Secured SDLC practices






