
As an MLOps Engineer at QuantumBlack, you will:
Develop and deploy technology that enables data scientists and data engineers to build, productionize, and deploy machine learning models following best practices. Work to set the standards for SWE and DevOps practices within multi-disciplinary delivery teams.
Choose and use the right cloud services, DevOps tooling, and ML tooling so that your team can produce high-quality code and release to production.
Build modern, scalable, and secure CI/CD pipelines to automate development and deployment workflows used by data scientists (ML pipelines) and data engineers (data pipelines).
Shape and support next-generation technology that enables scaling ML products and platforms. Bring expertise in cloud to enable ML use case development, including MLOps.
Our Tech Stack:
We leverage AWS, Google Cloud, Azure, Databricks, Docker, Kubernetes, Argo, Airflow, Kedro, Python, Terraform, GitHub Actions, MLflow, Node.js, React, and TypeScript, among others, in our projects.
Key Skills:
• Expert hands-on knowledge of cloud platform infrastructure and administration (Azure/AWS/GCP), with strong knowledge of cloud services integration and cloud security
• Expertise setting up CI/CD processes and building and maintaining secure DevOps pipelines with at least 2 major DevOps stacks (e.g., Azure DevOps, GitLab, Argo)
• Experience with modern development methods and tooling: containers (e.g., Docker) and container orchestration (K8s), CI/CD tools (e.g., CircleCI, Jenkins, GitHub Actions, Azure DevOps), version control (Git, GitHub, GitLab), and orchestration/DAG tools (e.g., Argo, Airflow, Kubeflow)
• Hands-on coding skills in Python 3 (e.g., APIs), including automated testing frameworks and libraries (e.g., pytest), as well as Infrastructure as Code (e.g., Terraform) and Kubernetes artifacts (e.g., deployments, operators, Helm charts)
• Experience setting up at least one contemporary MLOps tool (e.g., experiment tracking, model governance, packaging, deployment, feature store)
• Practical knowledge of delivering and maintaining production software such as APIs and cloud infrastructure
• Knowledge of SQL (intermediate level or higher preferred) and familiarity working with at least one common RDBMS (MySQL, Postgres, SQL Server, Oracle)

Job Description
Experience: 5 - 9 years
Location: Bangalore/Pune/Hyderabad
Work Mode: Hybrid (3 days WFO)
Senior Cloud Infrastructure Engineer for Data Platform
The ideal candidate will play a critical role in designing, implementing, and maintaining cloud infrastructure and CI/CD pipelines to support scalable, secure, and efficient data and analytics solutions. This role requires a strong understanding of cloud-native technologies, DevOps best practices, and hands-on experience with Azure and Databricks.
Key Responsibilities:
Cloud Infrastructure Design & Management
Architect, deploy, and manage scalable and secure cloud infrastructure on Microsoft Azure.
Implement best practices for Azure Resource Management, including resource groups, virtual networks, and storage accounts.
Optimize cloud costs and ensure high availability and disaster recovery for critical systems.
Databricks Platform Management
Set up, configure, and maintain Databricks workspaces for data engineering, machine learning, and analytics workloads.
Automate cluster management, job scheduling, and monitoring within Databricks.
Collaborate with data teams to optimize Databricks performance and ensure seamless integration with Azure services.
CI/CD Pipeline Development
Design and implement CI/CD pipelines for deploying infrastructure, applications, and data workflows using tools like Azure DevOps, GitHub Actions, or similar.
Automate testing, deployment, and monitoring processes to ensure rapid and reliable delivery of updates.
Monitoring & Incident Management
Implement monitoring and alerting solutions using tools like Dynatrace, Azure Monitor, Log Analytics, and Databricks metrics.
Troubleshoot and resolve infrastructure and application issues, ensuring minimal downtime.
Security & Compliance
Enforce security best practices, including identity and access management (IAM), encryption, and network security.
Ensure compliance with organizational and regulatory standards for data protection and cloud operations.
Collaboration & Documentation
Work closely with cross-functional teams, including data engineers, software developers, and business stakeholders, to align infrastructure with business needs.
Maintain comprehensive documentation for infrastructure, processes, and configurations.
Required Qualifications
Education: Bachelor’s degree in Computer Science, Engineering, or a related field.
Must Have Experience:
6+ years of experience in DevOps or Cloud Engineering roles.
Proven expertise in Microsoft Azure services, including Azure Data Lake, Azure Databricks, Azure Data Factory (ADF), Azure Functions, Azure Kubernetes Service (AKS), and Azure Active Directory.
Hands-on experience with Databricks for data engineering and analytics.
Technical Skills:
Proficiency in Infrastructure as Code (IaC) tools like Terraform, ARM templates, or Bicep.
Strong scripting skills in Python or Bash.
Experience with containerization and orchestration tools like Docker and Kubernetes.
Familiarity with version control systems (e.g., Git) and CI/CD tools (e.g., Azure DevOps, GitHub Actions).
Soft Skills:
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Please Apply - https://zrec.in/GzLLD?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adopt DevOps culture in their organization by focusing on a long-term DevOps roadmap. We identify technical and cultural issues on the journey to successfully implementing DevOps practices in the organization, and we work with the respective teams to fix those issues and increase overall productivity. We also run training sessions for developers to help them understand the importance of DevOps. We provide these services: DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We assess technology architecture, security, governance, compliance, and DevOps maturity for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: DevOps Engineer AWS
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 2-4 years
Education: B.Tech/MCA/BCA
Notice Period: Immediate
Infra360.io is searching for a DevOps Engineer to lead our group of IT specialists in maintaining and improving our software infrastructure. You'll collaborate with software engineers, QA engineers, and other IT pros in deploying, automating, and managing the software infrastructure. As a DevOps engineer you will also be responsible for setting up CI/CD pipelines, monitoring programs, and cloud infrastructure.
Below is a detailed description of the responsibilities and expectations for the role.
Tech Stack:
- Kubernetes: Deep understanding of Kubernetes clusters, container orchestration, and its architecture.
- Terraform: Extensive hands-on experience with Infrastructure as Code (IaC) using Terraform for managing cloud resources.
- ArgoCD: Experience in continuous deployment and using ArgoCD to maintain GitOps workflows.
- Helm: Expertise in Helm for managing Kubernetes applications.
- Cloud Platforms: Expertise in AWS, GCP or Azure will be an added advantage.
- Debugging and Troubleshooting: The DevOps Engineer must be proficient in identifying and resolving complex issues in a distributed environment, ranging from networking issues to misconfigurations in infrastructure or application components.
Key Responsibilities:
- CI/CD and configuration management
- Doing RCA of production issues and providing resolution
- Setting up failover, DR, backups, logging, monitoring, and alerting
- Containerizing different applications on the Kubernetes platform
- Capacity planning for the infrastructure of different environments
- Ensuring zero outages of critical services
- Database administration of SQL and NoSQL databases
- Infrastructure as Code (IaC)
- Keeping the cost of the infrastructure to a minimum
- Setting up the right set of security measures
Ideal Candidate Profile:
- A graduate or postgraduate degree in Computer Science or a related field
- 2-4 years of strong DevOps experience with the Linux environment.
- Strong interest in working in our tech stack
- Excellent communication skills
- Able to work with minimal supervision and enjoys being a self-starter
- Hands-on experience with at least one scripting language: Bash, Python, Go, etc.
- Experience with version control systems like Git
- Strong experience with Amazon Web Services (EC2, RDS, VPC, S3, Route53, IAM, etc.)
- Strong experience with managing the Production Systems day in and day out
- Experience in finding issues in different layers of architecture in production environment and fixing them
- Knowledge of SQL and NoSQL databases, ElasticSearch, Solr etc.
- Knowledge of Networking, Firewalls, load balancers, Nginx, Apache etc.
- Experience in automation tools like Ansible/SaltStack and Jenkins
- Experience in Docker/Kubernetes platform and managing OpenStack (desirable)
- Experience with HashiCorp tools (e.g., Vault, Vagrant, Terraform, Consul) and VirtualBox (desirable)
- Experience with managing/mentoring small team of 2-3 people (desirable)
- Experience in Monitoring tools like Prometheus/Grafana/Elastic APM.
- Experience in logging tools like ELK/Loki.
True to its name, CashFlo is on a mission to unlock $100+ billion of trapped working capital in the economy by creating India’s largest marketplace for invoice discounting to solve the day-to-day problems faced by businesses. Founded by ex-BCG and ISB / IIM alumni, and backed by SAIF Partners, CashFlo helps democratize access to credit in a fair and transparent manner. Awarded Supply Chain Finance solution of the year in 2019, CashFlo creates a win-win ecosystem for buyers, suppliers, and financiers through its unique platform model. CashFlo shares its parentage with HCS Ltd., a 25-year-old, highly reputed financial services company that has raised over Rs. 15,000 Crores in the market to date for over 200 corporate clients.
Our leadership team consists of ex-BCG and ISB / IIM alumni, with a team of industry veterans from financial services serving as the advisory board. We bring to the table deep insights into the SME lending space, based on 100+ years of combined experience in financial services. We are a team of passionate problem solvers and are looking for like-minded people to join our team.
The challenge
Solve a complex $300+ billion problem at the cutting edge of Fintech innovation, and make a tangible difference to the small business landscape in India.
Find innovative solutions for problems in a yet-to-be-discovered market.
Key Responsibilities
As an early team member, you will get a chance to set the foundations of our engineering culture. You will help articulate our engineering principles and help set the long-term roadmap.
Making decisions on the evolution of CashFlo's technical architecture
Building new features end to end, from talking to customers to writing code.
Our Ideal Candidate Will Have
3+ years of full-time DevOps engineering experience
Hands-on experience working with AWS services
Deep understanding of virtualization and orchestration tools like Docker, ECS
Experience in writing Infrastructure as Code using tools like CDK, Cloud formation, Terragrunt or Terraform
Experience using centralized logging & monitoring tools such as ELK, CloudWatch, DataDog
Built monitoring dashboards using Prometheus, Grafana
Built and maintained code pipelines and CI/CD
Thorough knowledge of SDLC
Been part of teams that have maintained large deployments
About You
Product-minded. You have a sense for great user experience and a feel for when something is off. You love understanding customer pain points and solving for them.
Get a lot done. You enjoy all aspects of building a product and are comfortable moving across the stack when necessary. You solve problems independently and enjoy figuring stuff out.
High conviction. When you commit to something, you're in all the way. You're opinionated, but you know when to disagree and commit. Mediocrity is the worst of all possible outcomes.
What's in it for me
Gain exposure to the Fintech space - one of the largest and fastest growing markets in India and globally
Shape India’s B2B Payments landscape through cutting edge technology innovation
Be directly responsible for driving the company’s success
Join a high performance, dynamic and collaborative work environment that throws new challenges on a daily basis
Fast track your career with trainings, mentoring, growth opportunities on both IC and management track
Work-life balance and fun team events
5 to 10 years of software development & coding experience
Experience with Infrastructure as Code development (automation, CI/CD); AWS CloudFormation, AWS CodeBuild, and CodeDeploy are must-haves.
Experience troubleshooting AWS policy or permissions-related errors during resource deployments
Programming experience; Python, PowerShell, or Bash development experience preferred
Experience with application build automation tools like Apache Maven, Jenkins, Concourse, and Git supporting continuous integration / continuous deployment (CI/CD) capabilities. GitHub and GitHub Actions for deployments are must-have skills (Maven, Jenkins, etc. are nice to have)
Have configuration management experience (Chef, Puppet, or Ansible)
Worked in a Development Shop or have SDLC hands on Experience
Familiar with how to write software, test plans, automate and release using modern development methods
AWS certified at an appropriate level
Looking for a GCP DevOps Engineer who can join immediately or within 15 days
Job Summary & Responsibilities:
Job Overview:
You will work in engineering and development teams to integrate and develop cloud solutions and virtualized deployments of software-as-a-service products. This requires understanding the software system architecture as well as its performance and security requirements. The DevOps Engineer is also expected to have expertise in available cloud solutions and services, administration of virtual machine clusters, performance tuning and configuration of cloud computing resources, security configuration, and scripting and automation of monitoring functions. This position requires deploying and managing multiple virtual clusters and working with compliance organizations to support security audits. Designing and selecting cloud computing solutions that are reliable, robust, extensible, and easy to migrate is also important.
Experience:
Experience working on billing and budgets for a GCP project - MUST
Experience working on optimizations on GCP based on vendor recommendations - NICE TO HAVE
Experience in implementing the recommendations on GCP
Architect Certifications on GCP - MUST
Excellent communication skills (both verbal & written) - MUST
Excellent documentation skills for processes, steps, and instructions - MUST
At least 2 years of experience on GCP.
Basic Qualifications:
● Bachelor’s/Master’s Degree in Engineering OR Equivalent.
● Extensive scripting or programming experience (Shell Script, Python).
● Extensive experience working with CI/CD (e.g. Jenkins).
● Extensive experience working with GCP, Azure, or Cloud Foundry.
● Experience working with databases (PostgreSQL, Elasticsearch).
● Must have a minimum of 2 years of experience with GCP, along with GCP certification.
Benefits:
● Competitive salary.
● Work from anywhere.
● Learning and gaining experience rapidly.
● Reimbursement for basic working set up at home.
● Insurance (including top-up insurance for COVID).
Location:
Remote - work from anywhere.
Description
Do you dream about code every night? If so, we’d love to talk to you about a new product that we’re making to enable delightful testing experiences at scale for development teams who build modern software solutions.
What You'll Do
Troubleshooting and analyzing technical issues raised by internal and external users.
Working with Monitoring tools like Prometheus / Nagios / Zabbix.
Developing automation in one or more technologies such as Terraform, Ansible, CloudFormation, Puppet, or Chef (preferred).
Monitor infrastructure alerts and take proactive action to avoid downtime and customer impacts.
Working closely with the cross-functional teams to resolve issues.
Design, build, test, deploy, and maintain continuous integration and continuous delivery processes using tools like Jenkins, Maven, Git, etc.
Work in close coordination with the development and operations teams so that application performance meets the customer's expectations.
What you should have
Bachelor’s or Master’s degree in computer science or any related field.
3 - 6 years of experience in Linux / Unix, cloud computing techniques.
Familiar with working on cloud and datacenter for enterprise customers.
Hands-on experience with Linux / Windows / macOS and Batch/AppleScript/Bash scripting.
Experience with various databases such as MongoDB, PostgreSQL, MySQL, MSSQL.
Familiar with AWS technologies like EC2, S3, Lambda, IAM, etc.
Must know how to choose the tools and technologies that best fit the business needs.
Experience in developing and maintaining CI/CD processes using tools like Git, GitHub, Jenkins etc.
Excellent organizational skills to adapt to a constantly changing technical environment
Type, Location
Full Time @ Anywhere in India
Desired Experience
2+ years
Job Description
What You’ll Do
● Deploy, automate and maintain web-scale infrastructure with leading public cloud vendors such as Amazon Web Services, Digital Ocean & Google Cloud Platform.
● Take charge of DevOps activities for CI/CD with the latest tech stacks.
● Acquire industry-recognized, professional cloud certifications (AWS/Google) in the capacity of developer or architect.
● Devise multi-region technical solutions.
● Implement the DevOps philosophy and strategy across different domains in the organisation.
● Build automation at various levels, including code deployment to streamline release process
● Will be responsible for architecture of cloud services
● 24x7 monitoring of the infrastructure
● Use programming/scripting in your day-to-day work
● Have shell experience - for example PowerShell on Windows, or Bash on *nix
● Use a Version Control System, preferably git
● Hands-on experience with at least one CLI/SDK/API of at least one public cloud (GCP, AWS, DO)
● Scalability, HA and troubleshooting of web-scale applications.
● Infrastructure-As-Code tools like Terraform, CloudFormation
● CI/CD systems such as Jenkins, CircleCI
● Container technologies such as Docker, Kubernetes, OpenShift
● Monitoring and alerting systems: e.g. New Relic, AWS CloudWatch, Google Stackdriver, Graphite, Nagios/Icinga
What you bring to the table
● Hands-on experience in Cloud compute services, Cloud Functions, Networking, Load balancing, Autoscaling.
● Hands-on experience with GCP/AWS Compute & Networking services, e.g. Compute Engine, App Engine, Kubernetes Engine, Cloud Functions, Networking (VPC, Firewall, Load Balancer), Cloud SQL, Datastore.
● DBs: PostgreSQL, MySQL, Elasticsearch, Redis, Kafka, MongoDB or other NoSQL systems
● Configuration management tools such as Ansible/Chef/Puppet
Bonus if you have…
● Basic understanding of networking (routing, switching, DNS) and storage
● Basic understanding of protocols such as UDP/TCP
● Basic understanding of Cloud computing
● Basic understanding of Cloud computing models like SaaS, PaaS
● Basic understanding of git or any other source code repo
● Basic understanding of databases (SQL/NoSQL)
● Great problem-solving skills
● Good communication skills
● Openness to learning
- Essential Skills:
- Docker
- Jenkins
- Python dependency management using conda and pip
- Base Linux System Commands, Scripting
- Docker Container Build & Testing
- Common knowledge of minimizing container size and layers
- Inspecting containers for unused / underutilized systems
- Multiple Linux OS support for virtual systems
- Has experience as a user of Jupyter / JupyterLab to test and fix usability issues in workbenches
- Templating out various configurations for different use cases (we use Python Jinja2 but are open to other languages / libraries)
- Jenkins Pipeline
- Understanding of the GitHub API to trigger builds, tags, releases
- Artifactory Experience
- Nice to have: Kubernetes, ArgoCD, other deployment automation tool sets (DevOps)
About the Company
Blue Sky Analytics is a Climate Tech startup that combines the power of AI & Satellite data to aid in the creation of a global environmental data stack. Our funders include Beenext and Rainmatter. Over the next 12 months, we aim to expand to 10 environmental datasets spanning water, land, heat, and more!
We are looking for a DevOps Engineer who can help us build the infrastructure required to handle huge datasets at scale. Primarily, you will work with AWS services like EC2, Lambda, ECS, Containers, etc. As part of our core development crew, you’ll be figuring out how to deploy applications ensuring high availability and fault tolerance, along with a monitoring solution that has alerts for multiple microservices and pipelines. Come save the planet with us!
Your Role
- Build applications at scale that can go up and down on command.
- Manage a cluster of microservices talking to each other.
- Build pipelines for huge data ingestion, processing, and dissemination.
- Optimize services for low cost and high efficiency.
- Maintain a highly available and scalable PostgreSQL database cluster.
- Maintain alert and monitoring system using Prometheus, Grafana, and Elasticsearch.
Requirements
- 1-4 years of work experience.
- Strong emphasis on Infrastructure as Code - CloudFormation, Terraform, Ansible.
- CI/CD concepts and implementation using CodePipeline, GitHub Actions.
- Advanced command of AWS services like IAM, EC2, ECS, Lambda, S3, etc.
- Advanced Containerization - Docker, Kubernetes, ECS.
- Experience with managed services like database cluster, distributed services on EC2.
- Self-starters and curious folks who don't need to be micromanaged.
- Passionate about Blue Sky Climate Action and working with data at scale.
Benefits
- Work from anywhere: Work by the beach or from the mountains.
- Open source at heart: We are building a community that you can use, contribute to, and collaborate in.
- Own a slice of the pie: Possibility of becoming an owner by investing in ESOPs.
- Flexible timings: Fit your work around your lifestyle.
- Comprehensive health cover: Health cover for you and your dependents to keep you tension free.
- Work Machine of choice: Buy a device and own it after completing a year at BSA.
- Quarterly Retreats: Yes, there's work, but then there's all the non-work fun, aka the retreat!
- Yearly vacations: Take time off to rest and get ready for the next big assignment by availing the paid leaves.
Role
We are looking for an experienced DevOps engineer who will help our team establish DevOps practices. You will work closely with the technical lead to identify and establish DevOps practices in the company.
You will also help us build scalable, efficient cloud infrastructure. You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices.
This would be a hybrid role and the person would be expected to also do some application level programming in their downtime.
Responsibilities
- Deployment, automation, management, and maintenance of production systems.
- Ensuring availability, performance, security, and scalability of production systems.
- Evaluation of new technology alternatives and vendor products.
- System troubleshooting and problem resolution across various application domains and platforms.
- Providing recommendations for architecture and process improvements.
- Definition and deployment of systems for metrics, logging, and monitoring on AWS platform.
- Manage the establishment and configuration of SaaS infrastructure in an agile way by storing infrastructure as code and employing automated configuration management tools with a goal to be able to re-provision environments at any point in time.
- Be accountable for proper backup and disaster recovery procedures.
- Drive operational cost reductions through service optimizations and demand based auto scaling.
- Have on call responsibilities.
- Perform root cause analysis for production errors
- Use open source technologies and tools to accomplish specific use cases encountered within the project.
- Use coding languages or scripting methodologies to solve a problem with a custom workflow.
Requirements
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Prior experience as a software developer in a couple of high level programming languages.
- Extensive experience in any JavaScript-based framework, since we will be deploying services to Node.js on AWS Lambda (Serverless)
- Extensive experience with web servers such as Nginx/Apache
- Strong Linux system administration background.
- Ability to present and communicate the architecture in a visual form.
- Strong knowledge of AWS (e.g. IAM, EC2, VPC, ELB, ALB, Autoscaling, Lambda, NAT gateway, DynamoDB)
- Experience maintaining and deploying highly-available, fault-tolerant systems at scale (~ 1 Lakh users a day)
- A drive towards automating repetitive tasks (e.g. scripting via Bash, Python, Ruby, etc)
- Expertise with Git
- Experience implementing CI/CD (e.g. Jenkins, TravisCI)
- Strong experience with databases such as MySQL, NoSQL, Elasticsearch, Redis and/or Mongo.
- Stellar troubleshooting skills with the ability to spot issues before they become problems.
- Current with industry trends, IT ops and industry best practices, and able to identify the ones we should implement.
- Time and project management skills, with the capability to prioritize and multitask as needed.










