
MLOps Lead Engineer
at IT solutions specialized in Apps Lifecycle management. (MG1)
- Automate and maintain ML and Data pipelines at scale
- Collaborate with Data Scientists and Data Engineers on feature development teams to containerize and build out deployment pipelines for new modules
- Maintain and expand our on-prem deployments with Spark clusters
- Design, build and optimize application containerization and orchestration with Docker, Kubernetes, and AWS or Azure
- 5 years of IT experience in data-driven or AI technology products
- Understanding of ML Model Deployment and Lifecycle
- Extensive experience in Apache Airflow for MLOps workflow automation
- Experience in building and automating data pipelines
- Experience in working on Spark Cluster architecture
- Extensive experience with Unix/Linux environments
- Experience with standard concepts and technologies used in CI/CD build, deployment pipelines using Jenkins
- Strong experience in Python and PySpark and building required automation (using standard technologies such as Docker, Jenkins, and Ansible).
- Experience with Kubernetes or Docker Swarm
- Working technical knowledge of current systems software, protocols, and standards, including firewalls, Active Directory, etc.
- Basic knowledge of Multi-tier architectures: load balancers, caching, web servers, application servers, and databases.
- Experience with various virtualization technologies and multi-tenant, private and hybrid cloud environments.
- Hands-on software and hardware troubleshooting experience.
- Experience documenting and maintaining configuration and process information.
- Basic knowledge of machine learning frameworks: TensorFlow, Caffe/Caffe2, PyTorch

Similar jobs
What does a successful Senior DevOps Engineer do at Fiserv?
This role’s focus will be on contributing and enhancing our DevOps environment within Issuer Solution group, where our cross functional Scrum teams are delivering solutions built on cutting-edge mobile technology and products. You will be expected to support across the wider business unit, leading DevOps practices and initiatives.
What will you do:
• Build, manage, and deploy CI/CD pipelines.
• DevOps Engineer - Helm Chart, Rundeck, OpenShift
• Strive for continuous improvement and build continuous integration, continuous delivery, and continuous deployment pipelines.
• Implementing various development, testing, automation tools, and IT infrastructure
• Optimize and automate release/development cycles and processes.
• Be part of and help promote our DevOps culture.
• Identify and implement continuous improvements to the development practice
What you must have:
• 3+ years of experience in devops with hands-on experience in the following:
- Writing automation scripts for deployments and housekeeping using shell scripts (Bash) and Ansible playbooks
- Building Docker images and running/managing Docker instances
- Building Jenkins pipelines using Groovy scripts
- Working knowledge of Kubernetes, including application deployments, managing application configurations and persistent volumes
• Good understanding of infrastructure as code
• Ability to write and update documentation
• Demonstrate a logical, process-oriented approach to problems and troubleshooting
• Ability to collaborate with multiple development teams
What you are preferred to have:
• 8+ years of development experience
• Jenkins administration experience
• Hands-on experience in building and deploying helm charts
Process Skills:
• Should have worked in Agile Project
Requirements:
• Previously held a DevOps Engineer or System Engineer role
• 4+ years of production Linux system admin experience in high traffic environment
• 1+ years of experience with Amazon AWS and related services (instances, ELB, EBS, S3, etc.) and abstractions on top of AWS.
• Strong understanding of network fundamentals, IP and related services (DNS, VPN, firewalls, etc.) and security concerns.
• Experience in running Docker and Kubernetes clusters in production.
• Love automating mundane tasks and making developers' lives easier
• Must be able to code in, at a minimum, Python (or Ruby) and Bash.
• Non-trivial production experience with SaltStack and/or Puppet, Composer, Jenkins, Git
• Agile software development best practices: continuous integration, releases, branches, etc.
• Experience with modern monitoring tools; capacity planning.
• Some experience with MySQL, PostgreSQL, ElasticSearch, Node.js, and PHP is a plus.
• Self-motivated, fast learner, detail-oriented, team player with a sense of humor
Experience in managing CI/CD using Jenkins.
Required Skills
• Automation is a part of your daily functions, so thorough familiarity with Unix Bourne shell scripting and Python is a critical survival skill.
• Integration and maintenance of automated tools
• Strong analytical and problem-solving skills
• Working experience with source control tools such as Git/GitHub/GitLab/TFS
• Have experience with modern virtualization technologies (Docker, KVM, AWS, OpenStack, or any orchestration platforms)
• Automation of deployment, customization, upgrades, and monitoring through modern DevOps tools (Ansible, Kubernetes, OpenShift, etc.)
• Advanced Linux admin experience
• Using Jenkins or similar tools
• Deep understanding of container orchestration (preferably Kubernetes)
• Strong knowledge of object storage (preferably Ceph on Rook)
• Experience in installing, managing & tuning microservices environments using Kubernetes & Docker both on-premise and on the cloud.
• Experience in deploying and managing Spring Boot applications.
• Experience in deploying and managing Python applications using Django, FastAPI, Flask.
• Experience in deploying machine learning pipelines/data pipelines using Airflow/Kubeflow/MLflow.
• Experience with web servers and reverse proxies like Nginx, Apache Server, HAProxy
• Experience in monitoring tools like Prometheus, Grafana.
• Experience in provisioning & maintaining SQL/NoSQL databases.
Desired Skills
• Configuration software: Ansible
• Excellent communication and collaboration skills
• Good experience with networking technologies such as load balancers, ACLs, firewalls, VIPs, and DNS
• Programmatic experience with AWS, DO, or GCP storage & machine images
• Experience on various Linux distributions
• Knowledge of Azure DevOps Server
• Docker management and troubleshooting
• Familiarity with micro-services and RESTful systems
• AWS / GCP / Azure certification
• Interact with the Engineering for supporting/maintaining/designing backend infrastructure for product support
• Create fully automated global cloud infrastructure that spans multiple regions.
• Great learning attitude to the newest technology and a Team player
environment. He/she must demonstrate a high level of ownership, integrity, and leadership skills and be flexible and adaptive with a strong desire to learn & excel.
Required Skills:
- Strong experience working with tools and platforms like Helm charts, CircleCI, Jenkins, and/or Codefresh
- Excellent knowledge of AWS offerings around Cloud and DevOps
- Strong expertise in containerization platforms like Docker and container orchestration platforms like Kubernetes & Rancher
- Should be familiar with leading Infrastructure as Code tools such as Terraform, CloudFormation, etc.
- Strong experience in Python, Shell Scripting, Ansible, and Terraform
- Good command of monitoring tools like Datadog, Zabbix, ELK, Grafana, CloudWatch, Stackdriver, Prometheus, JFrog, Nagios, etc.
- Experience with Linux/Unix systems administration.
We are looking for a DevOps Engineer for managing the interchange of data between the server and the users. Your primary responsibility will be the development of all server-side logic, definition, and maintenance of the central database, and ensuring high performance and responsiveness to request from the frontend. You will also be responsible for integrating the front-end elements built by your co-workers into the application. Therefore, a basic understanding of frontend technologies is necessary as well.
What we are looking for
- Must have strong knowledge of Kubernetes and Helm 3
- Should have previous experience in Dockerizing the applications.
- Should be able to automate manual tasks using Shell or Python
- Should have good working knowledge of AWS and GCP clouds
- Should have previous experience working on Bitbucket, Github, or any other VCS.
- Must be able to write Jenkins pipelines and have working knowledge of GitOps and Argo CD.
- Have hands-on experience in proactive monitoring using tools like New Relic, Prometheus, Grafana, Fluent Bit, etc.
- Should have a good understanding of ELK Stack.
- Exposure to Jira, Confluence, and sprints.
What you will do:
- Mentor junior DevOps engineers and raise the team’s bar
- Primary owner of tech best practices, tech processes, DevOps initiatives, and timelines
- Oversight of all server environments, from Dev through Production.
- Responsible for the automation and configuration management
- Provides stable environments for quality delivery
- Assist with day-to-day issue management.
- Take lead in containerising microservices
- Develop deployment strategies that allow DevOps engineers to successfully deploy code in any environment.
- Enable the automation of CI/CD
- Implement dashboards to monitor various systems and applications
- 1-3 years of experience in DevOps
- Experience in setting up front end best practices
- Working in high growth startups
- Ownership and being proactive.
- Mentorship & upskilling mindset.
What you’ll get:
- Health Benefits
- Innovation-driven culture
- Smart and fun team to work with
- Friends for life
- Good experience in AWS services like Elastic Compute Cloud(EC2), IAM, RDS, API Gateway, Cognito, etc.
- Using GIT, SonarQube, Ansible, Nexus, Nagios, etc.
- Strong experience in creating, importing and launching volumes with security groups, auto-scaling, Load Balancers, Fault-tolerant
- Experience in configuring Jenkins job with related Plugins for Building, Testing, and Continuous Deployment to accomplish the complete CI/CD.
Mandatory:
● A minimum of 1 year of development, system design or engineering experience
● Excellent social, communication, and technical skills
● In-depth knowledge of Linux systems
● Development experience in at least two of the following languages: PHP, Go, Python, JavaScript, C/C++, Bash
● In-depth knowledge of web servers (Apache, Nginx preferred)
● Strong in using DevOps tools: Ansible, Jenkins, Docker, ELK
● Knowledge of APM tools; New Relic is preferred
● Ability to learn quickly, master our existing systems and identify areas of improvement
● Self-starter that enjoys and takes pride in the engineering work of their team
● Tried and tested real-world cloud computing experience: AWS/GCP/Azure
● Strong understanding of resilient systems design
● Experience in Network Design and Management
Roles & Responsibilities:
- Design, implement and maintain all AWS infrastructure and services within a managed service environment
- Should be able to work 24x7 shifts to support infrastructure.
- Design, Deploy and maintain enterprise class security, network and systems management applications within an AWS environment
- Design and implement availability, scalability, and performance plans for the AWS managed service environment
- Continual re-evaluation of existing stack and infrastructure to maintain optimal performance, availability and security
- Manage the production deployment and deployment automation
- Implement process and quality improvements through task automation
- Institute infrastructure as code, security automation and automation of routine maintenance tasks
- Experience with containerization and orchestration tools like Docker and Kubernetes
- Build, deploy and manage Kubernetes clusters through automation
- Create and deliver knowledge sharing presentations and documentation for support teams
- Learn on the job and explore new technologies with little supervision
- Work effectively with onsite/offshore teams
Qualifications:
- Must have Bachelor's degree in Computer Science or related field and 4+ years of experience in IT
- Experience in designing, implementing, and maintaining all AWS infrastructure and services
- Design and implement availability, scalability, and performance plans for the AWS managed service environment
- Continual re-evaluation of existing stack and infrastructure to maintain optimal performance, availability, and security
- Hands-on technical expertise in Security Architecture, automation, integration, and deployment
- Familiarity with compliance & security standards across the enterprise IT landscape
- Extensive experience with Kubernetes and AWS (IAM, Route 53, SSM, S3, EFS, EBS, ELB, Lambda, CloudWatch, CloudTrail, SQS, SNS, RDS, CloudFormation, DynamoDB)
- Solid understanding of AWS IAM Roles and Policies
- Solid Linux experience with a focus on web (Apache Tomcat/Nginx)
- Experience with automation/configuration management using Terraform/Chef/Ansible or similar.
- Understanding of protocols/technologies like Microservices, HTTP/HTTPS, SSL/TLS, LDAP, JDBC, SQL, HTML
- Experience in managing and working with the offshore teams
- Familiarity with CI/CD systems such as Jenkins, GitLab CI
- Scripting experience (Python, Bash, etc.)
- AWS, Kubernetes Certification is preferred
- Ability to work with and influence Engineering teams
- Define and document best practices and strategies regarding application deployment and infrastructure maintenance.
- Ensure limited system failure and increase up-time and availability of the various company apps.
- Understand the current application infrastructure and strive for making it better.
- Automate infrastructure and develop tools and processes to improve the customer experience and reduce support time.
- Work closely with a team of developers and solution strategists to develop, deploy and troubleshoot the deployment and infrastructure issues.
- Manage full application stacks from the OS through custom applications using Amazon cloud-based computing environments.
- Set up a monitoring stack.
- Implement the application’s CI/CD pipeline using the AWS stack. Increasingly automate and improve the testing plans and development workflows and tools.
- Work closely with the engineers to design networks, systems, and storage environments that effectively reflect business needs, security requirements, and service level requirements.
- Manage a continuous integration/continuous deployment methodology for the server-based technologies.
- Proficient in leveraging CI and CD tools to automate testing and deployment. Experience working in an Agile, fast-paced, DevOps environment.
- Support internal and external customers on multiple platforms.
- First point of contact for handling customer issues, providing guidance and recommendations to increase efficiency and reduce customer incidents.
- Learn on the job and explore new technologies with little supervision.
- In addition to providing customer support, will be responsible for helping build tools and processes necessary for excellent customer outcomes.
Skills:
- Experience with the core AWS services, plus the specifics mentioned in this job description.
- Experience working with at least one of the following languages: Node.js, Python, PHP, Ruby, Kotlin or Java.
- Proficient with Git and Git workflows and hosted enterprise Git solutions like GitHub.
- Ability to troubleshoot distributed systems.
- Experience with AWS EKS Kubernetes infrastructure setup.
- Experience creating CloudFormation templates to create Auto Scaling groups, Route 53, DNS, back-end databases, Elastic Load Balancers, VPCs, subnets, security groups, CloudWatch, S3, IAM roles, and RDS DB instances, and to provision those instances and configure those resources to work together, reducing manual effort.
- Experience in deploying and monitoring microservices on Kubernetes, AWS ECS, and AWS EKS
- Security aware and ensures that all systems are security standards-compliant.
- Good background in Linux/Unix administration.
- Experience with building or maintaining cloud-native applications.
- Minimum 3-5 years of cloud development experience, preferably AWS
- Experience with CI/CD tools like Jenkins preferred.
- Good analytical and communication skills
- Bachelor’s Degree in Computer Science, Engineering or a related technical discipline








