![IT solutions specialized in Apps Lifecycle management. (MG1)'s logo](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fdefault_company_picture.jpg&w=3840&q=75)
MLOps Lead Engineer
at IT solutions specialized in Apps Lifecycle management. (MG1)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
- Automate and maintain ML and Data pipelines at scale
- Collaborate with Data Scientists and Data Engineers on feature development teams to containerize and build out deployment pipelines for new modules
- Maintain and expand our on-prem deployments with spark clusters
- Design, build and optimize applications containerization and orchestration with Docker and Kubernetes and AWS or Azure
- 5 years of IT experience in data-driven or AI technology products
- Understanding of ML Model Deployment and Lifecycle
- Extensive experience in Apache airflow for MLOps workflow automation
- Experience is building and automating data pipelines
- Experience in working on Spark Cluster architecture
- Extensive experience with Unix/Linux environments
- Experience with standard concepts and technologies used in CI/CD build, deployment pipelines using Jenkins
- Strong experience in Python and PySpark and building required automation (using standard technologies such as Docker, Jenkins, and Ansible).
- Experience with Kubernetes or Docker Swarm
- Working technical knowledge of current systems software, protocols, and standards, including firewalls, Active Directory, etc.
- Basic knowledge of Multi-tier architectures: load balancers, caching, web servers, application servers, and databases.
- Experience with various virtualization technologies and multi-tenant, private and hybrid cloud environments.
- Hands-on software and hardware troubleshooting experience.
- Experience documenting and maintaining configuration and process information.
- Basic Knowledge of machine learning frameworks: Tensorflow, Caffe/Caffe2, Pytorch
![companies logos](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fhiring_companies_logos-v2.webp&w=3840&q=80)
Similar jobs
- Configure, optimize, document, and support of the infrastructure components of software products (which are hosted in collocated facilities and cloud services such as AWS)
- Design and build tools and frameworks that support deployment and management and platforms
- Design, build, and deliver cloud computing solutions, hosted services, and underlying software infrastructures
- Build core functionality of our cloud-based platform product, deliver secure, reliable services and construct third party integrations
- Assist in coaching application developers on proper DevOps techniques for building scalable applications in the microservices paradigm
- Foster collaboration with software product development and architecture teams to ensure releases are delivered with repeatable and auditable processes
- Support and troubleshoot scalability, high availability, performance, monitoring, backup, and restores of different environments
- Work independently across multiple platforms and applications to understand dependencies
- Evaluate new tools, technologies, and processes to improve speed, efficiency, and scalability of continuous integration environments
- Design and architect solutions for existing client-facing applications as they are moved into cloud environments such as AWS
- Competencies
- Full understanding of scripting and automated process management in languages such as Shell, Ruby and/ or Python
- Working Knowledge SCM tools such as Git, GitHub, Bitbucket, etc.
- Working knowledge of Amazon Web Services and related APIs
- Ability to deliver and manage web or cloud-based services
- General familiarity with monitoring tools
- General familiarity with configuration/provisioning tools such as Terraform
- Experience
- Experience working within an Agile type environment
- 4+ years of experience with cloud-based provisioning (Azure, AWS, Google), monitoring, troubleshooting, and related DevOps technologies
- 4+ years of experience with containerization/orchestration technologies like Rancher, Docker and Kubernetes
DESIRED SKILLS AND EXPERIENCE
Strong analytical and problem-solving skills
Ability to work independently, learn quickly and be proactive
3-5 years overall and at least 1-2 years of hands-on experience in designing and managing DevOps Cloud infrastructure
Experience must include a combination of:
o Experience working with configuration management tools – Ansible, Chef, Puppet, SaltStack (expertise in at least one tool is a must)
o Ability to write and maintain code in at least one scripting language (Python preferred)
o Practical knowledge of shell scripting
o Cloud knowledge – AWS, VMware vSphere o Good understanding and familiarity with Linux
o Networking knowledge – Firewalls, VPNs, Load Balancers
o Web/Application servers, Nginx, JVM environments
o Virtualization and containers - Xen, KVM, Qemu, Docker, Kubernetes, etc.
o Familiarity with logging systems - Logstash, Elasticsearch, Kibana
o Git, Jenkins, Jira
Objectives :
- Building and setting up new development tools and infrastructure
- Working on ways to automate and improve development and release processes
- Testing code written by others and analyzing results
- Ensuring that systems are safe and secure against cybersecurity threats
- Identifying technical problems and developing software updates and ‘fixes’
- Working with software developers and software engineers to ensure that development follows established processes and works as intended
- Planning out projects and being involved in project management decisions
Daily and Monthly Responsibilities :
- Deploy updates and fixes
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Skills and Qualifications :
- Degree in Computer Science or Software Engineering or BSc in Computer Science, Engineering or relevant field
- 3+ years of experience as a DevOps Engineer or similar software engineering role
- Proficient with git and git workflows
- Good logical skills and knowledge of programming concepts(OOPS,Data Structures)
- Working knowledge of databases and SQL
- Problem-solving attitude
- Collaborative team spirit
Company Introduction :
My Next Developer is a global network of top talent in business, design, and technology that enables companies to scale their teams, on-demand.
We take the best elements of virtual teams and combine them with a support structure that encourages innovation, social interaction, and fun. We see no borders, move at a fast pace, and are never afraid to break the mold.
Job Responsibilities
- Ensure security in the development activities.
- Implement risk management techniques and threat modeling.
- Implement infrastructure automation, monitoring and alerts as part of ISO 27001 and SOC 2 certifications.
- Collaborate with internal teams to produce the best security solutions
Minimum requirements :
- Bachelor's/Master's degree in degree in computer science, cybersecurity, engineering, or equivalent degree.
- 3+ years of experience as a DevSecOps engineer.
- Proficiency in back-end technologies such as NodeJs or Python.
- Expertise in using DevOps tools like GitHub, dependency management, and CI/CD.
- Profound knowledge of AWS cloud, DevOps culture and automation tools.
- Fluent in both spoken and written English communication.
- Ability to work full-time (40 hours/week) with a 2-3 hour overlap with European time zone.
- Stay up-to-date on cybersecurity threats and follow the best practices.
- Previous project experience related to ISO 27001 and SOC 2 certifications.
- Good experience in AWS services like Elastic Compute Cloud(EC2), IAM, RDS, API Gateway, Cognito, etc.
- Using GIT, SonarQube, Ansible, Nexus, Nagios, etc.
- Strong experience in creating, importing and launching volumes with security groups, auto-scaling, Load Balancers, Fault-tolerant
- Experience in configuring Jenkins job with related Plugins for Building, Testing, and Continuous Deployment to accomplish the complete CI/CD.
Our client is a call management solutions company, which helps small to mid-sized businesses use its virtual call center to manage customer calls and queries. It is an AI and cloud-based call operating facility that is affordable as well as feature-optimized. The advanced features offered like call recording, IVR, toll-free numbers, call tracking, etc are based on automation and enhances the call handling quality and process, for each client as per their requirements. They service over 6,000 business clients including large accounts like Flipkart and Uber.
- Being involved in Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups, and Monitoring.
- Ensuring reliable operation of CI/ CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creating Docker files
- Creating Bash/ Python scripts for automation.
- Performing root cause analysis for production errors.
What you need to have:
- Proficient in Linux Commands line and troubleshooting.
- Proficient in AWS Services. Deployment, Monitoring and troubleshooting applications in AWS.
- Hands-on experience with CI tooling preferably with Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as cloud) such as terraform, AWS cloudformation etc.
- Proficient in deployment of applications behind load balancers and proxy servers such as nginx, apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with Logging, Monitoring, and Alerting tools like ELK(Elastic-search, Logstash, Kibana), Nagios. Graylog, splunk Prometheus, Grafana is a plus.
A.P.T Portfolio, a high frequency trading firm that specialises in Quantitative Trading & Investment Strategies.Founded in November 2009, it has been a major liquidity provider in global Stock markets.
As a manager, you would be incharge of managing the devops team and your remit shall include the following
- Private Cloud - Design & maintain a high performance and reliable network architecture to support HPC applications
- Scheduling Tool - Implement and maintain a HPC scheduling technology like Kubernetes, Hadoop YARN Mesos, HTCondor or Nomad for processing & scheduling analytical jobs. Implement controls which allow analytical jobs to seamlessly utilize ideal capacity on the private cloud.
- Security - Implementing best security practices and implementing data isolation policy between different divisions internally.
- Capacity Sizing - Monitor private cloud usage and share details with different teams. Plan capacity enhancements on a quarterly basis.
- Storage solution - Optimize storage solutions like NetApp, EMC, Quobyte for analytical jobs. Monitor their performance on a daily basis to identify issues early.
- NFS - Implement and optimize latest version of NFS for our use case.
- Public Cloud - Drive AWS/Google-Cloud utilization in the firm for increasing efficiency, improving collaboration and for reducing cost. Maintain the environment for our existing use cases. Further explore potential areas of using public cloud within the firm.
- BackUps - Identify and automate back up of all crucial data/binary/code etc in a secured manner at such duration warranted by the use case. Ensure that recovery from back-up is tested and seamless.
- Access Control - Maintain password less access control and improve security over time. Minimize failures for automated job due to unsuccessful logins.
- Operating System -Plan, test and roll out new operating system for all production, simulation and desktop environments. Work closely with developers to highlight new performance enhancements capabilities of new versions.
- Configuration management -Work closely with DevOps/ development team to freeze configurations/playbook for various teams & internal applications. Deploy and maintain standard tools such as Ansible, Puppet, chef etc for the same.
- Data Storage & Security Planning - Maintain a tight control of root access on various devices. Ensure root access is rolled back as soon the desired objective is achieved.
- Audit access logs on devices. Use third party tools to put in a monitoring mechanism for early detection of any suspicious activity.
- Maintaining all third party tools used for development and collaboration - This shall include maintaining a fault tolerant environment for GIT/Perforce, productivity tools such as Slack/Microsoft team, build tools like Jenkins/Bamboo etc
Qualifications
- Bachelors or Masters Level Degree, preferably in CSE/IT
- 10+ years of relevant experience in sys-admin function
- Must have strong knowledge of IT Infrastructure, Linux, Networking and grid.
- Must have strong grasp of automation & Data management tools.
- Efficient in scripting languages and python
Desirables
- Professional attitude, co-operative and mature approach to work, must be focused, structured and well considered, troubleshooting skills.
- Exhibit a high level of individual initiative and ownership, effectively collaborate with other team members.
APT Portfolio is an equal opportunity employer
One of our US based client is looking for a Devops professional who can handle Technical as well as Trainings for them in US.
If you are hired, you will be sent to US for the working from there. Training & Technical work ratio will be 70% & 30% respectively.
Company Will sponsor for US Visa.
If you are an Experienced Devops professional and also given professional trainings then feel free to connect with us for more.
Implement integrations requested by customers
Deploy updates and fixes
Provide Level 2 technical support
Build tools to reduce occurrences of errors and improve customer experience
Develop software to integrate with internal back-end systems
Perform root cause analysis for production errors
Investigate and resolve technical issues
Develop scripts to automate visualization
Design procedures for system troubleshooting and maintenance
Multiple Clouds [AWS/Azure/GCP] hands on experience
Good Experience on Docker implementation at scale.
Kubernets implementation and orchestration.
Job Brief:
We are looking for candidates that have experience in development and have performed CI/CD based projects. Should have a good hands-on Jenkins Master-Slave architecture, used AWS native services like CodeCommit, CodeBuild, CodeDeploy and CodePipeline. Should have experience in setting up cross platform CI/CD pipelines which can be across different cloud platforms or on-premise and cloud platform.
Job Location:
Pune.
Job Description:
- Hands on with AWS (Amazon Web Services) Cloud with DevOps services and CloudFormation.
- Experience interacting with customer.
- Excellent communication.
- Hands-on in creating and managing Jenkins job, Groovy scripting.
- Experience in setting up Cloud Agnostic and Cloud Native CI/CD Pipelines.
- Experience in Maven.
- Experience in scripting languages like Bash, Powershell, Python.
- Experience in automation tools like Terraform, Ansible, Chef, Puppet.
- Excellent troubleshooting skills.
- Experience in Docker and Kuberneties with creating docker files.
- Hands on with version control systems like GitHub, Gitlab, TFS, BitBucket, etc.
- 2+ years of demonstrable experience leading site reliability and performance in large-scale, high-traffic environments
- 2+ years of hands-on experience as a DevOps engineer
- Strong leadership, communication and interpersonal skills geared to getting things done
- Developing themselves and the talent within their charge – fostering and creating opportunity for the team
- Strong understanding of SRE concepts and the DevOps culture. Set the direction and strategy for your team, and help shape the overall SRE program for the company
- Be able to lead complicated technical issues and communicating status updates/RCA with management and customers.
- Own site stability, performance, capacity planning, DevOps recruitment.
![icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fsearch.png&w=48&q=75)
![companies logos](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fhiring_companies_logos-v2.webp&w=3840&q=80)