![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
Experience of Linux
Experience using Python or Shell scripting (for Automation)
Hands-on experience with Implementation of CI/CD Processes
Experience working with one cloud platforms (AWS or Azure or Google)
Experience working with configuration management tools such as Ansible & Chef
Experience working with Containerization tool Docker.
Experience working with Container Orchestration tool Kubernetes.
Experience in source Control Management including SVN and/or Bitbucket
& GitHub
Experience with setup & management of monitoring tools like Nagios, Sensu & Prometheus or any other popular tools
Hands-on experience in Linux, Scripting Language & AWS is mandatory
Troubleshoot and Triage development, Production issues
![companies logos](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fhiring_companies_logos-v2.webp&w=3840&q=80)
About Opt IT Technologies
About
Connect with the team
Company social profiles
Similar jobs
- At least 5 year of experience in Cloud technologies-AWS and Azure and developing.
- Experience in implementing DevOps practices and DevOps-tools in areas like CI/CD using Jenkins environment automation, and release automation, virtualization, infra as a code or metrics tracking.
- Hands on experience in DevOps tools configuration in different environments.
- Strong knowledge of working with DevOps design patterns, processes and best practices
- Hand-on experience in Setting up Build pipelines.
- Prior working experience in system administration or architecture in Windows or Linux.
- Must have experience in GIT (BitBucket, GitHub, GitLab)
- Hands-on experience on Jenkins pipeline scripting.
- Hands-on knowledge in one scripting language (Nant, Perl, Python, Shell or PowerShell)
- Configuration level skills in tools like SonarQube (or similar tools) and Artifactory.
- Expertise on Virtual Infrastructure (VMWare or VirtualBox or QEMU or KVM or Vagrant) and environment automation/provisioning using SaltStack/Ansible/Puppet/Chef
- Deploying, automating, maintaining and managing Azure cloud based production systems including monitoring capacity.
- Good to have experience in migrating code repositories from one source control to another.
- Hands-on experience in Docker container and orchestration based deployments like Kubernetes, Service Fabric, Docker swarm.
- Must have good communication skills and problem solving skills
We are looking for a DevOps Lead to join our team.
Responsibilities
• A technology Professional who understands software development and can solve IT Operational and deployment challenges using software engineering tools and processes. This position requires an understanding of both Software development (Dev) and deployment
Operations (Ops)
• Identity manual processes and automate them using various DevOps automation tools
• Maintain the organization’s growing cloud infrastructure
• Monitor and maintain DevOps environment stability
• Collaborate with distributed Agile teams to define technical requirements and resolve technical design issues
• Orchestrating builds and test setups using Docker and Kubernetes.
• Participate in designing and building Kubernetes, Cloud, and on-prem environments for maximum performance, reliability and scalability
• Share business and technical learnings with the broader engineering and product organization, while adapting approaches for different audiences
Requirements
• Candidates working for this position should possess at least 5 years of work experience as a DevOps Engineer.
• Candidate should have experience in ELK stack, Kubernetes, and Docker.
• Solid experience in the AWS environment.
• Should have experience in monitoring tools like DataDog or Newrelic.
• Minimum of 5 years experience with code repository management, code merge and quality checks, continuous integration, and automated deployment & management using tools like Jenkins, SVN, Git, Sonar, and Selenium.
• Candidates must possess ample knowledge and experience in system automation, deployment, and implementation.
• Candidates must possess experience in using Linux, Jenkins, and ample experience in configuring and automating the monitoring tools.
• The candidates should also possess experience in the software development process and tools and languages like SaaS, Python, Java, MongoDB, Shell scripting, Python, PostgreSQL, and Git.
• Candidates should demonstrate knowledge in handling distributed data systems.
Examples: Elastisearch, Cassandra, Hadoop, and others.
• Should have experience in GitLab- CIRoles and Responsibilities
Key Responsibilities:
- Work with the development team to plan, execute and monitor deployments
- Capacity planning for product deployments
- Adopt best practices for deployment and monitoring systems
- Ensure the SLAs for performance, up time are met
- Constantly monitor systems, suggest changes to improve performance and decrease costs.
- Ensure the highest standards of security
Key Competencies (Functional):
- Proficiency in coding in atleast one scripting language - bash, Python, etc
- Has personally managed a fleet of servers (> 15)
- Understand different environments production, deployment and staging
- Worked in micro service / Service oriented architecture systems
- Has worked with automated deployment systems – Ansible / Chef / Puppet.
- Can write MySQL queries
- 7-10 years’ total experience, including 6+ years in a production 24/7 high-availability
- multi-site Cloud environment, including application hosting, CDN Networks, security and information protection.
- Experience of leading overall infrastructure for a complex organization and network, including 24x7 monitoring of a media website & digital properties.
- Experience in hosting and managing video streaming applications, React & Node JS based applications.
- Experience with regulatory compliance issues, as well best practices in application and network security.
- Experience in hosting services on Amazon Cloud & Google Cloud.
- Experience in performing Vulnerability Assessment at server & application level
- Experience in managing Live Streaming on digital platforms.
- Experience in managing SVN, Git code repository & Code release management.
- Partners with Technology head lead the technology infrastructure strategy and execution for the enterprise
- Planning, project management and implementation leadership, identifying opportunities for automation, cost savings, and service quality improvement.
- Provides infrastructure services vision, enables innovation and seeks to leverage market trends that can create business value consistent with the company’s requirements and expectations.
- Participate in the formulation of the company's enterprise architecture and business system plans; assessing cost and feasibility, and ensuring the plan is aligned with and supports the strategic goals of the business
- Hands-on technical depth enables direct oversight, problem-solving leadership and participation for complex infrastructure implementation, system upgrades and operational troubleshooting.
- Experience with comprehensive disaster recovery architecture and operations, including storage area network and redundant, highly-available server and network architectures.
- Leadership for delivery of 24/7 service operations and KPI compliance.
- Ensure best practices are followed for code release management & monitoring of traffic on websites & other applications.
Summary
We are building the fastest, most reliable & intelligent trading platform. That requires highly available, scalable & performant systems. And you will be playing one of the most crucial roles in making this happen.
You will be leading our efforts in designing, automating, deploying, scaling and monitoring all our core products.
Tech Facts so Far
1. 8+ services deployed on 50+ servers
2. 35K+ concurrent users on average
3. 1M+ algorithms run every min
4. 100M+ messages/min
We are a 4-member backend team with 1 Devops Engineer. Yes! this is all done by this incredible lean team.
Big Challenges for You
1. Manage 25+ services on 200+ servers
2. Achieve 99.999% (5 Nines) availability
3. Make 1-minute automated deployments possible
If you like to work on extreme scale, complexity & availability, then you will love it here.
Who are we
We are on a mission to help retail traders prosper in the stock market. In just 3 years, we have the 3rd most popular app for the stock markets in India. And we are aiming to be the de-facto trading app in the next 2 years.
We are a young, lean team of ordinary people that is building exceptional products, that solve real problems. We love to innovate, thrill customers and work with brilliant & humble humans.
Key Objectives for You
• Spearhead system & network architecture
• CI, CD & Automated Deployments
• Achieve 99.999% availability
• Ensure in-depth & real-time monitoring, alerting & analytics
• Enable faster root cause analysis with improved visibility
• Ensure a high level of security
Possible Growth Paths for You
• Be our Lead DevOps Engineer
• Be a Performance & Security Expert
Perks
• Challenges that will push you beyond your limits
• A democratic place where everyone is heard & aware
- Experience using AWS (that’s just common sense)
- Experience designing and building web environments on AWS, which includes working with services like EC2, ELB, RDS, and S3
- Experience building and maintaining cloud-native applications
- A solid background in Linux/Unix and Windows server system administration
- Experience using https://www.simplilearn.com/tutorials/devops-tutorial/devops-tools" target="_blank">DevOps tools in a cloud environment, such as Ansible, Artifactory, https://www.simplilearn.com/tutorials/docker-tutorial/what-is-docker-container" target="_blank">Docker, GitHub, https://www.simplilearn.com/tutorials/jenkins-tutorial/what-is-jenkins" target="_blank">Jenkins, https://www.simplilearn.com/tutorials/kubernetes-tutorial/what-is-kubernetes" target="_blank">Kubernetes, Maven, and Sonar Qube
- Experience installing and configuring different application servers such as JBoss, Tomcat, and WebLogic
- Experience using monitoring solutions like CloudWatch, ELK Stack, and Prometheus
- An understanding of writing Infrastructure-as-Code (IaC), using tools like CloudFormation or Terraform
- Knowledge of one or more of the most-used programming languages available for today’s cloud computing (i.e., SQL data, XML data, R math, Clojure math, Haskell functional, Erlang functional, Python procedural, and Go procedural languages)
- Experience in troubleshooting distributed systems
- Proficiency in script development and scripting languages
- The ability to be a team player
- The ability and skill to train other people in procedural and technical topics
- Strong communication and collaboration skills
As a special aside, an AWS engineer who works in DevOps should also have experience with:
- The theory, concepts, and real-world application of Continuous Delivery (CD), which requires familiarity with tools like AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline
- An understanding of automation
MLOps Engineer
Required Candidate profile :
- 3+ years’ experience in developing continuous integration and deployment (CI/CD) pipelines (e.g. Jenkins, Github Actions) and bringing ML models to CI/CD pipelines
- Candidate with strong Azure expertise
- Exposure of Productionize the models
- Candidate should have complete knowledge of Azure ecosystem, especially in the area of DE
- Candidate should have prior experience in Design, build, test, and maintain machine learning infrastructure to empower data scientists to rapidly iterate on model development
- Develop continuous integration and deployment (CI/CD) pipelines on top of Azure that includes AzureML, MLflow and Azure Devops
- Proficient knowledge of git, Docker and containers, Kubernetes
- Familiarity with Terraform
- E2E production experience with Azure ML, Azure ML Pipelines
- Experience in Azure ML extension for Azure Devops
- Worked on Model Drift (Concept Drift, Data Drift preferable on Azure ML.)
- Candidate will be part of a cross-functional team that builds and delivers production-ready data science projects. You will work with team members and stakeholders to creatively identify, design, and implement solutions that reduce operational burden, increase reliability and resiliency, ensure disaster recovery and business continuity, enable CI/CD, optimize ML and AI services, and maintain it all in infrastructure as code everything-in-version-control manner.
- Candidate with strong Azure expertise
- Candidate should have complete knowledge of Azure ecosystem, especially in the area of DE
- Candidate should have prior experience in Design, build, test, and maintain machine learning infrastructure to empower data scientists to rapidly iterate on model development
- Develop continuous integration and deployment (CI/CD) pipelines on top of Azure that includes AzureML, MLflow and Azure Devops
About Us
We have grown over 1400% in revenues in the last year.
Interface.ai provides an Intelligent Virtual Assistant (IVA) to FIs to automate calls and customer inquiries across multiple channels and engage their customers with financial insights and upsell/cross-sell.
Our IVA is transforming financial institutions’ call centers from a cost to a revenue center.
Our core technology is built 100% in-house with several breakthroughs in Natural Language Understanding. Our parser is built based on zero-shot learning that helps us to launch industry-specific IVA that can achieve over 90% accuracy on Day-1.
We are 45 people strong with employees spread across India and US locations. Many of them come from ML teams at Apple, Microsoft, and Salesforce in the US along with enterprise architects with over 20+ years of experience building large-scale systems. Our India team consists of people from ISB, IIMs, and many who have been previously part of early-stage startups.
We are a fully remote team.
Founders come from Banking and Enterprise Technology backgrounds with previous experience scaling companies from scratch to $50M+ in revenues.
As a Site Reliability Engineer you will be in charge of:
- Designing, analyzing and troubleshooting large-scale distributed systems
- Engaging in cross-functional team discussions on design, deployment, operation, and maintenance, in a fast-moving, collaborative set up
- Building automation scripts to validate the stability, scalability, and reliability of interface.ai’s products & services as well as enhance interface.ai’s employees’ productivity
- Debugging and optimizing code and automating routine tasks
- Troubleshoot and diagnose issues (hardware or software), propose and implement solutions to ensure they occur with reduced frequency
- Perform the periodic on-call duty to handle security, availability, and reliability of interface.ai’s products
- You will follow and write good code and solid engineering practices
Requirements
You can be a great fit if you are :
- Extremely self motivated
- Ability to learn quickly
- Growth Mindset (read this if you don't know what it means - https://www.amazon.com/Mindset-Psychology-Carol-S-Dweck/dp/0345472322" target="_blank">link)
- Emotional Maturity (read this if you don't know what it means - https://medium.com/@krisgage/15-signs-of-emotional-maturity-38b1a2ab9766" target="_blank">link)
- Passionate about the possibilities at the intersection of AI + Banking
- Worked in a startup of 5 to 30 employees
- Developer with a strong interest in systems Design. You will be building, maintaining, and scaling our cloud infrastructure through software tooling and automation.
- 4-8 years of industry experience developing and troubleshooting large-scale infrastructure on the cloud
- Have a solid understanding of system availability, latency, and performance
- Strong programming skills in at least one major programming language and the ability to learn new languages as needed
- Strong System/network debugging skills
- Experience with management/automation tools such as Terraform/Puppet/Chef/SALT
- Experience with setting up production-level monitoring and telemetry
- Expertise in Container management & AWS
- Experience with kubernetes is a plus
- Experience building CI/CD pipelines
- Experience working with Web sockets, Redis, Postgres, Elastic search, Logstash
- Experience working in an agile team environment and proficient understanding of code versioning tools, such as Git.
- Ability to effectively articulate technical challenges and solutions.
- Proactive outlook for ways to make our systems more reliable
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
Contract to hire
Total 8 years of experience and relevant 4 years
• Experience with building and deploying software in the cloud, preferably on Google Cloud Platform (GCP)
• Sound knowledge to build infrastructure as code with Terraform
• Comfortable with test-driven development, testing frameworks and building CI/CD pipelines with version control software Gitlab
• Strong skills of containerisation with Docker, Kubernetes and Helm
• Familiar with Gitlab, systems integration and BDD
• Solid networking skills e.g. IP, DNS, VPN, HTTP/HTTPS
• Scripting experience (Bash, Python, etc.)
• Experience in Linux/Unix administration
• Experience with agile methods and practices (Scrum, Kanban, Continuous Integration, Pair Programming, TDD)
![icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fsearch.png&w=48&q=75)
![companies logos](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fhiring_companies_logos-v2.webp&w=3840&q=80)