
You will be responsible for
1. Setting up, maintaining cloud (AWS/GCP/Azure) and kubernetes cluster and automating
their operation
2. All operational aspects of devtron platform including maintenance, upgrades,
automation.
3. Providing kubernetes expertise to facilitate smooth and fast customer onboarding on
devtron platform
Responsibilities:
1. Manage devtron platform on multiple kubernetes clusters
2. Designing and embedding industry best practices for online services including disaster
recovery, business continuity, monitoring/alerting, and service health measurement
3. Providing operational support for day to day activities involving the deployment of
services
4. Identify opportunities for improving the security, reliability, and scalability of the platform
5. Facilitate smooth and fast customer onboarding on devtron platform
6. Drive customer engagement
Requirements:
● Bachelor's Degree in Computer Science or a related field.
● 2+ years working as a devops engineer
● Proficient in 1 or more programming languages (e.g. Python, Go, Ruby).
● Familiar with shell scripts, Linux commands, network fundamentals
● Understanding of large scale distributed systems
● Basic understanding of cloud computing (AWS/GCP/Azure)
Preferred Qualifications:
● Great analytical and interpersonal skills
● Passion for creating efficient, reliable, reusable programs/scripts.
● Excited about technology, have a strong interest in learning about and playing with the
latest technologies and doing POC.
● Strong customer focus, ownership, urgency and drive.
● Knowledge and experience with cloud native tools like prometheus, kubernetes, docker,
grafana.

About Devtron Inc.
About
Devtron is open-source DevOps platform that is specifically designed for Kubernetes. The platform offers a range of features including CI, CD, security, debugging, and cost optimization, all accessible through an intuitive user interface. With Devtron, customers can easily debug applications, monitor events, and check configurations, all from a single screen without the need to switch to cloud watch. The platform also provides metrics that measure deployment frequencies, as well as a single-pane view for application debugging, which helps to increase the stability of applications.
Devtron is founded in 2019 by Nishant Kumar, Prashant Ghildiyal, and Rajesh Razdan, Devtron is headquartered in Del Mar, California.
Similar jobs
Job Title: DevOps Engineer
Location: Remote
Type: Full-time
About Us:
At Tese, we are committed to advancing sustainability through innovative technology solutions. Our platform empowers SMEs, financial institutions, and enterprises to achieve their Environmental, Social, and Governance (ESG) goals. We are looking for a skilled and passionate DevOps Engineer to join our team and help us build and maintain scalable, reliable, and efficient infrastructure.
Role Overview:
As a DevOps Engineer, you will be responsible for designing, implementing, and managing the infrastructure that supports our applications and services. You will work closely with our development, QA, and data science teams to ensure smooth deployment, continuous integration, and continuous delivery of our products. Your role will be critical in automating processes, enhancing system performance, and maintaining high availability.
Key Responsibilities:
- Infrastructure Management:
- Design, implement, and maintain scalable cloud infrastructure on platforms such as AWS, Google Cloud, or Azure.
- Manage server environments, including provisioning, monitoring, and maintenance.
- CI/CD Pipeline Development:
- Develop and maintain continuous integration and continuous deployment pipelines using tools like Jenkins, GitLab CI/CD, or CircleCI.
- Automate deployment processes to ensure quick and reliable releases.
- Configuration Management and Automation:
- Implement infrastructure as code (IaC) using tools like Terraform, Ansible, or CloudFormation.
- Automate system configurations and deployments to improve efficiency and reduce manual errors.
- Monitoring and Logging:
- Set up and manage monitoring tools (e.g., Prometheus, Grafana, ELK Stack) to track system performance and troubleshoot issues.
- Implement logging solutions to ensure effective incident response and system analysis.
- Security and Compliance:
- Ensure systems are secure and compliant with industry standards and regulations.
- Implement security best practices, including identity and access management, network security, and vulnerability assessments.
- Collaboration and Support:
- Work closely with development and QA teams to support application deployments and troubleshoot issues.
- Provide support for infrastructure-related inquiries and incidents.
Qualifications:
- Education:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- Experience:
- 3-5 years of experience in DevOps, system administration, or related roles.
- Hands-on experience with cloud platforms such as AWS, Google Cloud Platform, or Azure.
- Technical Skills:
- Proficiency in scripting languages like Bash, Python, or Ruby.
- Strong experience with containerization technologies like Docker and orchestration tools like Kubernetes.
- Knowledge of configuration management tools (Ansible, Puppet, Chef).
- Experience with CI/CD tools (Jenkins, GitLab CI/CD, CircleCI).
- Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK Stack).
- Understanding of networking concepts and security best practices.
- Soft Skills:
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration abilities.
- Ability to work in a fast-paced environment and manage multiple tasks.
Preferred Qualifications:
- Experience with infrastructure as code (IaC) tools like Terraform or CloudFormation.
- Knowledge of microservices architecture and serverless computing.
- Familiarity with database administration (SQL and NoSQL databases).
- Experience with Agile methodologies and working in a Scrum or Kanban environment.
- Passion for sustainability and interest in ESG initiatives.
Benefits:
- Competitive salary and benefits package,and performance bonuses.
- Flexible working hours and remote work options.
- Opportunity to work on impactful projects that promote sustainability.
- Professional development opportunities, including access to training and conferences.
We are seeking a skilled and proactive Kubernetes Administrator with strong hands-on experience in managing Red Hat OpenShift environments. The ideal candidate will have a solid background in Kubernetes administration, ArgoCD, and Jenkins.
This role demands a self-motivated, quick learner who can confidently manage OpenShift-based infrastructure in production environments, communicate effectively with stakeholders, and escalate issues promptly when needed.
Key Skills & Qualifications
- Strong experience with Red Hat OpenShift and Kubernetes administration (OpenShift or Kubernetes certification a plus).
- Proven expertise in managing containerized workloads on OpenShift platforms.
- Experience with ArgoCD, GitLab CI/CD, and Helm for deployment automation.
- Ability to troubleshoot issues in high-pressure production environments.
- Strong communication and customer-facing skills.
- Quick learner with a positive attitude toward problem-solving.
DevOps Lead Engineer
We are seeking a skilled DevOps Lead Engineer with 8 to 10 yrs. of experience who handles the entire DevOps lifecycle and is accountable for the implementation of the process. A DevOps Lead Engineer is liable for automating all the manual tasks for developing and deploying code and data to implement continuous deployment and continuous integration frameworks. They are also held responsible for maintaining high availability of production and non-production work environments.
Essential Requirements (must have):
• Bachelor's degree preferable in Engineering.
• Solid 5+ experience with AWS, DevOps, and related technologies
Skills Required:
Cloud Performance Engineering
• Performance scaling in a Micro-Services environment
• Horizontal scaling architecture
• Containerization (such as Dockers) & Deployment
• Container Orchestration (such as Kubernetes) & Scaling
DevOps Automation
• End to end release automation.
• Solid Experience in DevOps tools like GIT, Jenkins, Docker, Kubernetes, Terraform, Ansible, CFN etc.
• Solid experience in Infra Automation (Infrastructure as Code), Deployment, and Implementation.
• Candidates must possess experience in using Linux, Jenkins, and ample experience in Configuring and automating the monitoring tools.
• Strong scripting knowledge
• Strong analytical and problem-solving skills.
• Cloud and On-prem deployments
Infrastructure Design & Provisioning
• Infra provisioning.
• Infrastructure Sizing
• Infra Cost Optimization
• Infra security
• Infra monitoring & site reliability.
Job Responsibilities:
• Responsible for creating software deployment strategies that are essential for the successful
deployment of software in the work environment and provide stable environment for delivery of
quality.
• The DevOps Lead Engineer is accountable for designing, building, configuring, and optimizing
automation systems that help to execute business web and data infrastructure platforms.
• The DevOps Lead Engineer is involved in creating technology infrastructure, automation tools,
and maintaining configuration management.
• The Lead DevOps Engineer oversees and leads the activities of the DevOps team. They are
accountable for conducting training sessions for the juniors in the team, mentoring, career
support. They are also answerable for the architecture and technical leadership of the complete
DevOps infrastructure.
Requirements
Core skills:
● Strong background in Linux / Unix Administration and
troubleshooting
● Experience with AWS (ideally including some of the following:
VPC, Lambda, EC2, Elastic Cache, Route53, SNS, Cloudwatch,
Cloudfront, Redshift, Open search, ELK etc.)
● Experience with Infra Automation and Orchestration tools
including Terraform, Packer, Helm, Ansible.
● Hands on Experience on container technologies like Docker,
Kubernetes/EKS, Gitlab and Jenkins as Pipeline.
● Experience in one or more of Groovy, Perl, Python, Go or
scripting experience in Shell.
● Good understanding of with Continuous Integration(CI) and
Continuous Deployment(CD) pipelines using tools like Jenkins,
FlexCD, ArgoCD, Spinnaker etc
● Working knowledge of key value stores, database technologies
(SQL and NoSQL), Mongo, mySQL
● Experience with application monitoring tools like Prometheus,
Grafana, APM tools like NewRelic, Datadog, Pinpoint
● Good exposure on middleware components like ELK, Redis, Kafka
and IOT based systems including Redis, NewRelic, Akamai,
Apache / Nginx, ELK, Grafana, Prometheus etc
Good to have:
● Prior experience in Logistics, Payment and IOT based applications
● Experience in unmanaged mongoDB cluster, automations &
operations, analytics
● Write procedures for backup and disaster recovery
Core Experience
● 3-5 years of hands-on DevOps experience
● 2+ years of hands-on Kubernetes experience
● 3+ years of Cloud Platform experience with special focus on
Lambda, R53, SNS, Cloudfront, Cloudwatch, Elastic Beanstalk,
RDS, Open Search, EC2, Security tools
● 2+ years of scripting experience in Python/Go, shell
● 2+ years of familiarity with CI/CD, Git, IaC, Monitoring, and
Logging tools
Job description
The ideal candidate is a self-motivated, multi-tasker, and demonstrated team player. You will be a lead developer responsible for the development of new software security policies and enhancements to security on existing products. You should excel in working with large-scale applications and frameworks and have outstanding communication and leadership skills.
Responsibilities
- Consulting with management on the operational requirements of software solutions.
- Contributing expertise on information system options, risk, and operational impact.
- Mentoring junior software developers in gaining experience and assuming DevOps responsibilities.
- Managing the installation and configuration of solutions.
- Collaborating with developers on software requirements, as well as interpreting test stage data.
- Developing interface simulators and designing automated module deployments.
- Completing code and script updates, as well as resolving product implementation errors.
- Overseeing routine maintenance procedures and performing diagnostic tests.
- Documenting processes and monitoring performance metrics.
- Conforming to best practices in network administration and cybersecurity.
Qualifications
- Minimum of 2 years of hands-on experience in software development and DevOps, specifically managing AWS Infrastructure such as EC2s, RDS, Elastic cache, S3, IAM, cloud trail and other services provided by AWS.
- Experience Building a multi-region highly available auto-scaling infrastructure that optimises performance and cost. plan for future infrastructure as well as Maintain & optimise existing infrastructure.
- Conceptualise, architect and build automated deployment pipelines in a CI/CD environment like Jenkins.
- Conceptualise, architect and build a containerised infrastructure using Docker, Mesosphere or similar SaaS platforms.
- Conceptualise, architect and build a secured network utilising VPCs with inputs from the security team.
- Work with developers & QA to institute a policy of Continuous Integration with Automated testing Architect, build and manage dashboards to provide visibility into delivery, production application functional and performance status.
- Work with developers to institute systems, policies and workflows which allow for rollback of deployments Triage release of applications to production environment on a daily basis.
- Interface with developers and triage SQL queries that need to be executed in production environments.
- Assist the developers and on calls for other teams with post mortem, follow up and review of issues affecting production availability.
- Minimum 2 years’ experience in Ansible.
- Must have written playbook to automate provisioning of AWS infrastructure as well as automation of routine maintenance tasks.
- Must have had prior experience automating deployments to production and lower environments.
- Experience with APM tools like New Relic and log management tools.
- Our entire platform is hosted on AWS, comprising of web applications, webservices, RDS, Redis and Elastic Search clusters and several other AWS resources like EC2, S3, Cloud front, Route53 and SNS.
- Essential Functions System Architecture Process Design and Implementation
- Minimum of 2 years scripting experience in Ruby/Python (Preferable) and Shell Web Application Deployment Systems Continuous Integration tools (Ansible)Establishing and enforcing Network Security Policy (AWS VPC, Security Group) & ACLs.
- Establishing and enforcing systems monitoring tools and standards
- Establishing and enforcing Risk Assessment policies and standards
- Establishing and enforcing Escalation policies and standards
- Work towards improving the following 4 verticals - scalability, availability, security, and cost, for company's workflows and products.
- Help in provisioning, managing, optimizing cloud infrastructure in AWS (IAM, EC2, RDS, CloudFront, S3, ECS, Lambda, ELK etc.)
- Work with the development teams to design scalable, robust systems using cloud architecture for both 0-to-1 and 1-to-100 products.
- Drive technical initiatives and architectural service improvements.
- Be able to predict problems and implement solutions that detect and prevent outages.
- Mentor/manage a team of engineers.
- Design solutions with failure scenarios in mind to ensure reliability.
- Document rigorously to keep track of all changes/upgrades to the infrastructure and as well share knowledge with the rest of the team
- Identify vulnerabilities during development with actionable information to empower developers to remediate vulnerabilities
- Automate the build and testing processes to consistently integrate code
- Manage changes to documents, software, images, large web sites, and other collections of code, configuration, and metadata among disparate teams
Our client is a call management solutions company, which helps small to mid-sized businesses use its virtual call center to manage customer calls and queries. It is an AI and cloud-based call operating facility that is affordable as well as feature-optimized. The advanced features offered like call recording, IVR, toll-free numbers, call tracking, etc are based on automation and enhances the call handling quality and process, for each client as per their requirements. They service over 6,000 business clients including large accounts like Flipkart and Uber.
- Being involved in Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups, and Monitoring.
- Ensuring reliable operation of CI/ CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creating Docker files
- Creating Bash/ Python scripts for automation.
- Performing root cause analysis for production errors.
What you need to have:
- Proficient in Linux Commands line and troubleshooting.
- Proficient in AWS Services. Deployment, Monitoring and troubleshooting applications in AWS.
- Hands-on experience with CI tooling preferably with Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as cloud) such as terraform, AWS cloudformation etc.
- Proficient in deployment of applications behind load balancers and proxy servers such as nginx, apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with Logging, Monitoring, and Alerting tools like ELK(Elastic-search, Logstash, Kibana), Nagios. Graylog, splunk Prometheus, Grafana is a plus.
As an Infrastructure Engineer at Navi, you will be building a resilient infrastructure platform, using modern Infrastructure engineering practices.
You will be responsible for the availability, scaling, security, performance and monitoring of the navi Cloud platform. You’ll be joining a team that follows best practices in infrastructure as code
Your Key Responsibilities
- Build out the Infrastructure components like API Gateway, Service Mesh, Service Discovery, container orchestration platform like kubernetes.
- Developing reusable Infrastructure code and testing frameworks
- Build meaningful abstractions to hide the complexities of provisioning modern infrastructure components
- Design a scalable Centralized Logging and Metrics platform
- Drive solutions to reduce Mean Time To Recovery(MTTR), enable High Availability.
What to Bring
- Good to have experience in managing large scale cloud infrastructure, preferable AWS and Kubernetes
- Experience in developing applications using programming languages like Java, Python and Go
- Experience in handling logs and metrics at a high scale.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
Improvement of monitoring systems
- Improvement of monitoring systems
- Creation and support of development automation processes (CI / CD)
- Participation in infrastructure development
- Detection of problems in architecture and proposing of solutions for solving them
- Creation of tasks for system improvements for system scalability, performance and monitoring
- Analysis of product requirements in the aspect of devops
- Incident analysis and fixing
Skills and Experience
- Understanding of the distributed systems principles
- Understanding of principles for building a resistant network infrastructure
- Experience of Ubuntu Linux administration (Debian-like will be a plus)
- Strong knowledge of Bash
- Experience of working with LXC-containers
- Understanding and experience with infrastructure as a code approach
- Experience of development idempotent Ansible roles
- Experience of working with git
Preferred experience
Experience with relational databases (PostgeSQL), ability to create simple SQL queries
- Experience with monitoring and metric collect systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
- Knowledge and experience of working with network equipment Cisco
- Experience of working with Cisco NX-OS
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge of principles of multicast protocols IGMP, PIM
- Experience of setting multicast on Cisco equipment
- Experience administering Atlassian products
A strong background in Azure OR Amazon Web Services (AWS) or a similar cloud platform is a must-have, certification is a plus.
Excellent technical skills and knowledge include but not limited to: cloud methodologies like PaaS and SaaS; programming languages such as Python, Java, .Net; orchestration systems such as Chef, Ansible, Terraform; Azure IaaS servers; PowerShell scripting.
Fully aware of the DevOps cycle with hands-on on deployment models to the cloud.
Experience working with Docker and related containerization technologies.
Experience working on orchestration platforms such as AKS, RedHat OpenShift etc.
Extensive knowledge working with logging and monitoring tools such as EFK, visualization tools such as Grafana and Prometheus.
Exposure to security alert monitoring tools.
Experience in building both microservices and public facing API's.- Experience in working with any of the API gateways.









