
- 3+ years experience leading a team of DevOps engineers
- 8+ years experience managing DevOps for large engineering teams developing cloud-native software
- Strong in networking concepts
- In-depth knowledge of AWS and cloud architectures/services.
- Experience within the container and container orchestration space (Docker, Kubernetes)
- Passion for CI/CD pipelines using tools such as Jenkins
- Familiarity with config management tools like Ansible, Terraform, etc.
- Proven record of measuring and improving DevOps metrics
- Familiarity with observability tools and experience setting them up
- Passion for building tools and productizing services that empower development teams.
- Excellent knowledge of Linux command-line tools and ability to write bash scripts.
- Strong in Unix/Linux administration and management
KEY ROLES/RESPONSIBILITIES:
- Own and manage the entire cloud infrastructure
- Create the entire CI/CD pipeline to build and release
- Explore new technologies and tools and recommend those that best fit the team and organization
- Own and manage the site reliability
- Strong decision-making skills and metric-driven approach
- Mentor and coach other team members

staging, QA, and development of cloud infrastructures running in 24×7 environments.
● Most of our deployments are in K8s; you will work with the team to run and manage multiple K8s
environments 24/7.
● Implement and oversee all aspects of the cloud environment including provisioning, scale,
monitoring, and security.
● Nurture cloud computing expertise internally and externally to drive cloud adoption.
● Implement systems, solutions, and processes needed to manage cloud cost, monitoring, scalability,
and redundancy.
● Ensure all cloud solutions adhere to security and compliance best practices.
● Collaborate with Enterprise Architecture, Data Platform, DevOps, and Integration Teams to ensure
cloud adoption follows standard best practices.
Responsibilities :
● Bachelor’s degree in Computer Science, Computer Engineering or Information Technology or
equivalent experience.
● Experience with Kubernetes on cloud and deployment technologies such as Helm is a major plus
● Expert-level, hands-on experience with AWS (Azure and GCP experience are a big plus)
● 10 or more years of experience.
● Minimum of 5 years’ experience building and supporting cloud solutions
What we look for:
As a DevOps Developer, you will contribute to a thriving and growing AIGovernance Engineering team. You will work in a Kubernetes-based microservices environment to support our bleeding-edge cloud services. This will include custom solutions, as well as open source DevOps tools (build and deploy automation, monitoring and data gathering for our software delivery pipeline). You will also be contributing to our continuous improvement and continuous delivery while increasing maturity of DevOps and agile adoption practices.
Responsibilities:
- Ability to deploy software using orchestrators, scripts, and automation on hybrid and public clouds such as AWS
- Ability to write shell, Python, or other Unix scripts
- Working knowledge of Docker and Kubernetes
- Ability to create pipelines using Jenkins or any other CI/CD tool, and a GitOps tool like ArgoCD
- Working knowledge of Git as a source control system, and of a defect tracking system
- Ability to debug and troubleshoot deployment issues
- Ability to use tools for faster resolution of issues
- Excellent communication and soft skills
- Passionate, with the ability to work and deliver in a multi-team environment
- Good team player
- Flexible and quick learner
- Ability to write Dockerfiles, Kubernetes YAML files, and Helm charts
- Experience with monitoring tools like Nagios and Prometheus, and visualisation tools such as Grafana.
- Ability to write Ansible and Terraform scripts
- Linux system administration experience
- Effective cross-functional leadership skills: working with engineering and operational teams to ensure systems are secure, scalable, and reliable.
- Ability to review deployment and operational environments, i.e., execute initiatives to reduce failure, troubleshoot issues across the entire infrastructure stack, expand monitoring capabilities, and manage technical operations.
We are looking for an experienced DevOps engineer who will help our team establish DevOps
practices. You will work closely with the technical lead to identify and establish DevOps practices in the company. You will also help us build scalable, efficient cloud infrastructure. You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices. This would be a hybrid role and the person would be expected to also do some application-level programming in their downtime.
Responsibilities
- Deployment, automation, management, and maintenance of production systems.
- Ensuring availability, performance, security, and scalability of production systems.
- Evaluation of new technology alternatives and vendor products.
- System troubleshooting and problem resolution across various application domains and
platforms.
- Providing recommendations for architecture and process improvements.
- Definition and deployment of systems for metrics, logging, and monitoring on AWS
platform.
- Manage the establishment and configuration of SaaS infrastructure in an agile way
by storing infrastructure as code and employing automated configuration
management tools with a goal to be able to re-provision environments at any point in
time.
- Be accountable for proper backup and disaster recovery procedures.
- Drive operational cost reductions through service optimizations and demand based
auto scaling.
- Have on call responsibilities.
- Perform root cause analysis for production errors
- Use open source technologies and tools to accomplish specific use cases encountered
within the project.
- Use coding languages or scripting methodologies to solve a problem with a custom
workflow.
Requirements
- Systematic problem-solving approach, coupled with strong communication skills and a
sense of ownership and drive.
- Prior experience as a software developer in a couple of high level programming
languages.
- Extensive experience in any JavaScript-based framework, since we will be deploying
services to Node.js on AWS Lambda (Serverless)
- Extensive experience with web servers such as Nginx/Apache
- Strong Linux system administration background.
- Ability to present and communicate the architecture in a visual form.
- Strong knowledge of AWS (e.g. IAM, EC2, VPC, ELB, ALB, Autoscaling, Lambda, NAT
gateway, DynamoDB)
- Experience maintaining and deploying highly-available, fault-tolerant systems at scale
(~1 lakh, i.e. 100,000, users a day)
- A drive towards automating repetitive tasks (e.g. scripting via Bash, Python, Ruby, etc)
- Expertise with Git
- Experience implementing CI/CD (e.g. Jenkins, TravisCI)
- Strong experience with databases such as MySQL, NoSQL, Elasticsearch, Redis and/or
Mongo.
- Stellar troubleshooting skills with the ability to spot issues before they become problems.
- Current with industry trends, IT ops and industry best practices, and able to identify the
ones we should implement.
- Time and project management skills, with the capability to prioritize and multitask as
needed.
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for applications health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CI/CD
- Observability (Prometheus/New Relic/ELK)
- Cloud Platforms (AWS/Azure)
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services and infrastructure, and working with stakeholders to fix them.
● Managing Outages and doing detailed RCA with developers and identifying ways to avoid that situation.
● Managing/Automating upgrades of the infrastructure services.
● Automate toil work.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and good understanding of the various Linux kernel subsystems (memory, storage, network etc).
● Extensive experience in DNS, TCP/IP, UDP, gRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as Code tools such as Terraform, and Configuration Management Tools such as Chef, Puppet, SaltStack, Ansible.
● Expertise in Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo etc.
● Experience in managing and deploying containerized environments using Docker,
Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike,
Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platforms tech stacks like Hadoop, Hive, Presto etc is a plus
Job Description:
○ Develop best practices for the team, and take responsibility for architecture solutions and documentation operations in order to meet the engineering department's quality standards
○ Participate in production outages, handle complex issues, and work towards resolution
○ Develop custom tools and integrations with existing tools to increase engineering productivity
Required Experience and Expertise
○ Good knowledge of Terraform, including experience working on large Terraform (TF) code bases.
○ Deep understanding of Terraform best practices and of writing TF modules.
○ Hands-on experience with GCP and AWS, and knowledge of AWS services such as VPC and VPC-related services (route tables, VPC endpoints, PrivateLink), EKS, S3, and IAM. Cost-aware mindset towards cloud services.
○ Deep understanding of Kernel, Networking and OS fundamentals
NOTICE PERIOD - Max - 30 days
DevOps Engineer
Company Introduction
CometChat (https://www.cometchat.com/) harnesses the power of chat by helping thousands of businesses around the world create customized in-app messaging experiences. Our products allow developers to seamlessly add voice, video and text chat to their websites and mobile apps so that their users can communicate with each other, resulting in a unified customer experience, increased engagement and retention, and revenue growth.
In 2019, CometChat was selected into the exclusive Techstars Boulder Accelerator. CometChat (Industry CPaaS: communication-platform-as-a-service) has also been listed among the top 10 best SaaS companies by G2 Crowd. With solid financials, strong organic growth and increasing interest in developer tool-focused companies (from the market and with top technical talent), we’re heading into an exciting period of growth and acceleration. CometChat (https://www.crunchbase.com/organization/cometchat) is backed by seasoned investors such as iSeed Ventures, Range Ventures, Silicon Badia, eonCapital and Matchstick Ventures.
A global business from the start, we have 60+ team members across our Denver and Mumbai offices serving over 50,000 customers around the world. We’ve had an exciting journey so far, and we know this is just the beginning!
CometChat’s Mission
Enable meaningful connections between real people in an increasingly digital world.
CometChat’s Products
CometChat offers a robust suite of cloud hosted text, voice and video options that meet businesses where they are–whether they need drag and drop plugins that can be ready within 30 minutes or if they want more advanced features and can invest development resources to launch the experience that will best serve their users.
● Quickly build a reliable & full featured chat experience into any mobile or web app
● Fully customizable SDKs and API designed to help companies ship faster
At every step, CometChat helps customers solve complex infrastructure, performance and security challenges, regardless of the platform. But there is so much more! With over 20 ready to use extensions, customers can build an experience and get the data, analysis and insights they need to drive their business forward.
CometChat’s solutions are perfect for every kind of chat including:
● Social community – Allowing people in online communities to interact without moving the conversation to another platform
● Marketplace – Enabling communications between buyers and sellers
● Events – Bringing thousands of users together to interact without diminishing the quality of the experience
● Telemedicine – Making connections between patients and providers more accessible
● Dating – Keeping people engaged while they connect with one another
● And more!
CometChat is committed to fostering a culture of innovation & collaboration. Our people are our strength so we respect and nurture their individual talent and potential. Join us if you are looking to be a part of a high growth team!
Position Overview & Priorities:
The DevOps Engineer will be responsible for effective provisioning, installation/configuration, operation, and maintenance of systems and software using Infrastructure as Code. This can include the provision of cloud instances, streamlining deployments, configuring virtual instances, scaling out DB servers.
Primary responsibility would be:
- Oversight of all server environments, from Dev through Production.
- Work on an infrastructure that is 100% on AWS.
- Work on CI/CD tooling which is used to build and deploy code to our cloud.
- Assist with day-to-day issue management.
- Work on internal tooling which simplifies workflows.
- Research, design and implement solutions for fault tolerance, monitoring, performance enhancement, capacity optimization, and configuration management of systems and applications.
Work Location:
We operate on a Hybrid model – you choose where you work from! Remotely or from our offices. Currently, our talent is spread across 14 different cities globally.
Prioritized Experiences and Capabilities:
- 2-4 years of experience working as a DevOps Engineer/currently practicing DevOps methodology
- Experience in AWS Infrastructure
- Hands-on experience with Infrastructure as Code (CloudFormation / Terraform, Puppet / Chef / Ansible)
- Strong background in Linux/Unix Administration
- DevOps automation with CI/CD, a pipeline that enforces proper versioning and branching practices
- Experience in Docker and Kubernetes.
Job Location: Jaipur
Experience Required: Minimum 3 years
About the role:
As a DevOps Engineer for Punchh, you will be working with our developers, SRE, and DevOps teams implementing our next generation infrastructure. We are looking for a self-motivated, responsible team player who loves designing systems that scale. Punchh provides a rich engineering environment where you can be creative, learn new technologies, solve engineering problems, all while delivering business objectives. The DevOps culture here is one with immense trust and responsibility. You will be given the opportunity to make an impact as there are no silos here.
Responsibilities:
- Deliver SLA and business objectives through whole-lifecycle design of services, from inception to implementation.
- Ensuring availability, performance, security, and scalability of AWS production systems
- Scale our systems and services through continuous integration, infrastructure as code, and gradual refactoring in an agile environment.
- Maintain services once a project is live by monitoring and measuring availability, latency, and overall system and application health.
- Write and maintain software that runs the infrastructure that powers the Loyalty and Data platform for some of the world’s largest brands.
- Be on call 24x7, in shifts, for Level 2 and higher escalations
- Respond to incidents and write blameless RCAs/postmortems
- Implement and practice proper security controls and processes
- Providing recommendations for architecture and process improvements.
- Definition and deployment of systems for metrics, logging, and monitoring on the platform.
Must have:
- Minimum 3 Years of Experience in DevOps.
- BS degree in Computer Science, Mathematics, Engineering, or equivalent practical experience.
- Strong inter-personal skills.
- Must have experience in CI/CD tooling such as Jenkins, CircleCI, TravisCI
- Must have experience in Docker, Kubernetes, Amazon ECS or Mesos
- Experience in code development in at least one high-level programming language from this list: Python, Ruby, Go, Groovy
- Proficient in shell scripting, and most importantly, know when to stop scripting and start developing.
- Experience in creation of highly automated infrastructures with any Configuration Management tools like: Terraform, CloudFormation or Ansible.
- In-depth knowledge of the Linux operating system and administration.
- Production experience with a major cloud provider such as AWS.
- Knowledge of web server technologies such as Nginx or Apache.
- Knowledge of Redis, Memcached, or one of the many in-memory data stores.
- Experience with various load balancing technologies such as Amazon ALB/ELB, HAProxy, F5.
- Comfortable with large-scale, highly-available distributed systems.
Good to have:
- Understanding of Web Standards (REST, SOAP APIs, OWASP, HTTP, TLS)
- Production experience with Hashicorp products such as Vault or Consul
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
- Experience in a PCI environment
- Experience with Big Data distributions from Cloudera, MapR, or Hortonworks
- Experience maintaining and scaling database applications
- Knowledge of fundamental systems engineering principles such as CAP Theorem, Concurrency Control, etc.
- Understanding of network fundamentals: OSI, TCP/IP, topologies, etc.
- Understanding of infrastructure auditing, and the ability to help the organization control infrastructure costs.
- Experience in Kafka, RabbitMQ or any messaging bus.
We are looking for an experienced software engineer with a strong background in DevOps and handling traffic & infrastructure at scale.
Responsibilities :
Work closely with product engineers to implement scalable and highly reliable systems.
Scale existing backend systems to handle ever-increasing amounts of traffic and new product requirements.
Collaborate with other developers to understand and set up the tooling needed for Continuous Integration/Delivery/
Build & operate infrastructure to support website, backend cluster, ML projects in the organization.
Monitor and track performance and reliability of our services and software to meet promised SLA
2+ years of experience working on distributed systems and shipping high-quality product features on schedule
Intimate knowledge of the whole web stack (Front end, APIs, database, networks etc.)
Ability to build highly scalable, robust, and fault-tolerant services and stay up-to-date with the latest architectural trends
Experience with container based deployment, microservices, in-memory caches, relational databases, key-value stores
Hands-on experience with cloud infrastructure provisioning, deployment, monitoring (we are on AWS and use ECS, RDS, ELB, EC2, Elasticache, Elasticsearch, S3, CloudWatch)
Mandatory Skills Sets
- Excellent problem-solving skills in technical challenges
- Deep knowledge of at least one cloud platform (AWS Preferred)
- Understanding of the latest cloud computing technologies
- Experience in architecting solutions based on knowledge of infrastructure & application architectures including the integration approaches
- Fully hands-on, with the ability to grasp evolving technologies and coding languages
- Excellent communication skills, as the role is customer-facing
- Design thinking
- Customer-facing skills and strong technical capabilities to review the team's work as well as guide the team
- Experience working/building/contributing to proposals for architecture, estimations
Preferred Skills Sets
- Experience architecting infrastructure solutions using both Linux/Unix and Windows with specific recommendations on server, load balancing, HA/DR, & storage architectures.
- Experience architecting or deploying Cloud/Virtualization solutions for enterprise customers.
- Must have performed an Application Architect role for 3+ years
- AWS platform-specific experience is a bonus.
- Enterprise application and database architecture experience is a bonus.

Radical is a platform connecting data, medicine and people -- through machine learning, and usable, performant products. Software has never been the strong suit of the medical industry -- and we are changing that. We believe that the same sophistication and performance that powers our daily needs through millions of consumer applications -- be it your grocery, your food delivery or your movie tickets -- when applied to healthcare, has a massive potential to transform the industry, and positively impact lives of patients and doctors. Radical works with some of the largest hospitals and public health programmes in India, and has a growing footprint both inside the country and abroad.
As a DevOps Engineer at Radical, you will:
Work closely with all stakeholders in the healthcare ecosystem - patients, doctors, paramedics and administrators - to conceptualise and bring to life the ideal set of products that add value to their time
Work alongside Software Developers and ML Engineers to solve problems and assist in architecture design
Work on systems which have an extraordinary emphasis on capturing data that can help build better workflows, algorithms and tools
Work on high performance systems that deal with several million transactions, multi-modal data and large datasets, with a close attention to detail
We’re looking for someone who has:
Familiarity and experience with writing working, well-documented and well-tested scripts, Dockerfiles, Puppet/Ansible/Chef/Terraform scripts.
Proficiency with scripting languages like Python and Bash.
Knowledge of systems deployment and maintenance, including setting up CI/CD and working alongside Software Developers, monitoring logs, dashboards, etc.
Experience integrating with a wide variety of external tools and services
Experience navigating AWS and leveraging appropriate services and technologies rather than DIY solutions (such as hosting an application directly on EC2 vs containerisation or Elastic Beanstalk)
It’s not essential, but great if you have:
An established track record of deploying and maintaining systems.
Experience with microservices and decomposition of monolithic architectures
Proficiency in automated tests.
Proficiency with the Linux ecosystem
Experience in deploying systems to production on cloud platforms such as AWS
The position is open now, and we are onboarding immediately.
Please write to us with an updated resume, and one thing you would like us to see as part of your application. This one thing can be anything that you think makes you stand apart among candidates.
Radical is based out of Delhi NCR, India, and we look forward to working with you!
We're looking for people who may not know all the answers, but are obsessive about finding them, and take pride in the code that they write. We are more interested in the ability to learn fast and think rigorously, and in people who aren’t afraid to challenge assumptions and take large bets -- only to work hard and prove themselves correct. You're encouraged to apply even if your experience doesn't precisely match the job description. Join us.

