
General Description:Owns all technical aspects of software development for assigned applications
Participates in the design and development of systems & application programs
Functions as Senior member of an agile team and helps drive consistent development practices – tools, common components, and documentation
Required Skills:
In depth experience configuring and administering EKS clusters in AWS.
In depth experience in configuring **Splunk SaaS** in AWS environments especially in **EKS**
In depth understanding of OpenTelemetry and configuration of **OpenTelemetry Collectors**
In depth knowledge of observability concepts and strong troubleshooting experience.
Experience in implementing comprehensive monitoring and logging solutions in AWS using **CloudWatch**.
Experience in **Terraform** and Infrastructure as code.
Experience in **Helm**Strong scripting skills in Shell and/or python.
Experience with large-scale distributed systems and architecture knowledge (Linux/UNIX and Windows operating systems, networking, storage) in a cloud computing or traditional IT infrastructure environment.
Must have a good understanding of cloud concepts (Storage /compute/network).
Experience collaborating with several cross functional teams to architect observability pipelines for various GCP services like GKE, cloud run Big Query etc.
Experience with Git and GitHub.Experience with code build and deployment using GitHub actions, and Artifact Registry.
Proficient in developing and maintaining technical documentation, ADRs, and runbooks.

Similar jobs
We’re hiring a DevOps Engineer who’s passionate about automation, reliability, and scaling infrastructure for modern cloud-native applications. If you thrive in dynamic environments and love problem-solving at scale, we’d love to meet you!
🔧 Key Responsibilities
- Manage and support production systems with on-call rotations
- Deploy and maintain scalable infrastructure on AWS (ECS, EC2, EKS, S3, RDS, ELB, IAM, Lambda)
- Build infrastructure using Terraform
- Manage and monitor Kubernetes clusters and Docker containers
- Automate deployment and configuration using Ansible or similar tools
- Ensure systems reliability using robust monitoring and alerting tools
- Work with Linux OS, and network protocols like HTTP, DNS, SMTP, LDAP
- Manage services like Nginx, HAProxy, MySQL, SSH
- Collaborate with development, QA, and product teams
- Document systems and infrastructure best practices
✅ Required Skills
- 4+ years in DevOps, SRE, or Systems Administration
- Hands-on experience with AWS, Kubernetes, Docker
- Proficient with Terraform, Ansible, and Linux systems
- Strong understanding of networking, system logs, and debugging
- Excellent communication and documentation skills
Devops CICD
∙Managing large-scale AWS deployments using Infrastructure as Code (IaC) and k8s developer tools
∙Managing build/test/deployment of very large-scale systems, bridging between developers and live stacks
∙Actively troubleshoot issues that arise during development and production
∙Owning, learning, and deploying SW in support of customer-facing applications
∙Help establish DevOps best practices
∙Actively work to reduce system costs
∙Work with open-source technologies, helping to ensure robustness and secureness of said technologies
∙Actively work with CI/CD, GIT and other component parts of the build and deployment system
∙Leading skills with AWS cloud stack
∙Proven implementation experience with Infrastructure as Code (Terraform, Terragrunt, Flux, Helm charts)
at scale
∙Proven experience with Kubernetes at scale
∙Proven experience with cloud management tools beyond AWS console (k9s, lens)
∙Strong communicator who people want to work with – must be thought of as the ultimate collaborator
∙Solid team player
∙Strong experience with Linux-based infrastructures and AWS
∙Strong experience with databases such as MySQL, Redshift, Elasticsearch, Mongo, and others
∙Strong knowledge of JavaScript, GIT
∙Agile practitioner
Company Overview
Adia Health revolutionizes clinical decision support by enhancing diagnostic accuracy and personalizing care. It modernizes the diagnostic process by automating optimal lab test selection and interpretation, utilizing a combination of expert medical insights, real-world data, and artificial intelligence. This approach not only streamlines the diagnostic journey but also ensures precise, individualized patient care by integrating comprehensive medical histories and collective platform knowledge.
Position Overview
We are seeking a talented and experienced Site Reliability Engineer/DevOps Engineer to join our dynamic team. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our infrastructure and applications. You will collaborate closely with development, operations, and product teams to automate processes, implement best practices, and improve system reliability.
Key Responsibilities
- Design, implement, and maintain highly available and scalable infrastructure solutions using modern DevOps practices.
- Automate deployment, monitoring, and maintenance processes to streamline operations and increase efficiency.
- Monitor system performance and troubleshoot issues, ensuring timely resolution to minimize downtime and impact on users.
- Implement and manage CI/CD pipelines to automate software delivery and ensure code quality.
- Manage and configure cloud-based infrastructure services to optimize performance and cost.
- Collaborate with development teams to design and implement scalable, reliable, and secure applications.
- Implement and maintain monitoring, logging, and alerting solutions to proactively identify and address potential issues.
- Conduct periodic security assessments and implement appropriate measures to ensure the integrity and security of systems and data.
- Continuously evaluate and implement new tools and technologies to improve efficiency, reliability, and scalability.
- Participate in on-call rotation and respond to incidents promptly to ensure system uptime and availability.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or related field
- Proven experience (5+ years) as a Site Reliability Engineer, DevOps Engineer, or similar role
- Strong understanding of cloud computing principles and experience with AWS
- Experience of building and supporting complex CI/CD pipelines using Github
- Experience of building and supporting infrastructure as a code using Terraform
- Proficiency in scripting and automating tools
- Solid understanding of networking concepts and protocols
- Understanding of security best practices and experience implementing security controls in cloud environments
- Knowing modern security requirements like SOC2, HIPAA, HITRUST will be a solid advantage.
Required qualifications and must have skills
-
5+ years of experience managing a team of 5+ infrastructure software engineers
-
5+ years of experience in building and scaling technical infrastructure
-
5+ years of experience in delivering software
-
Experience leading by influence in multi-team, cross-functional projects
-
Demonstrated experience recruiting and managing technical teams, including performance management and managing engineers
-
Experience with cloud service providers such as AWS, GCP, or Azure
-
Experience with containerization technologies such as Kubernetes and Docker
Nice to have Skills
-
Experience with Hadoop, Hive and Presto
-
Application/infrastructure benchmarking and optimization
-
Familiarity with modern CI/CD practices
-
Familiarity with reliability best practices
CI/CD tools Jenkins/Bamboo/Teamcity/CircleCI, DevSecOps Pipeline, Cloud Services (AWS/Azure/GCP), Ansible, Terraform, Docker, Helm, Cloud formation template, Webserver deployment & config, Databases(SQL/NoSQL) deployment & config, Git, Artifactory, Monitoring tools (Nagios, Grafana, Prometheus etc), Application logs (ELK/EFK, Splunk etc.), API Gateways, Security tools, Vault. |
- Understanding customer requirements and project KPIs
- Implementing various development, testing, automation tools, and IT infrastructure
- Planning the team structure, activities, and involvement in project management activities.
- Managing stakeholders and external interfaces
- Setting up tools and required infrastructure
- Defining and setting development, test, release, update, and support processes for https://www.simplilearn.com/top-benefits-of-learning-devops-article" target="_blank">DevOps operation
- Have the technical skill to review, verify, and validate the software code developed in the project.
- Troubleshooting techniques and fixing the code bugs
- Monitoring the processes during the entire lifecycle for its adherence and updating or creating new processes for improvement and minimizing the wastage
- Encouraging and building automated processes wherever possible
- Identifying and deploying cybersecurity measures by continuously performing vulnerability assessment and risk management
- Incidence management and root cause analysis
- Coordination and communication within the team and with customers
- Selecting and deploying appropriate https://www.simplilearn.com/best-ci-cd-tools-article" target="_blank">CI/CD tools
- Strive for continuous improvement and build continuous integration, continuous development, and constant deployment pipeline (https://www.simplilearn.com/open-source-pipeline-tools-for-devops-article" target="_blank">CI/CD Pipeline)
- Mentoring and guiding the team members
- Monitoring and measuring customer experience and KPIs
- Managing periodic reporting on the progress to the management and the customer
Why you should join us
- You will join the mission to create positive impact on millions of peoples lives
- You get to work on the latest technologies in a culture which encourages experimentation - You get to work with super humans (Psst: Look up these super human1, super human2, super human3, super human4)
- You get to work in an accelerated learning environment
What you will do
- You will provide deep technical expertise to your team in building future ready systems.
- You will help develop a robust roadmap for ensuring operational excellence
- You will setup infrastructure on AWS that will be represented as code
- You will work on several automation projects that provide great developer experience
- You will setup secure, fault tolerant, reliable and performant systems
- You will establish clean and optimised coding standards for your team that are well documented
- You will set up systems in a way that are easy to maintain and provide a great developer experience
- You will actively mentor and participate in knowledge sharing forums
- You will work in an exciting startup environment where you can be ambitious and try new things :)
You should apply if
- You have a strong foundation in Computer Science concepts and programming fundamentals
- You have been working on cloud infrastructure setup, especially on AWS since 8+ years
- You have set up and maintained reliable systems that operate at high scale
- You have experience in hardening and securing cloud infrastructures
- You have a solid understanding of computer networking, network security and CDNs
- Extensive experience in AWS, Kubernetes and optionally Terraform
- Experience in building automation tools for code build and deployment (preferably in JS)
- You understand the hustle of a startup and are good with handling ambiguity
- You are curious, a quick learner and someone who loves to experiment
- You insist on highest standards of quality, maintainability and performance
- You work well in a team to enhance your impact
Intuitive is the fastest growing top-tier Cloud Solutions and Services company supporting Global Enterprise Customer across Americas, Europe and Middle East. This is an excellent opportunity to join ITP’s global world class technology teams, working with some of the best and brightest engineers while also developing your skills and furthering your career working with some of the largest customers.
Job Description:
Must-Have’s:
- Proficient in Java, Node or Python
- Experience with NewRelic, Splunk, SignalFx, DataDog etc.
- Monitoring and alerting experience
- Full stack development experience
- Hands-on with building and deploying micro services in Cloud (AWS/Azure)
- Experience with terraform w.r.t Infrastructure As Code
- Should have experience troubleshooting live production systems using monitoring/log analytics tools
- Should have experience leading a team (2 or more engineers)
- Experienced using Jenkins or similar deployment pipeline tools
- Understanding of distributed architectures
KaiOS is a mobile operating system for smart feature phones that stormed the scene to become the 3rd largest mobile OS globally (2nd in India ahead of iOS). We are on 100M+ devices in 100+ countries. We recently closed a Series B round with Cathay Innovation, Google and TCL.
What we are looking for:
- BE/B-Tech in Computer Science or related discipline
- 3+ years of overall commercial DevOps / infrastructure experience, cross-functional delivery-focused agile environment; previous experience as a software engineer and ability to see into and reason about the internals of an application a major plus
- Extensive experience designing, delivering and operating highly scalable resilient mission-critical backend services serving millions of users; experience operating and evolving multi-regional services a major plus
- Strong engineering skills; proficiency in programming languages
- Experience in Infrastructure-as-code tools, for example Terraform, CloudFormation
- Proven ability to execute on a technical and product roadmap
- Extensive experience with iterative development and delivery
- Extensive experience with observability for operating robust distributed systems – instrumentation, log shipping, alerting, etc.
- Extensive experience with test and deployment automation, CI / CD
- Extensive experience with infrastructure-as-code and tooling, such as Terraform
- Extensive experience with AWS, including services such as ECS, Lambda, DynamoDB, Kinesis, EMR, Redshift, Elasticsearch, RDS, CloudWatch, CloudFront, IAM, etc.
- Strong knowledge of backend and web technology; infrastructure experience with at-scale modern data technology a major plus
- Deep expertise in server-side security concerns and best practices
- Fluency with at least one programming language. We mainly use Golang, Python and JavaScript so far
- Outstanding analytical thinking, problem solving skills and attention to detail
- Excellent verbal and written communication skills
Requirements
Designation: DevOps Engineer
Location: Bangalore OR Hongkong
Experience: 4 to 7 Years
Notice period: 30 days or Less

