
General Description:Owns all technical aspects of software development for assigned applications
Participates in the design and development of systems & application programs
Functions as Senior member of an agile team and helps drive consistent development practices – tools, common components, and documentation
Required Skills:
In depth experience configuring and administering EKS clusters in AWS.
In depth experience in configuring **Splunk SaaS** in AWS environments especially in **EKS**
In depth understanding of OpenTelemetry and configuration of **OpenTelemetry Collectors**
In depth knowledge of observability concepts and strong troubleshooting experience.
Experience in implementing comprehensive monitoring and logging solutions in AWS using **CloudWatch**.
Experience in **Terraform** and Infrastructure as code.
Experience in **Helm**Strong scripting skills in Shell and/or python.
Experience with large-scale distributed systems and architecture knowledge (Linux/UNIX and Windows operating systems, networking, storage) in a cloud computing or traditional IT infrastructure environment.
Must have a good understanding of cloud concepts (Storage /compute/network).
Experience collaborating with several cross functional teams to architect observability pipelines for various GCP services like GKE, cloud run Big Query etc.
Experience with Git and GitHub.Experience with code build and deployment using GitHub actions, and Artifact Registry.
Proficient in developing and maintaining technical documentation, ADRs, and runbooks.

Similar jobs
Job Title : Senior DevOps Engineer
Location : Remote
Experience Level : 5+ Years
Role Overview :
We are a funded AI startup seeking a Senior DevOps Engineer to design, implement, and maintain a secure, scalable, and efficient infrastructure. In this role, you will focus on automating operations, optimizing deployment processes, and enabling engineering teams to deliver high-quality products seamlessly.
Key Responsibilities:
Infrastructure Scalability & Reliability :
- Architect and manage cloud infrastructure on AWS, GCP, or Azure for high availability, reliability, and cost-efficiency.
- Implement container orchestration using Kubernetes or Docker Compose.
- Utilize Infrastructure as Code (IaC) tools like Pulumi or Terraform to manage and configure infrastructure.
Deployment Automation :
- Design and maintain CI/CD pipelines using GitHub Actions, Jenkins, or similar tools.
- Implement deployment strategies such as canary or blue-green deployments, and create rollback mechanisms to ensure seamless updates.
Monitoring & Observability :
- Leverage tools like OpenTelemetry, Grafana, and Datadog to monitor system health and performance.
- Establish centralized logging systems and create real-time dashboards for actionable insights.
Security & Compliance :
- Securely manage secrets using tools like HashiCorp Vault or Doppler.
- Conduct static code analysis with tools such as SonarQube or Snyk to ensure compliance with security standards.
Collaboration & Team Enablement :
- Mentor and guide team members on DevOps best practices and workflows.
- Document infrastructure setups, incident runbooks, and troubleshooting workflows to enhance team efficiency.
Required Skills :
- Expertise in managing cloud platforms like AWS, GCP, or Azure.
- In-depth knowledge of Kubernetes, Docker, and IaC tools like Terraform or Pulumi.
- Advanced scripting capabilities in Python or Bash.
- Proficiency in CI/CD tools such as GitHub Actions, Jenkins, or similar.
- Experience with observability tools like Grafana, OpenTelemetry, and Datadog.
- Strong troubleshooting skills for debugging production systems and optimizing performance.
Preferred Qualifications :
- Experience in scaling AI or ML-based applications.
- Familiarity with distributed systems and microservices architecture.
- Understanding of agile methodologies and DevSecOps practices.
- Certifications in AWS, Azure, or Kubernetes.
What We Offer :
- Opportunity to work in a fast-paced AI startup environment.
- Flexible remote work culture.
- Competitive salary and equity options.
- Professional growth through challenging projects and learning opportunities.
About the Job
This is a full-time role for a Lead DevOps Engineer at Spark Eighteen. We are seeking an experienced DevOps professional to lead our infrastructure strategy, design resilient systems, and drive continuous improvement in our deployment processes. In this role, you will architect scalable solutions, mentor junior engineers, and ensure the highest standards of reliability and security across our cloud infrastructure. The job location is flexible with preference for the Delhi NCR region.
Responsibilities
- Lead and mentor the DevOps/SRE team
- Define and drive DevOps strategy and roadmaps
- Oversee infrastructure automation and CI/CD at scale
- Collaborate with architects, developers, and QA teams to integrate DevOps practices
- Ensure security, compliance, and high availability of platforms
- Own incident response, postmortems, and root cause analysis
- Budgeting, team hiring, and performance evaluation
Requirements
Technical Skills
- Bachelor's or Master's degree in Computer Science, Engineering, or related field.
- 7+ years of professional DevOps experience with demonstrated progression.
- Strong architecture and leadership background
- Deep hands-on knowledge of infrastructure as code, CI/CD, and cloud
- Proven experience with monitoring, security, and governance
- Effective stakeholder and project management
- Experience with tools like Jenkins, ArgoCD, Terraform, Vault, ELK, etc.
- Strong understanding of business continuity and disaster recovery
Soft Skills
- Cross-functional communication excellence with ability to lead technical discussions.
- Strong mentorship capabilities for junior and mid-level team members.
- Advanced strategic thinking and ability to propose innovative solutions.
- Excellent knowledge transfer skills through documentation and training.
- Ability to understand and align technical solutions with broader business strategy.
- Proactive problem-solving approach with focus on continuous improvement.
- Strong leadership skills in guiding team performance and technical direction.
- Effective collaboration across development, QA, and business teams.
- Ability to make complex technical decisions with minimal supervision.
- Strategic approach to risk management and mitigation.
What We Offer
- Professional Growth: Continuous learning opportunities through diverse projects and mentorship from experienced leaders
- Global Exposure: Work with clients from 20+ countries, gaining insights into different markets and business cultures
- Impactful Work: Contribute to projects that make a real difference, with solutions generating over $1B in revenue
- Work-Life Balance: Flexible arrangements that respect personal wellbeing while fostering productivity
- Career Advancement: Clear progression pathways as you develop skills within our growing organization
- Competitive Compensation: Attractive salary packages that recognize your contributions and expertise
Our Culture
At Spark Eighteen, our culture centers on innovation, excellence, and growth. We believe in:
- Quality-First: Delivering excellence rather than just quick solutions
- True Partnership: Building relationships based on trust and mutual respect
- Communication: Prioritizing clear, effective communication across teams
- Innovation: Encouraging curiosity and creative approaches to problem-solving
- Continuous Learning: Supporting professional development at all levels
- Collaboration: Combining diverse perspectives to achieve shared goals
- Impact: Measuring success by the value we create for clients and users
Apply Here - https://tinyurl.com/t6x23p9b
Required qualifications and must have skills
-
5+ years of experience managing a team of 5+ infrastructure software engineers
-
5+ years of experience in building and scaling technical infrastructure
-
5+ years of experience in delivering software
-
Experience leading by influence in multi-team, cross-functional projects
-
Demonstrated experience recruiting and managing technical teams, including performance management and managing engineers
-
Experience with cloud service providers such as AWS, GCP, or Azure
-
Experience with containerization technologies such as Kubernetes and Docker
Nice to have Skills
-
Experience with Hadoop, Hive and Presto
-
Application/infrastructure benchmarking and optimization
-
Familiarity with modern CI/CD practices
-
Familiarity with reliability best practices
About BootLabs
https://www.google.com/url?q=https://www.bootlabs.in/&sa=D&source=calendar&ust=1667803146567128&usg=AOvVaw1r5g0R_vYM07k6qpoNvvh6" target="_blank">https://www.bootlabs.in/
-We are a Boutique Tech Consulting partner, specializing in Cloud Native Solutions.
-We are obsessed with anything “CLOUD”. Our goal is to seamlessly automate the development lifecycle, and modernize infrastructure and its associated applications.
-With a product mindset, we enable start-ups and enterprises on the cloud
transformation, cloud migration, end-to-end automation and managed cloud services.
-We are eager to research, discover, automate, adapt, empower and deliver quality solutions on time.
-We are passionate about customer success. With the right blend of experience and exuberant youth in our in-house team, we have significantly impacted customers.
Technical Skills:
• Expertise in any one hyper scaler (AWS/AZURE/GCP), including basic services like networking,
data and workload management.
- AWS
Networking: VPC, VPC Peering, Transit Gateway, Route Tables, Security Groups, etc.
Data: RDS, DynamoDB, Elastic Search
Workload: EC2, EKS, Lambda, etc.
- Azure
Data: Azure MySQL, Azure MSSQL, etc.
Workload: AKS, Virtual Machines, Azure Functions
- GCP
Data: Cloud Storage, DataFlow, Cloud SQL, Firestore, BigTable, BigQuery
Workload: GKE, Instances, App Engine, Batch, etc.
• Experience in any one of the CI/CD tools (Gitlab/Github/Jenkins) including runner setup,
templating and configuration.
• Kubernetes experience or Ansible Experience (EKS/AKS/GKE), basics like pod, deployment,
networking, service mesh. Used any package manager like helm.
• Scripting experience (Bash/python), automation in pipelines when required, system service.
• Infrastructure automation (Terraform/pulumi/cloud formation), write modules, setup pipeline and version the code.
Optional:
• Experience in any programming language is not required but is appreciated.
• Good experience in GIT, SVN or any other code management tool is required.
• DevSecops tools like (Qualys/SonarQube/BlackDuck) for security scanning of artifacts, infrastructure and code.
• Observability tools (Opensource: Prometheus, Elasticsearch, Open Telemetry; Paid: Datadog,
24/7, etc)
Role : SRE
Experience : 4 - 8 Years
- Experience in building, deploying and operating cloud solutions on Kubernetes
- Strong expertise administrating and scaling Kubernetes on bare metal and CKA preferred
- Expertise on K8s Interfaces CNI, CSI, CRI and Service meshe
- Hands-on experience as a DevOps or Automation development
- Demonstrable knowledge of TCP/IP, Linux operating system internals, filesystems, disk/storage technologies and storage protocols.
- Experience working with Helm Charts and building out Infrastructure As Code (IaC)
- Experience in writing software to automate orchestration tasks at scale; we commonly use Python, Go, and Shell scripting
- Knowledge of systems (Linux, GNU tooling), networking (OSI model, DNS, routing) and virtualization vs containerization
- Expertise in CI/CD tooling for cloud-based applications specifically Terraform / CloudFormation, Jenkins and Git
- Architected CNF Orchestration with Kubernetes
- Strong understanding of the principles of 12-factor apps and modern containerized microservices
- Plan for reliability by designing systems to work across our multi-region and multi-cloud environments
- Experience developing and using Application & Integration stacks/tools such as Kafka, Spring Cloud, Apache Camel, Kubernetes, Docker, Redis, Knative, and NoSQL
- Develop and Maintain IAC using Terraform and Ansible
- Draft design documents that translate requirements into code.
- Deal with challenges associated with scale.
- Assume responsibilities from technical design through technical client support.
- Manage expectations with internal stakeholders and context-switch in a fast paced environment.
- Thrive in an environment that uses Elasticsearch extensively.
- Keep abreast of technology and contribute to the engineering strategy.
- Champion best development practices and provide mentorship
An AWS Certified Engineer with strong skills in
- Terraform o Ansible
- *nix and shell scripting
- Elasticsearch
- Circle CI
- CloudFormation
- Python
- Packer
- Docker
- Prometheus and Grafana
- Challenges of scale
- Production support
- Sharp analytical and problem-solving skills.
- Strong sense of ownership.
- Demonstrable desire to learn and grow.
- Excellent written and oral communication skills.
- Mature collaboration and mentoring abilities.
We are front runners of the technological revolution with an inexhaustible passion for technology! DevOn is the technical organization that originated from Prowareness. We are the company at the forefront of leading DevOps transformations and setting up High Performance Distributed DevOps teams with leading companies worldwide. DevOn helps market leaders to take the next step in software delivery. We consist of a dynamic team, in which personal growth is central!
About You
You have 6+ years of experience in AWS infra Automation. This is a fantastic opportunity to work in a fast-paced operations environment and to develop your career in Cloud technologies, particularly Amazon Web Services.
You are building and monitoring CI/CD pipeline in AWS cloud. This is a highly scalable backend application building on Java platform. We need a resource who can troubleshoot, diagnose and rectify system service issues.
You’re cloud native with Terraform as an orchestration. You would use Terraform as a key Orchestration in Infrastructure as Code.
You're comfortable driving. You prefer to own your work streams and enjoy working in autonomy to progress towards your goals.
You provide an incredible support to the team. You sweat the small stuff but keep the big picture in mind. You know that a pair programming can give better result
An ideal candidate is/are:
This is a key role within our DevOps team and will involve working as part of a collaborative agile team in a shared services DevOps organization to support and deliver innovative technology solutions that directly align with the delivery of business value and enhanced customer experience. The primary objective is to provide support to Amazon Web Services hosted environment, ensure continuous availability, working closely with development teams to ensure best value for money, and effective estate management.
- Setup CI/CD Pipeline from scratch along with integration of appropriate quality gates.
- Expertise level knowledge in AWS cloud. Provision and configure infrastructure as code using Terraform
- Build and configure Kubernetes-based infrastructure, networking policies, LBs, and cluster security. Define autoscaling and cost strategies.
- Automate the build of containerized systems with CI/CD tooling, Helm charts, and more
- Manage deployments and rollbacks of applications
- Implement monitoring and metrics with Cloud watch, Newrelic
- Troubleshoot and optimize containerized workload deployments for clients
- Automate operational tasks, and assist in the transition to service ownership models.
- JD: • 10+ years of overall industry experience
• 5+ years of cloud experience
• 2+ years of architect experience
• Varied background preferred between systems and development
o Experience working with applications, not pure infra experience
• Azure experience – strong background using Azure for application migrations
• Terraform experience – should mention automation technologies in job experience
• Hands on experience delivering in the cloud
• Must have job experience designing solutions for customers
• IaaS Cloud architect
workload migrations to AWS and/or Azure
• Security architecture considerations experience
• CI/CD experience
• Proven applications migration track of record.

2. Has done Infrastructure coding using Cloudformation/Terraform and Configuration also understands it very clearly
3. Deep understanding of the microservice design and aware of centralized Caching(Redis),centralized configuration(Consul/Zookeeper)
4. Hands-on experience of working on containers and its orchestration using Kubernetes
5. Hands-on experience of Linux and Windows Operating System
6. Worked on NoSQL Databases like Cassandra, Aerospike, Mongo or
Couchbase, Central Logging, monitoring and Caching using stacks like ELK(Elastic) on the cloud, Prometheus, etc.
7. Has good knowledge of Network Security, Security Architecture and Secured SDLC practices










