
Role Overview:
As a DevOps Engineer (L2), you will play a key role in designing, implementing, and optimizing infrastructure. You will take ownership of automating processes, improving system reliability, and supporting the development lifecycle.
Key Responsibilities:
- Design and manage scalable, secure, and highly available cloud infrastructure.
- Lead efforts in implementing and optimizing CI/CD pipelines.
- Automate repetitive tasks and develop robust monitoring solutions.
- Ensure the security and compliance of systems, including IAM, VPCs, and network configurations.
- Troubleshoot complex issues across development, staging, and production environments.
- Mentor and guide L1 engineers on best practices.
- Stay updated on emerging DevOps tools and technologies.
- Manage cloud resources efficiently using Infrastructure as Code (IaC) tools like Terraform and AWS CloudFormation.
Qualifications:
- Bachelor’s degree in Computer Science, IT, or a related field.
- Proven experience with CI/CD pipelines and tools like Jenkins, GitLab, or Azure DevOps.
- Advanced knowledge of cloud platforms (AWS, Azure, or GCP) with hands-on experience in deployments, migrations, and optimizations.
- Strong expertise in containerization (Docker) and orchestration tools (Kubernetes).
- Proficiency in scripting languages like Python, Bash, or PowerShell.
- Deep understanding of system security, networking, and load balancing.
- Strong analytical skills and problem-solving mindset.
- Certifications (e.g., AWS Certified Solutions Architect, Kubernetes Administrator) are a plus.
What We Offer:
- Opportunity to work with a cutting-edge tech stack in a product-first company.
- Collaborative and growth-oriented environment.
- Competitive salary and benefits.
- Freedom to innovate and contribute to impactful projects.

About Alyke
About
Similar jobs
REVIEW CRITERIA:
MANDATORY:
- Strong Hands-On AWS Cloud Engineering / DevOps Profile
- Mandatory (Experience 1): Must have 12+ years of experience in AWS Cloud Engineering / Cloud Operations / Application Support
- Mandatory (Experience 2): Must have strong hands-on experience supporting AWS production environments (EC2, VPC, IAM, S3, ALB, CloudWatch)
- Mandatory (Infrastructure as a code): Must have hands-on Infrastructure as Code experience using Terraform in production environments
- Mandatory (AWS Networking): Strong understanding of AWS networking and connectivity (VPC design, routing, NAT, load balancers, hybrid connectivity basics)
- Mandatory (Cost Optimization): Exposure to cost optimization and usage tracking in AWS environments
- Mandatory (Core Skills): Experience handling monitoring, alerts, incident management, and root cause analysis
- Mandatory (Soft Skills): Strong communication skills and stakeholder coordination skills
ROLE & RESPONSIBILITIES:
We are looking for a hands-on AWS Cloud Engineer to support day-to-day cloud operations, automation, and reliability of AWS environments. This role works closely with the Cloud Operations Lead, DevOps, Security, and Application teams to ensure stable, secure, and cost-effective cloud platforms.
KEY RESPONSIBILITIES:
- Operate and support AWS production environments across multiple accounts
- Manage infrastructure using Terraform and support CI/CD pipelines
- Support Amazon EKS clusters, upgrades, scaling, and troubleshooting
- Build and manage Docker images and push to Amazon ECR
- Monitor systems using CloudWatch and third-party tools; respond to incidents
- Support AWS networking (VPCs, NAT, Transit Gateway, VPN/DX)
- Assist with cost optimization, tagging, and governance standards
- Automate operational tasks using Python, Lambda, and Systems Manager
IDEAL CANDIDATE:
- Strong hands-on AWS experience (EC2, VPC, IAM, S3, ALB, CloudWatch)
- Experience with Terraform and Git-based workflows
- Hands-on experience with Kubernetes / EKS
- Experience with CI/CD tools (GitHub Actions, Jenkins, etc.)
- Scripting experience in Python or Bash
- Understanding of monitoring, incident management, and cloud security basics
NICE TO HAVE:
- AWS Associate-level certifications
- Experience with Karpenter, Prometheus, New Relic
- Exposure to FinOps and cost optimization practices
Key Skills Required:
· You will be part of the DevOps engineering team, configuring project environments, troubleshooting integration issues in different systems also be involved in building new features for next generation of cloud recovery services and managed services.
· You will directly guide the technical strategy for our clients and build out a new capability within the company for DevOps to improve our business relevance for customers.
· You will be coordinating with Cloud and Data team for their requirements and verify the configurations required for each production server and come with Scalable solutions.
· You will be responsible to review infrastructure and configuration of micro services and packaging and deployment of application
To be the right fit, you'll need:
· Expert in Cloud Services like AWS.
· Experience in Terraform Scripting.
· Experience in container technology like Docker and orchestration like Kubernetes.
· Good knowledge of frameworks such as Jenkins, CI/CD pipeline, Bamboo Etc.
· Experience with various version control system like GIT, build tools (Mavan, ANT, Gradle ) and cloud automation tools (Chef, Puppet, Ansible)
Please Apply - https://zrec.in/7EYKe?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adapt DevOps culture in their organization by focusing on long-term DevOps roadmap. We focus on identifying technical and cultural issues in the journey of successfully implementing the DevOps practices in the organization and work with respective teams to fix issues to increase overall productivity. We also do training sessions for the developers and make them realize the importance of DevOps. We provide these services - DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We do assessments of technology architecture, security, governance, compliance, and DevOps maturity model for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer / SRE
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediately
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems. The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability. This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP, Azure), along with scripting skills in Python or Go. The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (Datadog, New Relic, OpenTelemetry).
Strong communication, incident management skills, and a collaborative approach are essential. Experience in team leadership and multi-client engagement is a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as an SRE and DevOps with a proven track record of handling large-scale production environments
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong Hands-on experience with managing Large Scale Production Systems
- Strong Production Troubleshooting Skills and handling high-pressure situations.
- Strong Experience with Databases (PostgreSQL, MongoDB, ElasticSearch, Kafka)
- Worked on making production systems more Scalable, Highly Available and Fault-tolerant
- Hands-on experience with ELK or other logging and observability tools
- Hands-on experience with Prometheus, Grafana & Alertmanager and on-call processes like Pagerduty
- Problem-Solving Mindset
- Strong with skills - K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure etc
- Good with Python/Go Scripting Automation
- Strong with fundamentals like DNS, Networking, Linux
- Experience with APM tools like - Newrelic, Datadog, OpenTelemetry
- Good experience with Incident Response, Incident Management, Writing detailed RCAs
- Experience with Applications best practices in making apps more reliable and fault-tolerant
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
Good to have
- Team-leading Experience
- Multiple Client Handling
- Requirements gathering from clients
- Good Communication
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Will serve as a direct point of contact for multiple clients.
- Able to handle the unique technical needs and challenges of two or more clients concurrently.
- Involve both direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.

Skills required:
Strong knowledge and experience of cloud infrastructure (AWS, Azure or GCP), systems, network design, and cloud migration projects.
Strong knowledge and understanding of CI/CD processes tools (Jenkins/Azure DevOps) is a must.
Strong knowledge and understanding of Docker & Kubernetes is a must.
Strong knowledge of Python, along with one more language (Shell, Groovy, or Java).
Strong prior experience using automation tools like Ansible, Terraform.
Architect systems, infrastructure & platforms using Cloud Services.
Strong communication skills. Should have demonstrated the ability to collaborate across teams and organizations.
Benefits of working with OpsTree Solutions:
Opportunity to work on the latest cutting edge tools/technologies in DevOps
Knowledge focused work culture
Collaboration with very enthusiastic DevOps experts
High growth trajectory
Opportunity to work with big shots in the IT industry
- Automate deployments of infrastructure components and repetitive tasks.
- Drive changes strictly via the infrastructure-as-code methodology.
- Promote the use of source control for all changes including application and system-level changes.
- Design & Implement self-recovering systems after failure events.
- Participate in system sizing and capacity planning of various components.
- Create and maintain technical documents such as installation/upgrade MOPs.
- Coordinate & collaborate with internal teams to facilitate installation & upgrades of systems.
- Support 24x7 availability for corporate sites & tools.
- Participate in rotating on-call schedules.
- Actively involved in researching, evaluating & selecting new tools & technologies.
- Cloud computing – AWS, OCI, OpenStack
- Automation/Configuration management tools such as Terraform & Chef
- Atlassian tools administration (JIRA, Confluence, Bamboo, Bitbucket)
- Scripting languages - Ruby, Python, Bash
- Systems administration experience – Linux (Redhat), Mac, Windows
- SCM systems - Git
- Build tools - Maven, Gradle, Ant, Make
- Networking concepts - TCP/IP, Load balancing, Firewall
- High-Availability, Redundancy & Failover concepts
- SQL scripting & queries - DML, DDL, stored procedures
- Decisive and ability to work under pressure
- Prioritizing workload and multi-tasking ability
- Excellent written and verbal communication skills
- Database systems – Postgres, Oracle, or other RDBMS
- Mac automation tools - JAMF or other
- Atlassian Datacenter products
- Project management skills
Qualifications
- 3+ years of hands-on experience in the field or related area
- Requires MS or BS in Computer Science or equivalent field
- 3-6 years of relevant work experience in a DevOps role.
- Deep understanding of Amazon Web Services or equivalent cloud platforms.
- Proven record of infra automation and programming skills in any of these languages - Python, Ruby, Perl, Javascript.
- Implement DevOps Industry best practices and the application of procedures to achieve a continuously deployable system
- Continuously improve and increase the capabilities of the CI/CD pipeline
- Support engineering teams in the implementation of life-cycle infrastructure solutions and documentation operations in order to meet the engineering departments quality and standards
- Participate in production outages and handle complex issues and works towards resolution
Technical Experience/Knowledge Needed :
- Cloud-hosted services environment.
- Proven ability to work in a Cloud-based environment.
- Ability to manage and maintain Cloud Infrastructure on AWS
- Must have strong experience in technologies such as Dockers, Kubernetes, Functions, etc.
- Knowledge in orchestration tools Ansible
- Experience with ELK Stack
- Strong knowledge in Micro Services, Container-based architecture and the corresponding deployment tools and techniques.
- Hands-on knowledge of implementing multi-staged CI / CD with tools like Jenkins and Git.
- Sound knowledge on tools like Kibana, Kafka, Grafana, Instana and so on.
- Proficient in bash Scripting Languages.
- Must have in-depth knowledge of Clustering, Load Balancing, High Availability and Disaster Recovery, Auto Scaling, etc.
-
AWS Certified Solutions Architect or/and Linux System Administrator
- Strong ability to work independently on complex issues
- Collaborate efficiently with internal experts to resolve customer issues quickly
- No objection to working night shifts as the production support team works on 24*7 basis. Hence, rotational shifts will be assigned to the candidates weekly to get equal opportunity to work in a day and night shifts. But if you get candidates willing to work the night shift on a need basis, discuss with us.
- Early Joining
- Willingness to work in Delhi NCR
Requirements:-
- Bachelor’s Degree or Master’s in Computer Science, Engineering,Software Engineering or a relevant field.
- Strong experience with Windows/Linux-based infrastructures, Linux/Unix administration.
- knowledge of Jira, Bitbucket, Jenkins, Xray, Ansible, Windows and .Net. as their Core Skill.
- Strong experience with databases such as SQL, MS SQL, MySQL, NoSQL.
- Knowledge of scripting languages such as Shell Scripting /Python/ PHP/Groovy, Bash.
- Experience with project management and workflow tools such as Agile, Jira / WorkFront etc.
- Experience with open-source technologies and cloud services.
- Experience in working with Puppet or Chef for automation and configuration.
- Strong communication skills and ability to explain protocol and processes with team and management.
- Experience in a DevOps Engineer role (or similar role)
- AExperience in software development and infrastructure development is a plus
Job Specifications:-
- Building and maintaining tools, solutions and micro services associated with deployment and our operations platform, ensuring that all meet our customer service standards and reduce errors.
- Actively troubleshoot any issues that arise during testing and production, catching and solving issues before launch.
- Test our system integrity, implemented designs, application developments and other processes related to infrastructure, making improvements as needed
- Deploy product updates as required while implementing integrations when they arise.
- Automate our operational processes as needed, with accuracy and in compliance with our security requirements.
- Specifying, documenting and developing new product features, and writing automating scripts. Manage code deployments, fixes, updates and related processes.
- Work with open-source technologies as needed.
- Work with CI and CD tools, and source control such as GIT and SVN.
- Lead the team through development and operations.
- Offer technical support where needed, developing software for our back-end systems.
Role
We are looking for an experienced DevOps engineer that will help our team establish DevOps practice. You will work closely with the technical lead to identify and establish DevOps practices in the company.
You will also help us build scalable, efficient cloud infrastructure. You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices.
This would be a hybrid role and the person would be expected to also do some application level programming in their downtime.
Responsibilities
- Deployment, automation, management, and maintenance of production systems.
- Ensuring availability, performance, security, and scalability of production systems.
- Evaluation of new technology alternatives and vendor products.
- System troubleshooting and problem resolution across various application domains and platforms.
- Providing recommendations for architecture and process improvements.
- Definition and deployment of systems for metrics, logging, and monitoring on AWS platform.
- Manage the establishment and configuration of SaaS infrastructure in an agile way by storing infrastructure as code and employing automated configuration management tools with a goal to be able to re-provision environments at any point in time.
- Be accountable for proper backup and disaster recovery procedures.
- Drive operational cost reductions through service optimizations and demand based auto scaling.
- Have on call responsibilities.
- Perform root cause analysis for production errors
- Uses open source technologies and tools to accomplish specific use cases encountered within the project.
- Uses coding languages or scripting methodologies to solve a problem with a custom workflow.
Requirements
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Prior experience as a software developer in a couple of high level programming languages.
- Extensive experience in any Javascript based framework since we will be deploying services to NodeJS on AWS Lambda (Serverless)
- Extensive experience with web servers such as Nginx/Apache
- Strong Linux system administration background.
- Ability to present and communicate the architecture in a visual form.
- Strong knowledge of AWS (e.g. IAM, EC2, VPC, ELB, ALB, Autoscaling, Lambda, NAT gateway, DynamoDB)
- Experience maintaining and deploying highly-available, fault-tolerant systems at scale (~ 1 Lakh users a day)
- A drive towards automating repetitive tasks (e.g. scripting via Bash, Python, Ruby, etc)
- Expertise with Git
- Experience implementing CI/CD (e.g. Jenkins, TravisCI)
- Strong experience with databases such as MySQL, NoSQL, Elasticsearch, Redis and/or Mongo.
- Stellar troubleshooting skills with the ability to spot issues before they become problems.
- Current with industry trends, IT ops and industry best practices, and able to identify the ones we should implement.
- Time and project management skills, with the capability to prioritize and multitask as needed.








