
Please Apply - https://zrec.in/7EYKe?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adapt DevOps culture in their organization by focusing on long-term DevOps roadmap. We focus on identifying technical and cultural issues in the journey of successfully implementing the DevOps practices in the organization and work with respective teams to fix issues to increase overall productivity. We also do training sessions for the developers and make them realize the importance of DevOps. We provide these services - DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We do assessments of technology architecture, security, governance, compliance, and DevOps maturity model for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer / SRE
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediately
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems. The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability. This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP, Azure), along with scripting skills in Python or Go. The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (Datadog, New Relic, OpenTelemetry).
Strong communication, incident management skills, and a collaborative approach are essential. Experience in team leadership and multi-client engagement is a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as an SRE and DevOps with a proven track record of handling large-scale production environments
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong Hands-on experience with managing Large Scale Production Systems
- Strong Production Troubleshooting Skills and handling high-pressure situations.
- Strong Experience with Databases (PostgreSQL, MongoDB, ElasticSearch, Kafka)
- Worked on making production systems more Scalable, Highly Available and Fault-tolerant
- Hands-on experience with ELK or other logging and observability tools
- Hands-on experience with Prometheus, Grafana & Alertmanager and on-call processes like Pagerduty
- Problem-Solving Mindset
- Strong with skills - K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure etc
- Good with Python/Go Scripting Automation
- Strong with fundamentals like DNS, Networking, Linux
- Experience with APM tools like - Newrelic, Datadog, OpenTelemetry
- Good experience with Incident Response, Incident Management, Writing detailed RCAs
- Experience with Applications best practices in making apps more reliable and fault-tolerant
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
Good to have
- Team-leading Experience
- Multiple Client Handling
- Requirements gathering from clients
- Good Communication
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Will serve as a direct point of contact for multiple clients.
- Able to handle the unique technical needs and challenges of two or more clients concurrently.
- Involve both direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.

About Infra360 Solutions Pvt Ltd
About
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for Cloud, Infrastructure, DevOps, MLOps and Security. We partner with clients to modernize and optimize their cloud, ensuring resilience, scalability, cost efficiency and innovation.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Candid answers by the company
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
Similar jobs
We are seeking an experienced Lead DevOps Engineer with deep expertise in Kubernetes infrastructure design and implementation. This role requires someone who can architect, build, and manage enterprise-grade Kubernetes clusters from the ground up. You’ll lead modernization initiatives, shape infrastructure strategy, and work with cutting-edge cloud-native technologies.
🚀 Key Responsibilities
Infrastructure Design & Implementation
- Architect and design enterprise-grade Kubernetes clusters across AWS, Azure, and GCP.
- Build production-ready Kubernetes infrastructure with HA, scalability, and security best practices.
- Implement Infrastructure as Code with Terraform, Helm, and GitOps workflows.
- Set up monitoring, logging, and observability for Kubernetes workloads.
- Design and execute backup and disaster recovery strategies for containerized applications.
Leadership & Team Management
- Lead a team of 3–4 DevOps engineers, providing technical mentorship.
- Drive best practices in containerization, orchestration, and cloud-native development.
- Collaborate with development teams to optimize deployment strategies.
- Conduct code reviews and maintain infrastructure quality standards.
- Build knowledge-sharing culture with documentation and training.
Operational Excellence
- Manage and scale CI/CD pipelines integrated with Kubernetes.
- Implement security policies (RBAC, network policies, container scanning).
- Optimize cluster performance and cost-efficiency.
- Automate operations to minimize manual interventions.
- Ensure 99.9% uptime for production workloads.
Strategic Planning
- Define the infrastructure roadmap aligned with business needs.
- Evaluate and adopt new cloud-native technologies.
- Perform capacity planning and cloud cost optimization.
- Drive risk assessment and mitigation strategies.
🛠 Must-Have Technical Skills
Kubernetes Expertise
- 6+ years of hands-on Kubernetes experience in production.
- Deep knowledge of Kubernetes architecture (etcd, API server, scheduler, kubelet).
- Advanced Kubernetes networking (CNI, Ingress, Service mesh).
- Strong grasp of Kubernetes storage (CSI, PVs, StorageClasses).
- Experience with Operators and Custom Resource Definitions (CRDs).
Infrastructure as Code
- Terraform (advanced proficiency).
- Helm (developing and managing complex charts).
- Config management tools (Ansible, Chef, Puppet).
- GitOps workflows (ArgoCD, Flux).
Cloud Platforms
- Hands-on experience with at least 2 of the following:
- AWS: EKS, EC2, VPC, IAM, CloudFormation
- Azure: AKS, VNets, ARM templates
- GCP: GKE, Compute Engine, Deployment Manager
CI/CD & DevOps Tools
- Jenkins, GitLab CI, GitHub Actions, Azure DevOps
- Docker (advanced optimization and security practices)
- Container registries (ECR, ACR, GCR, Docker Hub)
- Strong Git workflows and branching strategies
Monitoring & Observability
- Prometheus & Grafana (metrics and dashboards)
- ELK/EFK stack (centralized logging)
- Jaeger/Zipkin (tracing)
- AlertManager (intelligent alerting)
💡 Good-to-Have Skills
- Service Mesh (Istio, Linkerd, Consul)
- Serverless (Knative, OpenFaaS, AWS Lambda)
- Running databases in Kubernetes (Postgres, MongoDB operators)
- ML pipelines (Kubeflow, MLflow)
- Security tools (Aqua, Twistlock, Falco, OPA)
- Compliance (SOC2, PCI-DSS, GDPR)
- Python/Go for automation
- Advanced Shell scripting (Bash/PowerShell)
🎓 Qualifications
- Bachelor’s in Computer Science, Engineering, or related field.
- Certifications (preferred):
- Certified Kubernetes Administrator (CKA)
- Certified Kubernetes Application Developer (CKAD)
- Cloud provider certifications (AWS/Azure/GCP).
Experience
- 6–7 years of DevOps/Infrastructure engineering.
- 4+ years of Kubernetes in production.
- 2+ years in a lead role managing teams.
- Experience with large-scale distributed systems and microservices.
We are looking for very hands-on DevOps engineers with 3 to 6 years of experience. The person will be part of a team that is responsible for designing & implementing automation from scratch for medium to large scale cloud infrastructure and providing 24x7 services to our North American / European customers. This also includes ensuring ~100% uptime for almost 50+ internal sites. The person is expected to deliver with both high speed and high quality as well as work for 40 hours per week (~6.5 hours per day, 6 days per week) in shifts that will rotate every month.
This person MUST have:
-
B.E Computer Science or equivalent
-
2+ Years of hands-on experience troubleshooting/setting up of the Linux environment, who can write shell scripts for any given requirement.
-
1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
-
1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
-
1+ Years of hands-on experience setting up CICD from SCRATCH in Jenkins & Gitlab.
-
Experience configuring/maintaining one monitoring tool.
-
Excellent verbal & written communication skills.
-
Candidates with certifications - AWS, GCP, CKA, etc will be preferred
-
Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
-
Min 3 years of experience as DevOps automation engineer buildingg, running, and maintaining production sites.
-
Not looking for candidates who have experience only as L1/L2 or Build & Deploy.
Location: Remotely, anywhere in India.
Timings:
-
The person is expected to deliver with both high speed and high quality as well as work for 40 hours per week (~6.5 hours per day, 6 days per week) in shifts that will rotate every month.
Position:
-
Full time/Direct
-
We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
-
We dont believe in locking in people with large notice periods. You will stay here because you love the company. We have only a 15 days notice period.
Bito is a startup that is using AI (ChatGPT, OpenAI, etc) to create game-changing productivity experiences for software developers in their IDE and CLI. Already, over 100,000 developers are using Bito to increase their productivity by 31% and performing more than 1 million AI requests per week.
Our founders have previously started, built, and taken a company public (NASDAQ: PUBM), worth well over $1B. We are looking to take our learnings, learn a lot along with you, and do something more exciting this time. This journey will be incredibly rewarding, and is incredibly difficult!
We are building this company with a fully remote approach, with our main teams for time zone management in the US and in India. The founders happen to be in Silicon Valley and India.
We are hiring a DevOps Engineer to join our team.
Responsibilities:
- Collaborate with the development team to design, develop, and implement Java-based applications
- Perform analysis and provide recommendations for Cloud deployments and identify opportunities for efficiency and cost reduction
- Build and maintain clusters for various technologies such as Aerospike, Elasticsearch, RDS, Hadoop, etc
- Develop and maintain continuous integration (CI) and continuous delivery (CD) frameworks
- Provide architectural design and practical guidance to software development teams to improve resilience, efficiency, performance, and costs
- Evaluate and define/modify configuration management strategies and processes using Ansible
- Collaborate with DevOps engineers to coordinate work efforts and enhance team efficiency
- Take on leadership responsibilities to influence the direction, schedule, and prioritization of the automation effort
Requirements:
- Minimum 4+ years of relevant work experience in a DevOps role
- At least 3+ years of experience in designing and implementing infrastructure as code within the AWS/GCP/Azure ecosystem
- Expert knowledge of any cloud core services, big data managed services, Ansible, Docker, Terraform/CloudFormation, Amazon ECS/Kubernetes, Jenkins, and Nginx
- Expert proficiency in at least two scripting/programming languages such as Bash, Perl, Python, Go, Ruby, etc.
- Mastery in configuration automation tool sets such as Ansible, Chef, etc
- Proficiency with Jira, Confluence, and Git toolset
- Experience with automation tools for monitoring and alerts such as Nagios, Grafana, Graphite, Cloudwatch, New Relic, etc
- Proven ability to manage and prioritize multiple diverse projects simultaneously
What do we offer:
At Bito, we strive to create a supportive and rewarding work environment that enables our employees to thrive. Join a dynamic team at the forefront of generative AI technology.
· Work from anywhere
· Flexible work timings
· Competitive compensation, including stock options
· A chance to work in the exciting generative AI space
· Quarterly team offsite events
We are hiring DevOps Engineers for luxury-commerce platform that is well-funded and is now ready for its next level of growth. It is backed by reputed investors and is already a leader in its space. The focus for the coming years will be heavily on scaling the platform through technology. Market-driven competitive salary or the right candidate
Job Title : DevOps System Engineer
Responsibilities:
- Implementing, maintaining, monitoring, and supporting the IT infrastructure
- Writing scripts for service quality analysis, monitoring, and operation
- Designing procedures for system troubleshooting and maintenance
- Investigating and resolving technical issues by deploying updates/fixes
- Implementing automation tools and frameworks for automatic code deployment (CI/CD)
- Quality control and management of the codebase
- Ownership of infrastructure and deployments in various environments
Requirements:
- Degree in Computer Science, Engineering or a related field
- Prior experience as a DevOps engineer
- Good knowledge of various operating systems - Linux, Windows, Mac.
- Good Knowledge of Networking, virtualization, Containerization technologies.
- Familiarity with software release management and deployment (Git, CI/CD)
- Familiarity with one or more popular cloud platforms such as AWS, Azure, etc.
- Solid understanding of DevOps principles and practices
- Knowledge of systems and platforms security
- Good problem-solving skills and attention to detail
Skills: Linux, Networking, Docker, Kubernetes, AWS/Azure, Git/GitHub, Jenkins, Selenium, Puppet/Chef/Ansible, Nagios
Experience : 5+ years
Location: Prabhadevi, Mumbai
Interested candidates can apply with their updated profiles.
Regards,
HR Team
Aza Fashions
We are looking for a Senior Platform Engineer responsible for handling our GCP/AWS clouds. The
candidate will be responsible for automating the deployment of cloud infrastructure and services to
support application development and hosting (architecting, engineering, deploying, and operationally
managing the underlying logical and physical cloud computing infrastructure).
Location: Bangalore
Reporting Manager: VP, Engineering
Job Description:
● Collaborate with teams to build and deliver solutions implementing serverless,
microservice-based, IaaS, PaaS, and containerized architectures in GCP/AWS environments.
● Responsible for deploying highly complex, distributed transaction processing systems.
● Work on continuous improvement of the products through innovation and learning. Someone with
a knack for benchmarking and optimization
● Hiring, developing, and cultivating a high and reliable cloud support team
● Building and operating complex CI/CD pipelines at scale
● Work with GCP Services, Private Service Connect, Cloud Run, Cloud Functions, Pub/Sub, Cloud
Storage, Networking in general
● Collaborate with Product Management and Product Engineering teams to drive excellence in
Google Cloud products and features.
● Ensures efficient data storage and processing functions in accordance with company security
policies and best practices in cloud security.
● Ensuring scaled database setup/montioring with near zero downtime
Key Skills:
● Hands-on software development experience in Python, NodeJS, or Java
● 5+ years of Linux/Unix Administration monitoring, reliability, and security of Linux-based, online,
high-traffic services and Web/eCommerce properties
● 5+ years of production experience in large-scale cloud-based Infrastructure (GCP preferred)
● Strong experience with Log Analysis and Monitoring tools such as CloudWatch, Splunk,Dynatrace, Nagios, etc.
● Hands-on experience with AWS Cloud – EC2, S3 Buckets, RDS
● Hands-on experience with Infrastructure as a Code (e.g., cloud formation, ARM, Terraform,Ansible, Chef, Puppet) and Version control tools
● Hands-on experience with configuration management (Chef/Ansible)
● Experience in designing High Availability infrastructure and planning for Disaster Recovery solutions
Regards
Team Merito
This company is a network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their Cloud Offering, and Capitalize on Cloud. We have our own IOT/AI platform and we provide professional services on that platform to build custom clouds for their IOT devices. We also build mobile apps, run 24x7 devops/site reliability engineering for our clients.
We are looking for very hands-on SRE (Site Reliability Engineering) engineers with 3 to 6 years of experience. The person will be part of team that is responsible for designing & implementing automation from scratch for medium to large scale cloud infrastructure and providing 24x7 services to our North American / European customers. This also includes ensuring ~100% uptime for almost 50+ internal sites. The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ Years of hands-on experience troubleshooting/setting up of the Linux environment, who can write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ Years of hands-on experience setting up CICD from SCRATCH in Jenkins & Gitlab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Min 3 years of experience as SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2 or Build & Deploy..
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We dont believe in locking in people with large notice periods. You will stay here because you love the company. We have only a 15 days notice period.

EXP:: 4 - 7 yrs
- Any scripting language:: Python, Scala, shell or bash
- Cloud:: AWS
- Database:: Relational (SQL) & non-relational (NoSQL)
- CI/CD tools and Version controlling
• Develop and maintain CI/CD tools to build and deploy scalable web and responsive applications in production environment
• Design and implement monitoring solutions that identify both system bottlenecks and production issues
• Design and implement workflows for continuous integration, including provisioning, deployment, testing, and version control of the software.
• Develop self-service solutions for the engineering team in order to deliver sites/software with great speed and quality
o Automating Infra creation
o Provide easy to use solutions to engineering team
• Conduct research, tests, and implements new metrics collection systems that can be reused and applied as engineering best practices
o Update our processes and design new processes as needed.
o Establish DevOps Engineer team best practices.
o Stay current with industry trends and source new ways for our business to improve.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Manage timely resolution of all critical and/or complex problems
• Maintain, monitor, and establish best practices for containerized environments.
• Mentor new DevOps engineers
What you will bring
• The desire to work in fast-paced environment.
• 5+ years’ experience building, maintaining, and deploying production infrastructures in AWS or other cloud providers
• Containerization experience with applications deployed on Docker and Kubernetes
• Understanding of NoSQL and Relational Database with respect to deployment and horizontal scalability
• Demonstrated knowledge of Distributed and Scalable systems Experience with maintaining and deployment of critical infrastructure components through Infrastructure-as-Code and configuration management tooling across multiple environments (Ansible, Terraform etc)
• Strong knowledge of DevOps and CI/CD pipeline (GitHub, BitBucket, Artifactory etc)
• Strong understanding of cloud and infrastructure components (server, storage, network, data, and applications) to deliver end-to-end cloud Infrastructure architectures and designs and recommendations
o AWS services like S3, CloudFront, Kubernetes, RDS, Data Warehouses to come up with architecture/suggestions for new use cases.
• Test our system integrity, implemented designs, application developments and other processes related to infrastructure, making improvements as needed
Good to have
• Experience with code quality tools, static or dynamic code analysis and compliance and undertaking and resolving issues identified from vulnerability and compliance scans of our infrastructure
• Good knowledge of REST/SOAP/JSON web service API implementation
•
Engineering group to plan ongoing feature development, product maintenance.
• Familiar with Virtualization, Containers - Kubernetes, Core Networking, Cloud Native
Development, Platform as a Service – Cloud Foundry, Infrastructure as a Service, Distributed
Systems etc
• Implementing tools and processes for deployment, monitoring, alerting, automation, scalability,
and ensuring maximum availability of server infrastructure
• Should be able to manage distributed big data systems such as hadoop, storm, mongoDB,
elastic search and cassandra etc.,
• Troubleshooting multiple deployment servers, Software installation, Managing licensing etc,.
• Plan, coordinate, and implement network security measures in order to protect data, software, and
hardware.
• Monitor the performance of computer systems and networks, and to coordinate computer network
access and use.
• Design, configure and test computer hardware, networking software, and operating system
software.
• Recommend changes to improve systems and network configurations, and determine hardware or
software requirements related to such changes.







