
CLOUD OPERATIONS AND MONITORING ENGINEER – THE GUARDIAN OF UPTIME
at AI Powered Software Development (Product Company)
🚀 RECRUITING BOND HIRING
Role: CLOUD OPERATIONS & MONITORING ENGINEER - (THE GUARDIAN OF UPTIME)
⚡ THIS IS NOT A MONITORING ROLE
THIS IS A COMMAND ROLE
You don’t watch dashboards.
You control outcomes.
You don’t react to incidents.
You eliminate them before they escalate.
This role powers an AI-driven SaaS + IoT platform where:
---> Uptime is non-negotiable
---> Latency is hunted
---> Failures are never allowed to repeat
Incidents don’t grow.
Problems don’t hide.
Uptime is enforced.
🧠 WHAT YOU’LL OWN
(Real Work. Real Impact.)
🔍 Total Observability
---> Real-time visibility across cloud, application, database & infrastructure
---> High-signal dashboards (Grafana + cloud-native tools)
---> Performance trends tracked before growth breaks systems
🚨 Smart Alerting (No Noise)
---> Alerts that fire only when action is required
---> Zero false positives. Zero alert fatigue
Right signal → right person → right time
⚙ Automation as a Weapon
---> End-to-end automation of operational tasks
---> Standardized logging, metrics & alerting
---> Systems that scale without human friction
🧯 Incident Command & Reliability
---> First responder for critical incidents (on-call rotation)
---> Root cause analysis across network, app, DB & storage
Fix fast — then harden so it never breaks the same way again
📘 Operational Excellence
---> Battle-tested runbooks
---> Documentation that actually works under pressure
Every incident → a stronger platform
🛠️ TECHNOLOGIES YOU’LL MASTER
☁ Cloud: AWS | Azure | Google Cloud
📊 Monitoring: Grafana | Metrics | Traces | Logs
📡 Alerting: Production-grade alerting systems
🌐 Networking: DNS | Routing | Load Balancers | Security
🗄 Databases: Production systems under real pressure
⚙ DevOps: Automation | Reliability Engineering
🎯 WHO WE’RE LOOKING FOR
Engineers who take uptime personally.
You bring:
---> 3+ years in Cloud Ops / DevOps / SRE
---> Live production SaaS experience
---> Deep AWS / Azure / GCP expertise
---> Strong monitoring & alerting experience
---> Solid networking fundamentals
---> Calm, methodical incident response
Bonus (Highly Preferred):
---> B2B SaaS + IoT / hybrid platforms
---> Strong automation mindset
---> Engineers who think in systems, not tickets
💼 JOB DETAILS
📍 Bengaluru
🏢 Hybrid (WFH)
💰 (Final CTC depends on experience & interviews)
🌟 WHY THIS ROLE?
Most cloud teams manage uptime. We weaponize it.
Your work won’t just keep systems running — it will keep customers confident, operations flawless, and competitors wondering how it all works so smoothly.
📩 APPLY / REFER : 🔗 Know someone who lives for reliability, observability & cloud excellence?

Please Apply - https://zrec.in/L51Qf?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adopt a DevOps culture in their organizations by focusing on a long-term DevOps roadmap. We identify the technical and cultural issues that arise on the journey to implementing DevOps practices and work with the respective teams to fix them and increase overall productivity. We also run training sessions for developers to help them understand the importance of DevOps. We provide the following services: DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We assess the technology architecture, security, governance, compliance, and DevOps maturity model of any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer (Infrastructure/SRE)
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediate
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are looking for a Senior DevOps Engineer (Infrastructure) to design, automate, and manage cloud-based and datacentre infrastructure for diverse projects. The ideal candidate will have deep expertise in a public cloud platform (AWS, GCP, or Azure), with a strong focus on cost optimization, security best practices, and infrastructure automation using tools like Terraform and CI/CD pipelines.
This role involves designing scalable architectures (containers, serverless, and VMs), managing databases, and ensuring system observability with tools like Prometheus and Grafana. Strong leadership, client communication, and team mentoring skills are essential. Experience with VPN technologies and configuration management tools (Ansible, Helm) is also critical. Multi-cloud experience and familiarity with APM tools are a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as a DevOps engineer with a proven track record of architecting and automating solutions on Cloud
- Experience in troubleshooting production incidents and handling high-pressure situations.
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Extensive experience with Kubernetes, Terraform, ArgoCD, and Helm.
- Strong with at least one public cloud AWS/GCP/Azure
- Strong with Cost Optimization and Security Best practices
- Strong with Infrastructure automation using Terraform and CI/CD automation
- Strong with Configuration Management using Ansible, Helm etc
- Good with designing architectures (Containers, Serverless, VMs etc)
- Hands-on Experience working on Multiple Projects
- Strong with Client communication and requirements gathering
- Database management experience
- Good experience with Prometheus, Grafana & Alertmanager
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
- Proficiency in cloud networking, including VPCs, DNS, VPNs (OpenVPN, OpenSwan, Pritunl, Site-to-Site VPNs), load balancers, and firewalls, ensuring secure and efficient connectivity.
- Strong understanding of cloud security best practices, identity and access management (IAM), and compliance requirements for modern infrastructure.
Good to have
- Multi-cloud experience with AWS, GCP & Azure
- Experience with APM & Observability tools such as New Relic, Datadog, and OpenTelemetry
- Proficiency in scripting languages (Python, Go) for automation and tooling to improve infrastructure and application reliability.
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Serve as a direct point of contact for multiple clients.
- Handle the unique technical needs and challenges of two or more clients concurrently.
- Engage in both direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.
Required Skills
• Automation is a part of your daily functions, so thorough familiarity with Unix Bourne shell scripting and Python is a critical survival skill.
• Integration and maintenance of automated tools
• Strong analytical and problem-solving skills
• Working experience in source control tools such as Git/GitHub/GitLab/TFS
• Experience with modern virtualization technologies (Docker, KVM, AWS, OpenStack, or any orchestration platforms)
• Automation of deployment, customization, upgrades, and monitoring through modern DevOps tools (Ansible, Kubernetes, OpenShift, etc.)
• Advanced Linux admin experience
• Using Jenkins or similar tools
• Deep understanding of container orchestration (preferably Kubernetes)
• Strong knowledge of object storage (preferably Ceph on Rook)
• Experience in installing, managing & tuning microservices environments using Kubernetes & Docker both on-premise and on the cloud.
• Experience in deploying and managing Spring Boot applications.
• Experience in deploying and managing Python applications using Django, FastAPI, Flask.
• Experience in deploying machine learning pipelines/data pipelines using Airflow/Kubeflow/MLflow.
• Experience with web servers and reverse proxies such as Nginx, Apache Server, and HAProxy
• Experience in monitoring tools like Prometheus, Grafana.
• Experience in provisioning & maintaining SQL/NoSQL databases.
Desired Skills
• Configuration software: Ansible
• Excellent communication and collaboration skills
• Good experience with networking technologies such as load balancers, ACLs, firewalls, VIPs, and DNS
• Programmatic experience with AWS, DO, or GCP storage & machine images
• Experience on various Linux distributions
• Knowledge of Azure DevOps Server
• Docker management and troubleshooting
• Familiarity with micro-services and RESTful systems
• AWS / GCP / Azure certification
• Interact with the Engineering team to support, maintain, and design backend infrastructure for product support
• Create fully automated global cloud infrastructure that spans multiple regions.
• Great learning attitude to the newest technology and a Team player
FINTECH CANDIDATES ONLY
About the job:
Emint is a fintech startup with the mission to ‘Make the best investing product that Indian consumers love to use, with simplicity & intelligence at the core’. We are creating a platform that gives a holistic view of market dynamics, which helps our users make smart & disciplined investment decisions. Emint is founded by a stellar team of individuals who come with decades of experience of investing in Indian & global markets. We are building a team of highly skilled & disciplined professionals and are looking for equally motivated individuals to be part of Emint. We are currently looking to hire a DevOps engineer to join our team in Bangalore.
Job Description :
Must Have:
• Hands-on experience with AWS DevOps
• Experience in Unix with Bash scripting is a must
• Experience working with Kubernetes, Docker.
• Experience with GitLab, GitHub, Bitbucket, or Artifactory
• Packaging, deployment
• CI/CD pipeline experience (Jenkins is preferable)
• CI/CD best practices
Good to Have:
• Startup Experience
• Knowledge of source code management guidelines
• Experience with deployment tools like Ansible/Puppet/Chef is preferable
• IAM knowledge
• Coding knowledge of Python adds value
• Test automation setup experience
Qualifications:
• Bachelor's degree or equivalent experience in Computer Science or related field
• Graduates from IIT / NIT/ BITS / IIIT preferred
• Professionals with fintech (stock broking / banking) experience preferred
• Experience in building & scaling B2C apps preferred
- Experience using AWS (that’s just common sense)
- Experience designing and building web environments on AWS, which includes working with services like EC2, ELB, RDS, and S3
- Experience building and maintaining cloud-native applications
- A solid background in Linux/Unix and Windows server system administration
- Experience using DevOps tools in a cloud environment, such as Ansible, Artifactory, Docker, GitHub, Jenkins, Kubernetes, Maven, and SonarQube
- Experience installing and configuring different application servers such as JBoss, Tomcat, and WebLogic
- Experience using monitoring solutions like CloudWatch, ELK Stack, and Prometheus
- An understanding of writing Infrastructure-as-Code (IaC), using tools like CloudFormation or Terraform
- Knowledge of one or more of the most-used programming languages available for today’s cloud computing (i.e., SQL data, XML data, R math, Clojure math, Haskell functional, Erlang functional, Python procedural, and Go procedural languages)
- Experience in troubleshooting distributed systems
- Proficiency in script development and scripting languages
- The ability to be a team player
- The ability and skill to train other people in procedural and technical topics
- Strong communication and collaboration skills
As a special aside, an AWS engineer who works in DevOps should also have experience with:
- The theory, concepts, and real-world application of Continuous Delivery (CD), which requires familiarity with tools like AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline
- An understanding of automation
Do Your Thing
DYT (Do Your Thing) is an app where all social media users can share brands they love with their followers and earn money while doing so! We believe everyone is an influencer. Our aim is to democratise social media and allow people to be rewarded for the content they post. How does DYT help you? It accelerates your career through collaboration opportunities with top brands and gives you access to a community full of experts in the influencer space.
Role: DevOps
Job Description:
We are looking for experienced DevOps Engineers to join our Engineering team. The candidate will work with our engineers and the wider tech team to deliver high-quality web applications for a product.
Required Experience
- DevOps Engineer with 2+ years of experience in development and production operations support for Linux- and Windows-based applications and cloud deployments (AWS/GC stack)
- Experience working with Continuous Integration and Continuous Deployment pipelines
- Exposure to managing LAMP stack-based applications
- Experience automating resource provisioning using tools such as CloudFormation, Terraform, and ARM templates.
- Experience in working closely with clients, understanding their requirements, and designing and implementing quality solutions to meet their needs.
- Ability to take ownership of the work carried out
- Experience coordinating with the rest of the team to deliver well-architected and high-quality solutions.
- Experience deploying Docker based applications
- Experience with AWS services.
- Excellent verbal and written communication skills
Desired Experience
- Exposure to AWS, Google Cloud, and Azure
- Experience in Jenkins, Ansible, Terraform
- Build monitoring tools and respond to alarms triggered in the production environment
- Willingness to quickly become a member of the team and to do what it takes to get the job done
- Ability to work well in a fast-paced environment and listen and learn from stakeholders
- Demonstrate a strong work ethic and incorporate company values in your everyday work.
- Collaborate with Dev, QA and Data Science teams on environment maintenance, monitoring (ELK, Prometheus or equivalent), deployments and diagnostics
- Administer a hybrid datacenter, including AWS and EC2 cloud assets
- Administer, automate and troubleshoot container based solutions deployed on AWS ECS
- Be able to troubleshoot problems and provide feedback to engineering on issues
- Automate deployment (Ansible, Python), build (Git, Maven, Make, or equivalent) and integration (Jenkins, Nexus) processes
- Learn and administer technologies such as ELK, Hadoop etc.
- A self-starter with the enthusiasm to learn and pick up new technologies in a fast-paced environment.
Need to have
- Hands-on Experience in Cloud based DevOps
- Experience working in AWS (EC2, S3, CloudFront, ECR, ECS etc)
- Experience with any programming language.
- Experience using Ansible, Docker, Jenkins, Kubernetes
- Experience in Python.
- Should be very comfortable working in Linux/Unix environment.
- Exposure to Shell Scripting.
- Solid troubleshooting skills
- Equal Experts is an innovative software delivery consultancy specializing in the delivery of custom software solutions for blue-chip enterprise and public sector clients across a range of industry sectors.
- We deliver market-leading propositions across the digital, online and mobile channels, and are recognized for our leadership in the application of Agile and Lean delivery methods to assure delivery.
- We are focused on hiring DevOps Engineers with skills in AWS, automation tools, CI/CD tools, and the like.
- The DevOps Consultant will be working with an on-site client team and will be expected to mentor and share their skills and knowledge with the existing client developers.
Your Responsibilities Include:
- Continuous Delivery of quality software created by our project teams. Ensuring smooth build and release with continuous integration.
- Configuration Management - Design and implementation of deployment strategies for multiple projects.
- Virtualization and Infrastructure Provisioning.
- Maintenance and upgrade of our cloud environment (AWS).
- Version control and source code administration.
- Administration of Web Servers, Application Servers and Servlet Containers
- Setting up and managing the Automation efforts for multiple projects.
Technical Expertise:
- Should have worked on Agile projects featuring weekly iterations and releases.
- Should have extensive hands-on experience with:
- Continuous Integration tools like Jenkins.
- Configuration Management tools like Terraform or Ansible.
- Cloud computing using AWS, EC2.
- Hands-on experience on Kubernetes.
- Virtualization tools like Vagrant, Docker or VMWare
- You have an active presence in the DevOps community through your blogs, Stack Overflow and GitHub profiles.
- You are passionate about DevOps. You love to mentor people and evangelize about best practices and innovations in DevOps.
Job Description:
Mandatory Skills:
Should have strong working experience with Cloud technologies like AWS and Azure.
Should have strong working experience with CI/CD tools like Jenkins and Rundeck.
Must have experience with configuration management tools like Ansible.
Must have working knowledge on tools like Terraform.
Must be good at scripting languages like shell scripting and Python.
Should have expertise in DevOps practices and a demonstrated ability to apply that knowledge across diverse projects and teams.
Preferable skills:
Experience with tools like Docker, Kubernetes, Puppet, JIRA, GitLab, and JFrog.
Experience in scripting languages like Groovy.
Experience with GCP
Summary & Responsibilities:
Write build pipelines and IaC (ARM templates, Terraform, or CloudFormation).
Develop Ansible playbooks to install and configure various products.
Implement Jenkins and Rundeck jobs (and pipelines).
Must be a self-starter and be able to work well in a fast paced, dynamic environment
Work independently and resolve issues with minimal supervision.
Strong desire to learn new technologies and techniques
Strong communication (written / verbal) skills
Qualification:
Bachelor's degree in Computer Science or equivalent.
4+ years of experience in DevOps and AWS.
2+ years of experience in Python, Shell scripting and Azure.
Goodera is looking for an experienced and motivated DevOps professional to be an integral part of its core infrastructure team. As a DevOps Engineer, you must be able to troubleshoot production issues; design, implement, and deploy monitoring tools; collaborate with team members to improve existing engineering tools and develop new ones; optimize the company's computing architecture; and design and conduct security, performance, and availability tests.
Responsibilities:
This is a highly accountable role and the candidate must meet the following professional expectations:
• Owning and improving the scalability and reliability of our products.
• Working directly with product engineering and infrastructure teams.
• Designing and developing various monitoring system tools.
• Accountable for developing deployment strategies and build configuration management.
• Deploying and updating system and application software.
• Ensure regular, effective communication with team members and cross-functional resources.
• Maintaining a positive and supportive work culture.
• First point of contact for handling customer (may be internal stakeholders) issues, providing guidance and recommendations to increase efficiency and reduce customer incidents.
• Develop tooling and processes to drive and improve customer experience, create playbooks.
• Eliminate manual tasks via configuration management.
• Intelligently migrate services from one AWS region to other AWS regions.
• Create, implement and maintain security policies to ensure ISO / GDPR / SOC / PCI compliance.
• Verify that infrastructure automation meets compliance goals and is current with the disaster recovery plan.
• Evangelize configuration management and automation to other product developers.
• Stay up to date with upcoming technologies to maintain state-of-the-art infrastructure.
Required Candidate profile :
• 3+ years of proven experience working in a DevOps environment.
• 3+ years of proven experience working in AWS Cloud environments.
• Solid understanding of networking and security best practices.
• Experience with infrastructure-as-code frameworks such as Ansible, Terraform, Chef, Puppet, CFEngine, etc.
• Experience in scripting or programming languages (Bash, Python, PHP, Node.js, Perl, etc.)
• Experience designing and building web application environments on AWS, including services such as ECS, ECR, Fargate, Lambda, SNS / SQS, CloudFront, CodeBuild, CodePipeline, CloudWatch configuration, WAF, Active Directory, Kubernetes (EKS), EC2, S3, ELB, RDS, Redshift, etc.
• Hands on Experience in Docker is a big plus.
• Experience working in an Agile, fast paced, DevOps environment.
• Strong Knowledge in DB such as MongoDB / MySQL / DynamoDB / Redis / Cassandra.
• Experience with open-source tools such as HAProxy, Apache, Nginx, and Nagios.
• Fluency with version control systems, with a preference for Git
• Strong experience with Linux-based infrastructure and Linux administration
• Experience with installing and configuring application servers such as WebLogic, JBoss and Tomcat.
• Hands-on experience with logging, monitoring, and alerting tools like ELK, Grafana, Metabase, Monit, Zabbix, etc.
• A team player capable of high performance and flexibility in a dynamic working environment, with the ability to lead and the ability to train others on technical and procedural topics.
Radical is a platform connecting data, medicine and people -- through machine learning, and usable, performant products. Software has never been the strong suit of the medical industry -- and we are changing that. We believe that the same sophistication and performance that powers our daily needs through millions of consumer applications -- be it your grocery, your food delivery or your movie tickets -- when applied to healthcare, has a massive potential to transform the industry, and positively impact lives of patients and doctors. Radical works with some of the largest hospitals and public health programmes in India, and has a growing footprint both inside the country and abroad.
As a DevOps Engineer at Radical, you will:
Work closely with all stakeholders in the healthcare ecosystem - patients, doctors, paramedics and administrators - to conceptualise and bring to life the ideal set of products that add value to their time
Work alongside Software Developers and ML Engineers to solve problems and assist in architecture design
Work on systems which have an extraordinary emphasis on capturing data that can help build better workflows, algorithms and tools
Work on high performance systems that deal with several million transactions, multi-modal data and large datasets, with a close attention to detail
We’re looking for someone who has:
Familiarity and experience with writing working, well-documented and well-tested scripts, Dockerfiles, Puppet/Ansible/Chef/Terraform scripts.
Proficiency with scripting languages like Python and Bash.
Knowledge of systems deployment and maintenance, including setting up CI/CD and working alongside Software Developers, monitoring logs, dashboards, etc.
Experience integrating with a wide variety of external tools and services
Experience navigating AWS and leveraging appropriate services and technologies rather than DIY solutions (such as hosting an application directly on EC2 vs containerisation, or an Elastic Beanstalk)
It’s not essential, but great if you have:
An established track record of deploying and maintaining systems.
Experience with microservices and decomposition of monolithic architectures
Proficiency in automated tests.
Proficiency with the Linux ecosystem
Experience in deploying systems to production on cloud platforms such as AWS
The position is open now, and we are onboarding immediately.
Please write to us with an updated resume, and one thing you would like us to see as part of your application. This one thing can be anything that you think makes you stand apart among candidates.
Radical is based out of Delhi NCR, India, and we look forward to working with you!
We're looking for people who may not know all the answers, but are obsessive about finding them, and take pride in the code that they write. We are more interested in the ability to learn fast and think rigorously, and in people who aren’t afraid to challenge assumptions and take large bets -- only to work hard and prove themselves correct. You're encouraged to apply even if your experience doesn't precisely match the job description. Join us.








