7+ Reliability engineering Jobs in Pune | Reliability engineering Job openings in Pune
Apply to 7+ Reliability engineering Jobs in Pune on CutShort.io. Explore the latest Reliability engineering Job opportunities across top companies like Google, Amazon & Adobe.
About NonStop io Technologies:
NonStop io Technologies is a value-driven company with a strong focus on process-oriented software engineering. We specialize in Product Development and have a decade's worth of experience in building web and mobile applications across various domains. NonStop io Technologies follows core principles that guide its operations and believes in staying invested in a product's vision for the long term. We are a small but proud group of individuals who believe in the 'givers gain' philosophy and strive to provide value in order to seek value. We are committed to and specialize in building cutting-edge technology products and serving as trusted technology partners for startups and enterprises. We pride ourselves on fostering innovation, learning, and community engagement. Join us to work on impactful projects in a collaborative and vibrant environment.
Brief Description:
We are looking for a skilled and proactive DevOps Engineer to join our growing engineering team. The ideal candidate will have hands-on experience in building, automating, and managing scalable infrastructure and CI CD pipelines. You will work closely with development, QA, and product teams to ensure reliable deployments, performance, and system security.
Roles and Responsibilities:
● Design, implement, and manage CI CD pipelines for multiple environments
● Automate infrastructure provisioning using Infrastructure as Code tools
● Manage and optimize cloud infrastructure on AWS, Azure, or GCP
● Monitor system performance, availability, and security
● Implement logging, monitoring, and alerting solutions
● Collaborate with development teams to streamline release processes
● Troubleshoot production issues and ensure high availability
● Implement containerization and orchestration solutions such as Docker and Kubernetes
● Enforce DevOps best practices across the engineering lifecycle
● Ensure security compliance and data protection standards are maintained
Requirements:
● 4 to 7 years of experience in DevOps or Site Reliability Engineering
● Strong experience with cloud platforms such as AWS, Azure, or GCP - Relevant Certifications will be a great advantage
● Hands-on experience with CI CD tools like Jenkins, GitHub Actions, GitLab CI, or Azure DevOps
● Experience working in microservices architecture
● Exposure to DevSecOps practices
● Experience in cost optimization and performance tuning in cloud environments
● Experience with Infrastructure as Code tools such as Terraform, CloudFormation, or ARM
● Strong knowledge of containerization using Docker
● Experience with Kubernetes in production environments
● Good understanding of Linux systems and shell scripting
● Experience with monitoring tools such as Prometheus, Grafana, ELK, or Datadog
● Strong troubleshooting and debugging skills
● Understanding of networking concepts and security best practices
Why Join Us?
● Opportunity to work on a cutting-edge healthcare product
● A collaborative and learning-driven environment
● Exposure to AI and software engineering innovations
● Excellent work ethic and culture
If you're passionate about technology and want to work on impactful projects, we'd love to hear from you!
🚀 RECRUITING BOND HIRING
Role: CLOUD OPERATIONS & MONITORING ENGINEER - (THE GUARDIAN OF UPTIME)
⚡ THIS IS NOT A MONITORING ROLE
THIS IS A COMMAND ROLE
You don’t watch dashboards.
You control outcomes.
You don’t react to incidents.
You eliminate them before they escalate.
This role powers an AI-driven SaaS + IoT platform where:
---> Uptime is non-negotiable
---> Latency is hunted
---> Failures are never allowed to repeat
Incidents don’t grow.
Problems don’t hide.
Uptime is enforced.
🧠 WHAT YOU’LL OWN
(Real Work. Real Impact.)
🔍 Total Observability
---> Real-time visibility across cloud, application, database & infrastructure
---> High-signal dashboards (Grafana + cloud-native tools)
---> Performance trends tracked before growth breaks systems
🚨 Smart Alerting (No Noise)
---> Alerts that fire only when action is required
---> Zero false positives. Zero alert fatigue
Right signal → right person → right time
⚙ Automation as a Weapon
---> End-to-end automation of operational tasks
---> Standardized logging, metrics & alerting
---> Systems that scale without human friction
🧯 Incident Command & Reliability
---> First responder for critical incidents (on-call rotation)
---> Root cause analysis across network, app, DB & storage
Fix fast — then harden so it never breaks the same way again
📘 Operational Excellence
---> Battle-tested runbooks
---> Documentation that actually works under pressure
Every incident → a stronger platform
🛠️ TECHNOLOGIES YOU’LL MASTER
☁ Cloud: AWS | Azure | Google Cloud
📊 Monitoring: Grafana | Metrics | Traces | Logs
📡 Alerting: Production-grade alerting systems
🌐 Networking: DNS | Routing | Load Balancers | Security
🗄 Databases: Production systems under real pressure
⚙ DevOps: Automation | Reliability Engineering
🎯 WHO WE’RE LOOKING FOR
Engineers who take uptime personally.
You bring:
---> 3+ years in Cloud Ops / DevOps / SRE
---> Live production SaaS experience
---> Deep AWS / Azure / GCP expertise
---> Strong monitoring & alerting experience
---> Solid networking fundamentals
---> Calm, methodical incident response
---> Bonus (Highly Preferred):
---> B2B SaaS + IoT / hybrid platforms
---> Strong automation mindset
---> Engineers who think in systems, not tickets
💼 JOB DETAILS
📍 Bengaluru
🏢 Hybrid (WFH)
💰 (Final CTC depends on experience & interviews)
🌟 WHY THIS ROLE?
Most cloud teams manage uptime. We weaponize it.
Your work won’t just keep systems running — it will keep customers confident, operations flawless, and competitors wondering how it all works so smoothly.
📩 APPLY / REFER : 🔗 Know someone who lives for reliability, observability & cloud excellence?
🚀 RECRUITING BOND HIRING
Role: CLOUD OPERATIONS & MONITORING ENGINEER - (THE GUARDIAN OF UPTIME)
⚡ THIS IS NOT A MONITORING ROLE
THIS IS A COMMAND ROLE
You don’t watch dashboards.
You control outcomes.
You don’t react to incidents.
You eliminate them before they escalate.
This role powers an AI-driven SaaS + IoT platform where:
---> Uptime is non-negotiable
---> Latency is hunted
---> Failures are never allowed to repeat
Incidents don’t grow.
Problems don’t hide.
Uptime is enforced.
🧠 WHAT YOU’LL OWN
(Real Work. Real Impact.)
🔍 Total Observability
---> Real-time visibility across cloud, application, database & infrastructure
---> High-signal dashboards (Grafana + cloud-native tools)
---> Performance trends tracked before growth breaks systems
🚨 Smart Alerting (No Noise)
---> Alerts that fire only when action is required
---> Zero false positives. Zero alert fatigue
Right signal → right person → right time
⚙ Automation as a Weapon
---> End-to-end automation of operational tasks
---> Standardized logging, metrics & alerting
---> Systems that scale without human friction
🧯 Incident Command & Reliability
---> First responder for critical incidents (on-call rotation)
---> Root cause analysis across network, app, DB & storage
Fix fast — then harden so it never breaks the same way again
📘 Operational Excellence
---> Battle-tested runbooks
---> Documentation that actually works under pressure
Every incident → a stronger platform
🛠️ TECHNOLOGIES YOU’LL MASTER
☁ Cloud: AWS | Azure | Google Cloud
📊 Monitoring: Grafana | Metrics | Traces | Logs
📡 Alerting: Production-grade alerting systems
🌐 Networking: DNS | Routing | Load Balancers | Security
🗄 Databases: Production systems under real pressure
⚙ DevOps: Automation | Reliability Engineering
🎯 WHO WE’RE LOOKING FOR
Engineers who take uptime personally.
You bring:
---> 3+ years in Cloud Ops / DevOps / SRE
---> Live production SaaS experience
---> Deep AWS / Azure / GCP expertise
---> Strong monitoring & alerting experience
---> Solid networking fundamentals
---> Calm, methodical incident response
---> Bonus (Highly Preferred):
---> B2B SaaS + IoT / hybrid platforms
---> Strong automation mindset
---> Engineers who think in systems, not tickets
💼 JOB DETAILS
📍 Bengaluru
🏢 Hybrid (WFH)
💰 (Final CTC depends on experience & interviews)
🌟 WHY THIS ROLE?
Most cloud teams manage uptime. We weaponize it.
Your work won’t just keep systems running — it will keep customers confident, operations flawless, and competitors wondering how it all works so smoothly.
📩 APPLY / REFER : 🔗 Know someone who lives for reliability, observability & cloud excellence?
Department: S&C – Site Reliability Engineering (SRE)
Experience Required: 4–8 Years
Location: Bangalore / Pune /Mumbai
Employment Type: Full-time
- Provide Tier 2/3 technical product support to internal and external stakeholders.
- Develop automation tools and scripts to improve operational efficiency and support processes.
- Manage and maintain system and software configurations; troubleshoot environment/application-related issues.
- Optimize system performance through configuration tuning or development enhancements.
- Plan, document, and deploy applications in Unix/Linux, Azure, and GCP environments.
- Collaborate with Development, QA, and Infrastructure teams throughout the release and deployment of lifecycles.
- Drive automation initiatives for release and deployment processes.
- Coordinate with infrastructure teams to manage hardware/software resources, maintenance, and scheduled downtimes across production and non-production environments.
- Participate in on-call rotations (minimum one week per month) to address critical incidents and off-hour maintenance tasks.
Key Competencies
- Strong analytical, troubleshooting, and critical thinking abilities.
- Excellent cross-functional collaboration skills.
- Strong focus on documentation, process improvement, and system reliability.
- Proactive, detail-oriented, and adaptable in a fast-paced work environment.
Dear Candidate,
Greetings from Wissen Technology.
We have an exciting Job opportunity for GCP SRE Engineer Professionals. Please refer to the Job Description below and share your profile if interested.
About Wissen Technology:
- The Wissen Group was founded in the year 2000. Wissen Technology, a part of Wissen Group, was established in the year 2015.
- Wissen Technology is a specialized technology company that delivers high-end consulting for organizations in the Banking & Finance, Telecom, and Healthcare domains. We help clients build world class products.
- Our workforce consists of 1000+ highly skilled professionals, with leadership and senior management executives who have graduated from Ivy League Universities like Wharton, MIT, IITs, IIMs, and NITs and with rich work experience in some of the biggest companies in the world.
- Wissen Technology has grown its revenues by 400% in these five years without any external funding or investments.
- Globally present with offices US, India, UK, Australia, Mexico, and Canada.
- We offer an array of services including Application Development, Artificial Intelligence & Machine Learning, Big Data & Analytics, Visualization & Business Intelligence, Robotic Process Automation, Cloud, Mobility, Agile & DevOps, Quality Assurance & Test Automation.
- Wissen Technology has been certified as a Great Place to Work®.
- Wissen Technology has been voted as the Top 20 AI/ML vendor by CIO Insider in 2020.
- Over the years, Wissen Group has successfully delivered $650 million worth of projects for more than 20 of the Fortune 500 companies.
- The technology and thought leadership that the company commands in the industry is the direct result of the kind of people Wissen has been able to attract. Wissen is committed to providing them the best possible opportunities and careers, which extends to providing the best possible experience and value to our clients.
We have served client across sectors like Banking, Telecom, Healthcare, Manufacturing, and Energy. They include likes of Morgan Stanley, MSCI, State Street Corporation, Flipkart, Swiggy, Trafigura, GE to name a few.
Job Description:
Please find below details:
Experience - 4+ Years
Location- Bangalore/Mumbai/Pune
Team Responsibilities
The successful candidate shall be part of the S&C – SRE Team. Our team provides a tier 2/3 support to S&C Business. This position involves collaboration with the client facing teams like Client Services, Product and Research teams and Infrastructure/Technology and Application development teams to perform Environment and Application maintenance and support.
Resource's key Responsibilities
• Provide Tier 2/3 product technical support.
• Building software to help operations and support activities.
• Manage system\software configurations and troubleshoot environment issues.
• Identify opportunities for optimizing system performance through changes in configuration or suggestions for development.
• Plan, document and deploy software applications on our Unix/Linux/Azure and GCP based systems.
• Collaborate with development and software testing teams throughout the release process.
• Analyze release and deployment processes to identify key areas for automation and optimization.
• Manage hardware and software resources & coordinate maintenance, planned downtimes with
infrastructure group across all the environments. (Production / Non-Production).
• Must spend minimum one week a month as on call support to help with off-hour emergencies and maintenance activities.
Required skills and experience
• Bachelor's degree, Computer Science, Engineering or other similar concentration (BE/MCA)
• Master’s degree a plus
• 6-8 years’ experience in Production Support/ Application Management/ Application Development (support/ maintenance) role.
• Excellent problem-solving/troubleshooting skills, fast learner
• Strong knowledge of Unix Administration.
• Strong scripting skills in Shell, Python, Batch is must.
• Strong Database experience – Oracle
• Strong knowledge of Software Development Life Cycle
• Power shell is nice to have
• Software development skillsets in Java or Ruby.
• Worked upon any of the cloud platforms – GCP/Azure/AWS is nice to have
Dear Candidate,
Greetings from Wissen Technology.
We have an exciting Job opportunity for GCP SRE Engineer Professionals. Please refer to the Job Description below and share your profile if interested.
About Wissen Technology:
- The Wissen Group was founded in the year 2000. Wissen Technology, a part of Wissen Group, was established in the year 2015.
- Wissen Technology is a specialized technology company that delivers high-end consulting for organizations in the Banking & Finance, Telecom, and Healthcare domains. We help clients build world class products.
- Our workforce consists of 1000+ highly skilled professionals, with leadership and senior management executives who have graduated from Ivy League Universities like Wharton, MIT, IITs, IIMs, and NITs and with rich work experience in some of the biggest companies in the world.
- Wissen Technology has grown its revenues by 400% in these five years without any external funding or investments.
- Globally present with offices US, India, UK, Australia, Mexico, and Canada.
- We offer an array of services including Application Development, Artificial Intelligence & Machine Learning, Big Data & Analytics, Visualization & Business Intelligence, Robotic Process Automation, Cloud, Mobility, Agile & DevOps, Quality Assurance & Test Automation.
- Wissen Technology has been certified as a Great Place to Work®.
- Wissen Technology has been voted as the Top 20 AI/ML vendor by CIO Insider in 2020.
- Over the years, Wissen Group has successfully delivered $650 million worth of projects for more than 20 of the Fortune 500 companies.
- The technology and thought leadership that the company commands in the industry is the direct result of the kind of people Wissen has been able to attract. Wissen is committed to providing them the best possible opportunities and careers, which extends to providing the best possible experience and value to our clients.
We have served client across sectors like Banking, Telecom, Healthcare, Manufacturing, and Energy. They include likes of Morgan Stanley, MSCI, State Street Corporation, Flipkart, Swiggy, Trafigura, GE to name a few
Job Description:
Please find below details:
Experience - 4+ Years
Location- Bangalore/Mumbai/Pune
Team Responsibilities
The successful candidate shall be part of the S&C – SRE Team. Our team provides a tier 2/3 support to S&C Business. This position involves collaboration with the client facing teams like Client Services, Product and Research teams and Infrastructure/Technology and Application development teams to perform Environment and Application maintenance and support.
Resource's key Responsibilities
• Provide Tier 2/3 product technical support.
• Building software to help operations and support activities.
• Manage system\software configurations and troubleshoot environment issues.
• Identify opportunities for optimizing system performance through changes in configuration or suggestions for development.
• Plan, document and deploy software applications on our Unix/Linux/Azure and GCP based systems.
• Collaborate with development and software testing teams throughout the release process.
• Analyze release and deployment processes to identify key areas for automation and optimization.
• Manage hardware and software resources & coordinate maintenance, planned downtimes with
infrastructure group across all the environments. (Production / Non-Production).
• Must spend minimum one week a month as on call support to help with off-hour emergencies and maintenance activities.
Required skills and experience
• Bachelor's degree, Computer Science, Engineering or other similar concentration (BE/MCA)
• Master’s degree a plus
• 6-8 years’ experience in Production Support/ Application Management/ Application Development (support/ maintenance) role.
• Excellent problem-solving/troubleshooting skills, fast learner
• Strong knowledge of Unix Administration.
• Strong scripting skills in Shell, Python, Batch is must.
• Strong Database experience – Oracle
• Strong knowledge of Software Development Life Cycle
• Power shell is nice to have
• Software development skillsets in Java or Ruby.
• Worked upon any of the cloud platforms – GCP/Azure/AWS is nice to have
Job Title: Site Reliability Engineer (SRE)
Experience: 4+ Years
Work Location: Bangalore / Chennai / Pune / Gurgaon
Work Mode: Hybrid or Onsite (based on project need)
Domain Preference: Candidates with past experience working in shoe/footwear retail brands (e.g., Nike, Adidas, Puma) are highly preferred.
🛠️ Key Responsibilities
- Design, implement, and manage scalable, reliable, and secure infrastructure on AWS.
- Develop and maintain Python-based automation scripts for deployment, monitoring, and alerting.
- Monitor system performance, uptime, and overall health using tools like Prometheus, Grafana, or Datadog.
- Handle incident response, root cause analysis, and ensure proactive remediation of production issues.
- Define and implement Service Level Objectives (SLOs) and Error Budgets in alignment with business requirements.
- Build tools to improve system reliability, automate manual tasks, and enforce infrastructure consistency.
- Collaborate with development and DevOps teams to ensure robust CI/CD pipelines and safe deployments.
- Conduct chaos testing and participate in on-call rotations to maintain 24/7 application availability.
✅ Must-Have Skills
- 4+ years of experience in Site Reliability Engineering or DevOps with a focus on reliability, monitoring, and automation.
- Strong programming skills in Python (mandatory).
- Hands-on experience with AWS cloud services (EC2, S3, Lambda, ECS/EKS, CloudWatch, etc.).
- Expertise in monitoring and alerting tools like Prometheus, Grafana, Datadog, CloudWatch, etc.
- Strong background in Linux-based systems and shell scripting.
- Experience implementing infrastructure as code using tools like Terraform or CloudFormation.
- Deep understanding of incident management, SLOs/SLIs, and postmortem practices.
- Prior working experience in footwear/retail brands such as Nike or similar is highly preferred.




