11+ Storage management Jobs in Bangalore (Bengaluru) | Storage management Job openings in Bangalore (Bengaluru)
Apply to 11+ Storage management Jobs in Bangalore (Bengaluru) on CutShort.io. Explore the latest Storage management Job opportunities across top companies like Google, Amazon & Adobe.
🚀 RECRUITING BOND HIRING
Role: CLOUD OPERATIONS & MONITORING ENGINEER - (THE GUARDIAN OF UPTIME)
⚡ THIS IS NOT A MONITORING ROLE
THIS IS A COMMAND ROLE
You don’t watch dashboards.
You control outcomes.
You don’t react to incidents.
You eliminate them before they escalate.
This role powers an AI-driven SaaS + IoT platform where:
---> Uptime is non-negotiable
---> Latency is hunted
---> Failures are never allowed to repeat
Incidents don’t grow.
Problems don’t hide.
Uptime is enforced.
🧠 WHAT YOU’LL OWN
(Real Work. Real Impact.)
🔍 Total Observability
---> Real-time visibility across cloud, application, database & infrastructure
---> High-signal dashboards (Grafana + cloud-native tools)
---> Performance trends tracked before growth breaks systems
🚨 Smart Alerting (No Noise)
---> Alerts that fire only when action is required
---> Zero false positives. Zero alert fatigue
Right signal → right person → right time
⚙ Automation as a Weapon
---> End-to-end automation of operational tasks
---> Standardized logging, metrics & alerting
---> Systems that scale without human friction
🧯 Incident Command & Reliability
---> First responder for critical incidents (on-call rotation)
---> Root cause analysis across network, app, DB & storage
Fix fast — then harden so it never breaks the same way again
📘 Operational Excellence
---> Battle-tested runbooks
---> Documentation that actually works under pressure
Every incident → a stronger platform
🛠️ TECHNOLOGIES YOU’LL MASTER
☁ Cloud: AWS | Azure | Google Cloud
📊 Monitoring: Grafana | Metrics | Traces | Logs
📡 Alerting: Production-grade alerting systems
🌐 Networking: DNS | Routing | Load Balancers | Security
🗄 Databases: Production systems under real pressure
⚙ DevOps: Automation | Reliability Engineering
🎯 WHO WE’RE LOOKING FOR
Engineers who take uptime personally.
You bring:
---> 3+ years in Cloud Ops / DevOps / SRE
---> Live production SaaS experience
---> Deep AWS / Azure / GCP expertise
---> Strong monitoring & alerting experience
---> Solid networking fundamentals
---> Calm, methodical incident response
Bonus (Highly Preferred):
---> B2B SaaS + IoT / hybrid platforms
---> Strong automation mindset
---> Engineers who think in systems, not tickets
💼 JOB DETAILS
📍 Bengaluru
🏢 Hybrid (WFH)
💰 (Final CTC depends on experience & interviews)
🌟 WHY THIS ROLE?
Most cloud teams manage uptime. We weaponize it.
Your work won’t just keep systems running — it will keep customers confident, operations flawless, and competitors wondering how it all works so smoothly.
📩 APPLY / REFER : 🔗 Know someone who lives for reliability, observability & cloud excellence?
JOB DETAILS:
- Job Title: Lead DevOps Engineer
- Industry: Ride-hailing
- Experience: 6-9 years
- Working Days: 5 days/week
- Work Mode: ONSITE
- Job Location: Bangalore
- CTC Range: Best in Industry
Required Skills: Cloud & Infrastructure Operations, Kubernetes & Container Orchestration, Monitoring, Reliability & Observability, Proficiency with Terraform, Ansible etc., Strong problem-solving skills with scripting (Python/Go/Shell)
Criteria:
1. Candidate must be from a product-based company or a scalable app-based start-up, with experience handling large-scale production traffic.
2. Minimum 6 years of experience working as a DevOps/Infrastructure Consultant
3. Candidate must have 2 years of experience as a lead (handling a team of at least 3 to 4 members)
4. Own end-to-end infrastructure right from non-prod to prod environment including self-managed DBs
5. Candidate must have hands-on experience in database migration from scratch
6. Must have a firm hold on the container orchestration tool Kubernetes
7. Should have expertise in configuration management tools like Ansible, Terraform, Chef / Puppet
8. Understanding of programming languages like Go/Python and Java
9. Experience working on databases like Mongo/Redis/Cassandra/Elasticsearch/Kafka.
10. Working experience on the AWS cloud platform
11. Candidate should have a minimum of 1.5 years of stability per organization, and a clear reason for relocation.
Description
Job Summary:
As a DevOps Engineer at the company, you will build and operate infrastructure at scale, design and implement a variety of tools that enable product teams to build and deploy their services independently, improve observability across the board, and design for security, resiliency, availability, and stability. If the prospect of ensuring system reliability at scale and exploring cutting-edge technology to solve problems excites you, then this is your fit.
Job Responsibilities:
● Own end-to-end infrastructure right from non-prod to prod environment including self-managed DBs
● Codify our infrastructure
● Do what it takes to keep the uptime above 99.99%
● Understand the bigger picture and sail through the ambiguities
● Scale technology considering cost and observability and manage end-to-end processes
● Understand DevOps philosophy and evangelize the principles across the organization
● Strong communication and collaboration skills to break down the silos
Job Requirements:
● B.Tech. / B.E. degree in Computer Science or equivalent software engineering degree/experience
● Minimum 6 yrs of experience working as a DevOps/Infrastructure Consultant
● Must have a firm hold on the container orchestration tool Kubernetes
● Must have expertise in configuration management tools like Ansible, Terraform, Chef / Puppet
● Strong problem-solving skills, and ability to write scripts using any scripting language
● Understanding of programming languages like Go/Python and Java
● Comfortable working on databases like Mongo/Redis/Cassandra/Elasticsearch/Kafka.
What’s there for you?
The company's team handles everything: infra, tooling, and a bunch of self-managed databases, such as
● 150+ microservices with event-driven architecture across different tech stacks (Golang/Java/Node)
● More than 100,000 requests per second on our edge gateways
● ~20,000 events per second on self-managed Kafka
● 100s of TB of data on self-managed databases
● 100s of real-time continuous deployments to production
● Self-managed infra supporting 100% OSS
Required Skills:
- Experience in systems administration, SRE or DevOps focused role
- Experience in handling production support (on-call)
- Good understanding of the Linux operating system and networking concepts.
- Demonstrated competency with the following AWS services: ECS, EC2, EBS, EKS, S3, RDS, ELB, IAM, Lambda.
- Experience with Docker containers and containerization concepts
- Experience with managing and scaling Kubernetes clusters in a production environment
- Experience building scalable infrastructure in AWS with Terraform.
- Strong protocol-level knowledge of HTTP/HTTPS, SMTP, DNS, and LDAP
- Experience monitoring production systems
- Expertise in leveraging Automation / DevOps principles, experience with operational tools, and able to apply best practices for infrastructure and software deployment (Ansible).
- HAProxy, Nginx, SSH, MySQL configuration and operation experience
- Ability to work seamlessly with software developers, QA, project managers, and business development
- Ability to produce and maintain written documentation
Job Title: DevOps SDE III
Job Summary
Porter seeks an experienced cloud and DevOps engineer to join our infrastructure platform team. This team is responsible for the organization's cloud platform, CI/CD, and observability infrastructure. As part of this team, you will be responsible for providing a scalable, developer-friendly cloud environment by participating in the design, creation, and implementation of automated processes and architectures to achieve our vision of an ideal cloud platform.
Responsibilities and Duties
In this role, you will
- Own and operate our application stack and AWS infrastructure to orchestrate and manage our applications.
- Support our application teams using AWS by provisioning new infrastructure and contributing to the maintenance and enhancement of existing infrastructure.
- Build out and improve our observability infrastructure.
- Set up automated auditing processes and improve our applications' security posture.
- Participate in troubleshooting infrastructure issues and preparing root cause analysis reports.
- Develop and maintain our internal tooling and automation to manage the lifecycle of our applications, from provisioning to deployment, zero-downtime and canary updates, service discovery, container orchestration, and general operational health.
- Continuously improve our build pipelines, automated deployments, and automated testing.
- Propose, participate in, and document proof of concept projects to improve our infrastructure, security, and observability.
Qualifications and Skills
Hard requirements for this role:
- 5+ years of experience as a DevOps / Infrastructure engineer on AWS.
- Experience with Git, CI/CD, and Docker. (We use GitHub, GitHub Actions, Jenkins, ECS and Kubernetes).
- Experience in working with infrastructure as code (Terraform/CloudFormation).
- Linux and networking administration experience.
- Strong Linux Shell scripting experience.
- Experience with one programming language and cloud provider SDKs. (Python + boto3 is preferred)
- Experience with configuration management tools like Ansible and Packer.
- Experience with container orchestration tools. (Kubernetes/ECS).
- Database administration experience and the ability to write intermediate-level SQL queries. (We use Postgres)
- AWS SysOps administrator + Developer certification or equivalent knowledge
Good to have:
- Experience working with ELK stack.
- Experience supporting JVM applications.
- Experience working with APM tools. (We use Datadog)
- Experience working in a XaaC environment. (Packer, Ansible/Chef, Terraform/CloudFormation, Helm/Kustomize, Open Policy Agent/Sentinel)
- Experience working with security tools. (AWS Security Hub/Inspector/GuardDuty)
- Experience with JIRA/Jira help desk.
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with 7+ years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies ( AWS cloud provider preferred)
*Bread puns are encouraged but not required
Objectives :
- Building and setting up new development tools and infrastructure
- Working on ways to automate and improve development and release processes
- Testing code written by others and analyzing results
- Ensuring that systems are safe and secure against cybersecurity threats
- Identifying technical problems and developing software updates and ‘fixes’
- Working with software developers and software engineers to ensure that development follows established processes and works as intended
- Planning out projects and being involved in project management decisions
Daily and Monthly Responsibilities :
- Deploy updates and fixes
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Skills and Qualifications :
- Bachelor's degree in Computer Science, Software Engineering, or a relevant field
- 3+ years of experience as a DevOps Engineer or similar software engineering role
- Proficient with git and git workflows
- Good logical skills and knowledge of programming concepts (OOP, data structures)
- Working knowledge of databases and SQL
- Problem-solving attitude
- Collaborative team spirit
- 4+ years of experience in IT and infrastructure
- 2+ years of experience in Azure DevOps
- Experience using Azure DevOps both as a CI/CD tool and as an Agile framework
- Practical experience building and maintaining automated operational infrastructure
- Experience in building React or Angular applications; .NET is a must.
- Practical experience using version control systems with Azure Repos
- Experience developing and maintaining scripts using PowerShell and ARM templates/Terraform scripts for Infrastructure as Code.
- Experience in Linux shell scripting (Ubuntu) is a must
- Hands-on experience with release automation, configuration and debugging.
- Should have good knowledge of branching and merging
- Integration of static code analysis tools such as SonarQube and Snyk is a must.
What you will do:
- Handling Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups and monitoring
- Logging, metrics and alerting management
- Creating Dockerfiles
- Performing root cause analysis for production errors
What you need to have:
- 12+ years of experience in Software Development/ QA/ Software Deployment with 5+ years of experience in managing high performing teams
- Proficiency in VMware, AWS & cloud applications development, deployment
- Good knowledge in Java, Node.js
- Experience working with RESTful APIs, JSON etc
- Experience with Unit/ Functional automation is a plus
- Experience with MySQL, MongoDB, Redis, RabbitMQ
- Proficiency in Jenkins, Ansible, Terraform/Chef/Ant
- Proficiency in Linux-based operating systems
- Proficiency in cloud infrastructure tools such as Docker and Kubernetes
- Strong problem solving and analytical skills
- Good written and oral communication skills
- Sound understanding in areas of Computer Science such as algorithms, data structures, object oriented design, databases
- Proficiency in monitoring and observability
● Building and managing multiple application environments on AWS using automation tools like Terraform or CloudFormation etc.
● Deploy applications with zero downtime via automation with configuration management tools such as Ansible.
● Setting up Infrastructure monitoring tools such as Prometheus, Grafana
● Setting up centralised logging using tools such as ELK.
● Containerisation of applications/microservices.
● Ensure application availability to 99.9% with highly available infrastructure.
● Monitoring performance of applications and databases.
● Ensuring that systems are safe and secure against cyber security threats.
● Working with software developers to ensure that release cycle and deployment processes are followed.
● Evaluating existing applications and platforms, giving recommendations for enhancing performance via gap analysis, identifying the most practical alternative solutions and assisting with modifications.
Skills -
● Strong knowledge of AWS Managed Services such as EC2, RDS, ECS, ECR, S3, CloudFront, SES, Redshift, ElastiCache, AMQP etc.
● Experience in handling production workloads.
● Experience with Nginx web server.
● Experience with NoSQL and SQL databases such as MongoDB, PostgreSQL etc.
● Experience with Containerisation of applications/micro services using Docker.
● Understanding of system administration in Linux environments.
● Strong knowledge of Infrastructure as Code tools such as Terraform, CloudFormation etc.
● Strong knowledge of configuration management tools such as Ansible, Chef etc.
● Familiarity with tools such as GitLab, Jenkins, Vercel, JIRA etc.
● Proficiency in scripting languages including Bash, Python etc.
● Full understanding of software development lifecycle best practices and agile methodology
● Strong communication and documentation skills.
● An ability to drive to goals and milestones while valuing and maintaining a strong attention to detail
● Excellent judgment, analytical thinking, and problem-solving skills
● Self-motivated individual who possesses excellent time management and organizational skills
About Hop:
We are a London, UK based FinTech startup with a subsidiary in India. Hop is working towards building the next generation digital banking platform for seamless and economical currency exchange, with technology at the crux of it. In a technology driven era, many financial services platforms still lack the customer experience and are cumbersome to use. Hop aims at building a ‘state of the art’ tech-centric, customer focused solution.
moneyHOP is India’s first cross-border neo-bank providing millennials the ability to ‘Send’ & ‘Spend’ conveniently and economically across the globe using HOPRemit (An online remittance portal) and HOP app + Card (A multi-currency bank account).
This position is a crucially important position in the firm and the person hired will have the liberty to drive the product and provide direction in line with business needs.
Website: https://moneyhop.co/
About Individual
Looking for an enthusiastic individual who is passionate about technology and has worked with either a start-up or a blue-chip firm in the past.
The candidate needs to be a multi-tasker, highly self-motivated, self-starter and have the ability to work in a high stress environment. He/she should be tech savvy and willing to embrace new technology comfortably.
Ideally, the candidate should have experience working with the technology stack of scalable, high-growth mobile applications.
General Skills
- 3-4 years of experience in DevOps.
- Bachelor's degree in Computer Science, Information Science, or equivalent practical experience.
- Exposure to Behaviour Driven Development and experience in programming and testing.
- Excellent verbal and written communication skills.
- Good time management and organizational skills.
- Dependability
- Accountability and Ownership
- Right attitude and growth mindset
- Trust-worthiness
- Ability to embrace new technologies
- Ability to get work done
- Should have excellent analytical and troubleshooting skills.
Technical Skills
- Work with developer teams with a focus on automating build and deployment using tools such as Jenkins.
- Implement CI/CD in projects (GitLab CI preferred).
- Enable software build and deploy.
- Handle provisioning for both day-to-day operations and automation using tools such as Ansible and Bash.
- Write, plan, and create infrastructure as code using Terraform.
- Set up monitoring and ITSM automation (incident creation from alerts) using licensed and open-source tools.
- Manage credentials for AWS cloud servers, GitHub repos, Atlassian Cloud services, Jenkins, OpenVPN, and the developers' environment.
- Building environments for unit tests, integration tests, system tests, and acceptance tests using Jenkins.
- Create and spin off resource instances.
- Experience implementing CI/CD.
- Experience with infrastructure automation solutions (Ansible, Chef, Puppet, etc.).
- Experience with AWS.
- Should have expert Linux and Network administration skills to troubleshoot and trace symptoms back to the root cause.
- Knowledge of application clustering / load balancing concepts and technologies.
- Demonstrated ability to think strategically about developing solution strategies, and deliver results.
- Good understanding of cloud-native application design, including cloud application design patterns and practices in AWS.
Day-to-Day requirements
- Work with the developer team to enhance the existing CI/CD pipeline.
- Adopt industry best practices to set up a UAT and prod environment for scalability.
- Manage the AWS resources including IAM users, access control, billing etc.
- Work with the test automation engineer to establish a CI/CD pipeline.
- Work on making environments easy to replicate.
- Enable efficient software deployment.



