
- Understanding of maintaining existing systems (virtual machines) and the Linux stack
- Experience running, operating, and maintaining Kubernetes pods
- Strong scripting skills
- Experience in AWS
- Knowledge of configuring/optimizing open source tools such as Kafka
- Strong automation maintenance skills: ability to identify opportunities to speed up the build and deploy process through strong validation and automation
- Optimizing and standardizing monitoring and alerting
- Experience in Google Cloud Platform
- Experience with or knowledge of Python will be an added advantage
- Experience with tools such as Jenkins, Kubernetes, and Terraform

About EnterpriseMinds
Enterprise Minds, with a core focus on engineering products, automation and intelligence, partners with customers on the trajectory towards increasing outcomes, relevance and growth.
Harnessing the power of Data and the forces that define AI, Machine Learning and Data Science, we believe in institutionalising go-to-market models and not just exploring possibilities.
We believe in a customer-centric ethic without and a people-centric paradigm within. With a strong sense of community, ownership and collaboration, our people work in a spirit of co-creation, co-innovation and co-development to engineer next-generation software products with the help of accelerators.
Through Communities we connect and attract talent that shares skills and expertise. Through Innovation Labs and global design studios we deliver creative solutions.
We create vertical, isolated pods that have a narrow but deep focus. We also create horizontal pods to collaborate and deliver sustainable outcomes.
We follow Agile methodologies to fail fast and deliver scalable and modular solutions. We constantly self-assess and realign to work with each customer in the most impactful manner.
Similar jobs
We are seeking a highly experienced Azure AI, AIOps & MLOps Architect to lead enterprise-scale AI platform engineering, cloud modernization, DevSecOps transformation, and intelligent automation initiatives.
The ideal candidate should possess deep expertise in Microsoft Azure, Azure AI Foundry, Azure OpenAI, Azure Machine Learning, Kubernetes, Terraform, Azure DevOps, and enterprise observability platforms. The role will focus on designing scalable AI platforms, implementing MLOps and AIOps capabilities, enabling Agentic AI architectures, and driving cloud-native engineering practices across the organization.
Key Responsibilities
Cloud Architecture & Engineering
• Design and implement scalable, secure, and highly available solutions on Microsoft Azure.
• Define cloud architecture standards, reference architectures, and best practices.
• Lead cloud migration and modernisation initiatives across enterprise workloads.
• Implement multi-region disaster recovery and business continuity strategies.
• Oversee Azure networking, identity, security, and governance frameworks.
DevOps & CI/CD
• Architect and implement end-to-end CI/CD pipelines using Azure DevOps or GitHub Actions.
• Drive DevSecOps culture — embedding security scanning, quality gates, and compliance into the delivery pipeline.
• Champion Infrastructure-as-Code (IaC) practices using Terraform, Bicep, or ARM templates.
• Establish branching strategies, release management, and environment promotion standards.
• Define and enforce platform engineering standards and internal developer tooling.
AI & Machine Learning Integration
• Architect AI/ML solutions leveraging Azure AI services — Azure OpenAI, Azure Machine Learning, Azure AI Foundry, and Cognitive Services.
• Design intelligent automation and agentic workflows integrated into enterprise DevOps processes.
• Implement AI-powered capabilities such as code review assistance, anomaly detection, predictive analytics, and natural language automation.
• Define AI governance frameworks: model evaluation, prompt management, responsible AI, and cost controls.
• Design and implement enterprise MLOps frameworks.
• Build automated model training, validation, deployment, and monitoring pipelines.
• Establish model governance and lifecycle management.
Generative AI & Agentic AI
• Design enterprise GenAI solutions using Azure OpenAI.
• Build AI Agents using Azure AI Foundry.
• Develop Agent-to-Agent communication patterns.
• Implement Retrieval Augmented Generation (RAG) architectures.
• Build enterprise Knowledge Management and AI Skill Registry platforms.
• Design multi-agent orchestration frameworks.
Leadership & Stakeholder Engagement
• Serve as the technical authority and subject matter expert for Azure AI and DevOps practices.
• Mentor and guide junior architects, developers, and DevOps engineers.
• Collaborate with business stakeholders, product owners, and vendors to translate requirements into technical solutions.
• Produce architecture documentation, decision records (ADRs), and roadmaps.
• Represent the technology function in enterprise architecture forums and governance boards.
Required Qualifications
• Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
• 10+ years of experience in cloud engineering and architecture.
• 5+ years of hands-on experience with Microsoft Azure across compute, networking, storage, identity, and data services.
• Proven experience designing and implementing enterprise-grade CI/CD pipelines.
• Strong hands-on expertise with Infrastructure-as-Code (Terraform, Bicep, or ARM).
• Demonstrated experience architecting and deploying AI/ML solutions on Azure (Azure OpenAI, Azure ML, AI Foundry).
• Deep knowledge of DevSecOps principles, tools, and practices.
• Experience with containerisation and orchestration: Docker, Kubernetes (AKS).
• Proficiency in scripting and development: Python, PowerShell, Bash.
• Excellent communication and stakeholder management skills.
Preferred Qualifications
• Microsoft Certified: Azure Solutions Architect Expert.
• Microsoft Certified: DevOps Engineer Expert.
• Microsoft Certified: Azure AI Engineer Associate.
• Experience with Azure API Management (APIM), Event Grid, and Azure Functions.
• Familiarity with Datadog, Prometheus, or equivalent observability platforms.
• Experience in the real estate, retail, or enterprise industry sector.
• Knowledge of agentic AI frameworks and LLM orchestration patterns (LangChain, Semantic Kernel, MCP).
• Background in building Internal Developer Platforms (IDP).
JOB DETAILS:
- Job Title: Lead DevOps Engineer
- Industry: Ride-hailing
- Experience: 6-9 years
- Working Days: 5 days/week
- Work Mode: ONSITE
- Job Location: Bangalore
- CTC Range: Best in Industry
Required Skills: Cloud & Infrastructure Operations, Kubernetes & Container Orchestration, Monitoring, Reliability & Observability, Proficiency with Terraform, Ansible etc., Strong problem-solving skills with scripting (Python/Go/Shell)
Criteria:
1. Candidate must be from a product-based or scalable app-based start-up company with experience handling large-scale production traffic.
2. Minimum 6 yrs of experience working as a DevOps/Infrastructure Consultant
3. Candidate must have 2 years of experience as a lead (handling a team of at least 3 to 4 members)
4. Own end-to-end infrastructure right from non-prod to prod environment including self-managed DBs
5. Candidate must have first-hand experience in database migration from scratch
6. Must have a firm hold on the container orchestration tool Kubernetes
7. Should have expertise in configuration management tools like Ansible, Terraform, Chef / Puppet
8. Understanding of programming languages like Go/Python and Java
9. Experience working with databases like Mongo/Redis/Cassandra/Elasticsearch/Kafka.
10. Working experience on a cloud platform (AWS)
11. Candidate should have a minimum of 1.5 years' tenure per organization, and a clear reason for relocation.
Description
Job Summary:
As a DevOps Engineer at the company, you will build and operate infrastructure at scale, design and implement a variety of tools that enable product teams to build and deploy their services independently, improve observability across the board, and design for security, resiliency, availability, and stability. If the prospect of ensuring system reliability at scale and exploring cutting-edge technology to solve problems excites you, then this is your fit.
Job Responsibilities:
● Own end-to-end infrastructure right from non-prod to prod environment including self-managed DBs
● Codify our infrastructure
● Do what it takes to keep the uptime above 99.99%
● Understand the bigger picture and sail through the ambiguities
● Scale technology considering cost and observability and manage end-to-end processes
● Understand DevOps philosophy and evangelize the principles across the organization
● Strong communication and collaboration skills to break down the silos
Job Requirements:
● B.Tech. / B.E. degree in Computer Science or equivalent software engineering degree/experience
● Minimum 6 yrs of experience working as a DevOps/Infrastructure Consultant
● Must have a firm hold on the container orchestration tool Kubernetes
● Must have expertise in configuration management tools like Ansible, Terraform, Chef / Puppet
● Strong problem-solving skills, and ability to write scripts using any scripting language
● Understanding of programming languages like Go/Python and Java
● Comfortable working on databases like Mongo/Redis/Cassandra/Elasticsearch/Kafka.
What’s there for you?
Company’s team handles everything – infra, tooling, and self-manages a bunch of databases, such as
● 150+ microservices with event-driven architecture across different tech stacks (Golang/Java/Node)
● More than 100,000 requests per second on our edge gateways
● ~20,000 events per second on self-managed Kafka
● 100s of TB of data on self-managed databases
● 100s of real-time continuous deployments to production
● Self-managed infra supporting
● 100% OSS
Role & Responsibilities
- Develop and deliver automation software to build and improve platform functionality
- Ensure reliability, availability, and manageability of applications and cloud platforms
- Champion adoption of Infrastructure as Code (IaC) practices
- Design and build self-service, self-healing, monitoring, and alerting platforms
- Automate development and testing workflows through CI/CD pipelines (Git, Jenkins, SonarQube, Artifactory, Docker containers)
- Build and manage container hosting platforms using Kubernetes
Requirements
- Strong experience deploying and maintaining GCP cloud infrastructure
- Well-versed in service-oriented and cloud-based architecture design patterns
- Knowledge of cloud services including compute, storage, networking, messaging, and automation tools (e.g., CloudFormation/Terraform equivalents)
- Experience with relational and NoSQL databases (Postgres, Cassandra)
- Hands-on experience with automation/configuration tools (Puppet, Chef, Ansible, Terraform)
Additional Skills
- Strong Linux system administration and troubleshooting skills
- Programming/scripting exposure (Bash, Python, Core Java, or Scala)
- CI/CD pipeline experience (Jenkins, Git, Maven, etc.)
- Experience integrating solutions in multi-region environments
- Familiarity with Agile/Scrum/DevOps methodologies
Please Apply - https://zrec.in/7EYKe?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adopt a DevOps culture by focusing on a long-term DevOps roadmap. We identify the technical and cultural issues that stand in the way of successfully implementing DevOps practices and work with the respective teams to fix them and increase overall productivity. We also run training sessions that show developers why DevOps matters.
We provide these services: DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance.
We assess the technology architecture, security, governance, compliance, and DevOps maturity of technology companies and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their websites and applications. We set up tools for monitoring, logging, and observability, and we focus on bringing a DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer / SRE
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediate
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems. The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability. This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP, Azure), along with scripting skills in Python or Go. The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (Datadog, New Relic, OpenTelemetry).
Strong communication, incident management skills, and a collaborative approach are essential. Experience in team leadership and multi-client engagement is a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as an SRE and DevOps engineer with a proven track record of handling large-scale production environments
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong hands-on experience with managing large-scale production systems
- Strong production troubleshooting skills and the ability to handle high-pressure situations
- Strong experience with databases (PostgreSQL, MongoDB, Elasticsearch, Kafka)
- Worked on making production systems more scalable, highly available and fault-tolerant
- Hands-on experience with ELK or other logging and observability tools
- Hands-on experience with Prometheus, Grafana & Alertmanager and on-call tools like PagerDuty
- Problem-solving mindset
- Strong skills in K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure, etc.
- Good with Python/Go scripting and automation
- Strong with fundamentals like DNS, Networking, Linux
- Experience with APM tools like New Relic, Datadog, OpenTelemetry
- Good experience with Incident Response, Incident Management, Writing detailed RCAs
- Experience with application best practices for making apps more reliable and fault-tolerant
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
Good to have
- Team-leading Experience
- Multiple Client Handling
- Requirements gathering from clients
- Good Communication
Key Responsibilities
- Design and Development:
  - Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
  - Collaborate with product and engineering teams to translate business requirements into technical specifications.
  - Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
  - Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
  - Implement and manage CI/CD pipelines for automated deployment and testing.
  - Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
  - Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
  - Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
  - Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
  - Identify, diagnose, and resolve complex software and infrastructure issues.
  - Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
  - Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
  - Contribute to the continuous improvement of development processes, tools, and methodologies.
  - Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
  - Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
  - Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
  - Serve as a direct point of contact for multiple clients.
  - Handle the unique technical needs and challenges of two or more clients concurrently.
  - Combine direct interaction with clients and internal team coordination.
- Production Systems Management:
  - Bring extensive experience in managing, monitoring, and debugging production environments.
  - Troubleshoot complex issues and ensure that production systems run smoothly with minimal downtime.
We are looking for a "Sr. Software Engineer (DevOps)" for a reputed client in Bangalore (permanent role).
Experience: 4+ Yrs
Responsibilities:
• As part of a team, you will design, develop, and maintain a scalable multi-cloud DevOps blueprint.
• Understand the overall virtualization platform architecture in cloud environments and design best-in-class solutions that fit the SaaS offering and legacy application modernization.
• Continuously improve the CI/CD pipeline, tools, processes, procedures, and systems relating to Developer Productivity.
• Collaborate continuously with the product development teams to implement CI/CD pipelines.
• Contribute to the subject matter on Developer Productivity, DevOps, and Infrastructure Automation best practices.
Mandatory Skills:
• 1+ years of commercial server-side software development experience & 3+ years of commercial DevOps experience.
• Strong scripting skills (Java or Python) are a must.
• Experience with automation tools such as Ansible, Chef, Puppet etc.
• Hands-on experience with CI/CD tools such as GitLab, Jenkins, Nexus, Artifactory, Maven, Gradle
• Hands-on working experience in developing or deploying microservices is a must.
• Hands-on working experience with at least one of the popular cloud infrastructures such as AWS / Azure / GCP / Red Hat OpenStack is a must.
• Knowledge about microservices hosted in leading cloud environments
• Experience with containerizing applications (Docker preferred) is a must
• Hands-on working experience of automating deployment, scaling, and management of containerized applications (Kubernetes) is a must.
• Strong problem-solving, analytical skills and good understanding of the best practices for building, testing, deploying and monitoring software
Desirable Skills:
• Experience working with Secret management services such as HashiCorp Vault is desirable.
• Experience working with Identity and access management services such as Okta, Cognito is desirable.
• Experience with monitoring systems such as Prometheus, Grafana is desirable.
Educational Qualifications and Experience:
• B.E/B.Tech/MCA/M.Tech (Computer science/Information science/Information Technology is a Plus)
• 4 to 6 years of hands-on experience in server-side application development & DevOps
About us:
HappyFox is a software-as-a-service (SaaS) support platform. We offer an enterprise-grade help desk ticketing system and intuitively designed live chat software.
We serve over 12,000 companies in 70+ countries. HappyFox is used by companies that span across education, media, e-commerce, retail, information technology, manufacturing, non-profit, government and many other verticals that have an internal or external support function.
To know more, visit https://www.happyfox.com/
Responsibilities
- Build and scale production infrastructure in AWS for the HappyFox platform and its products.
- Research and build/implement systems, services and tooling to improve the uptime, reliability and maintainability of our backend infrastructure, and to meet our internal SLOs and customer-facing SLAs.
- Implement consistent observability, deployment and IaC setups
- Lead incident management and actively respond to escalations/incidents in the production environment from customers and the support team.
- Hire/Mentor other Infrastructure engineers and review their work to continuously ship improvements to production infrastructure and its tooling.
- Build and manage development infrastructure, and CI/CD pipelines for our teams to ship & test code faster.
- Lead infrastructure security audits
Requirements
- At least 7 years of experience in handling/building Production environments in AWS.
- At least 3 years of programming experience in building API/backend services for customer-facing applications in production.
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts or building infrastructure tools using Python/Ruby/Bash/Golang
- Experience in deploying and managing production Python/NodeJS/Golang applications to AWS EC2, ECS or EKS.
- Experience in security hardening of infrastructure, systems and services.
- Proficient in containerised environments such as Docker, Docker Compose, Kubernetes
- Experience in setting up and managing test/staging environments, and CI/CD pipelines.
- Experience in IaC tools such as Terraform or AWS CDK
- Exposure/Experience in setting up or managing Cloudflare, Qualys and other related tools
- Passion for making systems reliable, maintainable, scalable and secure.
- Excellent verbal and written communication skills to address, escalate and express technical ideas clearly
- Bonus points – Hands-on experience with Nginx, Postgres, Postfix, Redis or Mongo systems.
Job Description:
- Hands on experience with Ansible & Terraform.
- Proficiency in a scripting language such as Python, Bash, or PowerShell is required, along with a willingness to learn and master others.
- Troubleshooting and resolving automation, build, and CI/CD related issues (in cloud environments like AWS or Azure).
- Experience with Kubernetes is mandatory.
- Develop and maintain tooling for test and production environments.
- Assist team members in the development and maintenance of tooling for integration testing, performance testing, security testing, as well as source control systems (that includes working in CI systems like Azure DevOps, TeamCity, and orchestration tools like Octopus).
- Comfortable working in a Linux environment.
DevOps Engineer responsibilities include deploying product updates, identifying production issues, and implementing integrations that meet customer needs. If you have a solid background in working with cloud technologies, have set up efficient deployment processes, and are motivated to work with diverse and talented teams, we’d like to meet you.
Ultimately, you will execute and automate operational processes fast, accurately, and securely.
Skills and Experience
- 4+ years of experience building infrastructure with cloud providers (AWS, Azure, GCP)
- Experience in deploying containerized applications built on NodeJS/PHP/Python to Kubernetes clusters.
- Experience in monitoring production workloads with relevant metrics and dashboards.
- Experience in writing automation scripts using Shell, Python, Terraform, etc.
- Experience in following security practices while setting up the infrastructure.
- Self-motivated, able, and willing to help where help is needed
- Able to build relationships, with cultural sensitivity, goal alignment, and learning agility
Roles and Responsibilities
- Manage various resources across different cloud providers (Azure, AWS, and GCP).
- Monitor and optimize infrastructure cost.
- Manage various Kubernetes clusters with appropriate monitoring and alerting setup.
- Build CI/CD pipelines to orchestrate provisioning and deployment of various services into Kubernetes infrastructure.
- Work closely with the development team on upcoming features to determine the correct infrastructure and related tools.
- Assist the support team with escalated customer issues.
- Develop, improve, and thoroughly document operational practices and procedures.
- Responsible for setting up good security practices across various clouds.