11+ Icinga Jobs in Delhi, NCR and Gurgaon | Icinga Job openings in Delhi, NCR and Gurgaon
Apply to 11+ Icinga Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest Icinga Job opportunities across top companies like Google, Amazon & Adobe.
Troubleshooting and problem-solving skills.
Hands-on experience with hyper-converged infrastructure and virtualization technologies such as VMware, RHEV, and Nutanix.
Experience with monitoring tools such as Nagios and Icinga.
Knowledge of backup technologies such as Commvault.
Hands-on experience with storage systems such as SAN/NAS, NetBackup, and Dell EMC.
Knowledge of CIS security benchmarks.
Expertise in UNIX and Shell/Bash scripting.
Role - IT Cloud Engineer / DevOps
- Proficient in Linux.
- Hands on experience with AWS cloud or Google Cloud.
- Knowledge of container technology like Docker.
- Expertise in scripting languages. (Shell scripting or Python scripting)
- Working knowledge of the LAMP/LEMP stack, networking, and version control systems like GitLab or GitHub.
Job Description:
The incumbent would be responsible for:
- Deployment of various infrastructures on Cloud platforms like AWS, GCP, Azure, OVH etc.
- Server monitoring, analysis and troubleshooting.
- Deploying multi-tier architectures using microservices.
- Integration of container technologies like Docker and Kubernetes as per application requirements.
- Automating workflows with Python or shell scripting.
- CI and CD integration for application lifecycle management.
- Hosting and managing websites on Linux machines.
- Frontend, backend and database optimization.
- Protecting operations by keeping information confidential.
- Providing information by collecting, analyzing, and summarizing development and service issues.
- Preparing and installing solutions by determining and designing system specifications, standards, and programming.
DevOps Engineer (Cloud & Infrastructure)
📍 Noida | 🕐 Full-Time | 🧭 Experience: 2–4 years
About TestMu AI
TestMu AI (formerly LambdaTest) is an AI-native platform designed to move software testing beyond simple automation into the era of agentic intelligence. It provides end-to-end AI agents that manage the entire Quality Engineering lifecycle.
- Full-Stack AI Agents: Autonomously plan, author, execute, and analyze tests across the SDLC.
- Comprehensive Coverage: Supports web, mobile, and enterprise applications.
- Real-World Testing: Scale execution across real devices, browsers, and custom environments.
About the Role
This isn't a role for someone who just wants to "maintain" systems. As a DevOps Engineer at TestMu AI, you are the architect of the automated highways that power our AI agents. You will step into a fast-paced environment where you bridge the gap between cloud-native automation and core infrastructure.
You will manage complex CI/CD pipelines, troubleshoot deep-seated Linux issues, and ensure our hybrid-cloud environment (AWS/Azure) is as resilient as the code it runs.
Key Responsibilities: The Pillars of Growth
A. DevOps & Automation (50% Focus)
- Platform Orchestration: Lead the migration to modular, self-healing Terraform and Helm templates.
- Agentic CI/CD: Architect GitHub Actions workflows that treat AI agents as first-class citizens, automating environment promotion and risk scoring.
- Kubernetes Mastery: Advanced management of Docker and K8s clusters to support scalable production workloads.
- Predictive Observability: Use Prometheus, Grafana, and ELK to move from reactive alerts to autonomous anomaly detection.
B. Networking & Data Center Mastery (30% Focus)
- Hybrid Networking: Design and troubleshoot VPCs and subnets in Azure/AWS/GCP, paired with physical VLANs and switches in our data centers.
- Bare-Metal Lifecycle: Automate hardware provisioning, RAID setup, and firmware updates for our real-device cloud.
- Remote Admin: Master out-of-band management (iDRAC, iLO, IPMI) to ensure 100% remote operational capability.
- Core Protocols: Own the lifecycle of DNS, DHCP, Load Balancing, and IPAM across distributed environments.
C. Development & Scripting (20% Focus)
- Backend Integration: Debug and optimize Python or Go code; understand how logic interacts with system-level resources.
- Advanced Scripting: Write idempotent Bash/Python scripts to automate complex, multi-server operational tasks.
- Agentic Tooling: Support the integration of LLM-based developer tools into DevOps workflows to eliminate "toil".
The Interview Journey
We value your ability to solve problems under pressure more than your ability to memorize documentation.
- Technical Round 1 (DevOps Leads): A live session focused on real-world debugging scenarios and Linux fundamentals.
- Technical Round 2 (Hiring Manager / Pod Lead): An assessment of your architectural thinking, automation strategy, and team alignment.
- Technical Round 3 (SVP Engineering / VP DevOps): Strategic discussion on scalability, infrastructure vision, and technical leadership.
- Final Round (CEO): Mission alignment, cultural fit, and the "big picture" at TestMu AI.
Growth Timeline
This is a high-visibility role. You will receive direct mentorship from our senior engineering leadership. As you master our production environment, you will have a clear path to move into Senior DevOps Engineer or Infrastructure Architect roles as our pods scale.
Perks That Matter
Health Cover: Comprehensive insurance for you and your family.
Fresh Meals: Daily catered meals at the office.
Transport: Safe cab facilities for eligible shifts.
Pod Budgets: Dedicated engagement budgets for team building and offsites.
About the Role
We are looking for a Senior DevOps Engineer to lead the design, automation, and scaling of our hybrid cloud infrastructure spanning public cloud and private/on-premises environments. You will partner closely with software engineering, security, and product teams to build reliable, secure, and high-performance systems that support rapid product delivery. This is a hands-on role with significant influence over our infrastructure strategy, deployment workflows, and engineering culture.
Key Responsibilities
- Architect, deploy, and maintain scalable, highly available infrastructure across both public cloud (AWS, Azure, GCP) and private cloud platforms (OpenStack, VMware vSphere/Tanzu, Nutanix, or similar).
- Operate and maintain on-premises infrastructure: hypervisors, compute, storage (Ceph, NetApp, SAN/NAS), networking (SDN, VLANs, BGP, MPLS), and hardware capacity planning, alongside their public cloud equivalents.
- Design and own CI/CD pipelines that deploy seamlessly across public and private environments.
- Implement and manage Infrastructure as Code (Terraform, Ansible, Pulumi) with strong version control and review practices, using providers for both public and private cloud platforms.
- Manage container orchestration (Kubernetes, ECS, OpenShift, Rancher) across managed cloud services and self-managed/bare-metal clusters, including upgrades, autoscaling, and workload reliability.
- Build observability into all systems through logging, metrics, tracing, and alerting (Prometheus, Grafana, Datadog, ELK, or similar) with unified visibility across hybrid environments.
- Champion security best practices: secrets management, IAM hardening, network segmentation, vulnerability scanning, and compliance (SOC 2, ISO 27001, HIPAA, or data-sovereignty requirements).
- Lead incident response, root-cause analysis, and post-mortems; drive long-term reliability improvements and SLO/SLA adherence.
- Optimize cost, capacity, and resource utilization across public cloud spend and on-premises hardware without compromising performance or availability.
- Partner with data center operations and network providers on hardware provisioning, firmware management, MPLS circuit management, and lifecycle planning.
- Mentor junior DevOps and software engineers; promote DevOps culture, automation-first thinking, and shared ownership of production.
- Evaluate and introduce new tools, platforms, and processes that improve developer productivity and system reliability.
Required Qualifications
- 5+ years of experience in DevOps, SRE, or Platform Engineering roles, with at least 2 years at a senior level.
- Deep expertise with at least one major public cloud provider (AWS, Azure, or GCP) in production.
- Hands-on experience operating private cloud or virtualization platforms (OpenStack, VMware, Nutanix, or equivalent) in production.
- Strong experience with virtualization, storage systems, and enterprise networking in on-premises environments.
- Strong hands-on experience with Kubernetes in production, including both managed cloud and self-managed/bare-metal clusters.
- Proficiency in Infrastructure as Code (Terraform and Ansible strongly preferred).
- Solid scripting and programming skills in Python, Go, Bash, or similar.
- Experience designing and operating CI/CD pipelines using tools such as GitHub Actions, GitLab CI, Jenkins, CircleCI, or ArgoCD.
- Strong Linux systems administration and networking fundamentals (TCP/IP, DNS, load balancing, VPNs, firewalls, routing, MPLS).
- Experience with monitoring and observability stacks (Prometheus, Grafana, Datadog, New Relic, ELK, or OpenTelemetry).
- Proven track record of leading incident response and improving system reliability.
- Excellent communication skills and the ability to collaborate across engineering, security, infrastructure, and product teams.
Preferred Qualifications
- Experience designing hybrid and multi-cloud architectures, including secure connectivity (Direct Connect, ExpressRoute, MPLS, VPN, SD-WAN) between public and private environments.
- Familiarity with service meshes (Istio, Linkerd), API gateways, and GitOps workflows (ArgoCD, Flux).
- Background in security-focused or regulated environments and exposure to compliance frameworks.
- Experience with database administration (PostgreSQL, MySQL, Redis, MongoDB) in cloud-managed and self-hosted setups.
- Contributions to open-source DevOps or cloud infrastructure tooling.
- Relevant certifications (AWS Solutions Architect / DevOps Engineer, Azure Administrator, CKA, CKAD, RHCE, VMware VCP, OpenStack Certified Administrator, HashiCorp Terraform Associate).

Global Digital Transformation Solutions Provider
JOB DETAILS:
* Job Title: Specialist I - DevOps Engineering
* Industry: Global Digital Transformation Solutions Provider
* Salary: Best in Industry
* Experience: 7-10 years
* Location: Bengaluru (Bangalore), Chennai, Hyderabad, Kochi (Cochin), Noida, Pune, Thiruvananthapuram
Job Description
Job Summary:
As a DevOps Engineer focused on Perforce to GitHub migration, you will be responsible for executing seamless and large-scale source control migrations. You must be proficient with GitHub Enterprise and Perforce, possess strong scripting skills (Python/Shell), and have a deep understanding of version control concepts.
The ideal candidate is a self-starter, a problem-solver, and thrives on challenges while ensuring smooth transitions with minimal disruption to development workflows.
Key Responsibilities:
- Analyze and prepare Perforce repositories — clean workspaces, merge streams, and remove unnecessary files.
- Handle large files efficiently using Git Large File Storage (LFS) for files exceeding GitHub’s 100MB size limit.
- Use git-p4 fusion (Python-based tool) to clone and migrate Perforce repositories incrementally, ensuring data integrity.
- Define migration scope — determine how much history to migrate and plan the repository structure.
- Manage branch renaming and repository organization for optimized post-migration workflows.
- Collaborate with development teams to determine migration points and finalize migration strategies.
- Troubleshoot issues related to file sizes, Python compatibility, network connectivity, or permissions during migration.
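As an illustration of the large-file handling mentioned in the responsibilities above, here is a minimal Python sketch. It assumes a local checkout of the repository being migrated. The function names `find_large_files` and `lfs_patterns` are hypothetical, and reading the 100MB limit as 100 MiB is an assumption made for the example, not something the posting specifies.

```python
import os

# Assumption: the 100MB limit named in the posting is treated as 100 MiB here.
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024


def find_large_files(root, limit_bytes=GITHUB_FILE_LIMIT_BYTES):
    """Return sorted (relative_path, size_in_bytes) pairs for files above the limit."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory so object packs are not reported.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symlinks; only regular files are measured
            size = os.path.getsize(path)
            if size > limit_bytes:
                large.append((os.path.relpath(path, root), size))
    return sorted(large)


def lfs_patterns(large_files):
    """Derive candidate '*.ext' patterns from the extensions of the large files."""
    extensions = {os.path.splitext(path)[1] for path, _ in large_files}
    return sorted("*" + ext for ext in extensions if ext)
```

Each pattern returned by `lfs_patterns` could then be passed to `git lfs track` before committing. Whether to track by extension or by individual path is a judgment call that depends on the repository.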
Required Qualifications:
- Strong knowledge of Git/GitHub and preferably Perforce (Helix Core) — understanding of differences, workflows, and integrations.
- Hands-on experience with P4-Fusion.
- Familiarity with cloud platforms (AWS, Azure) and containerization technologies (Docker, Kubernetes).
- Proficiency in migration tools such as git-p4 fusion — installation, configuration, and troubleshooting.
- Ability to identify and manage large files using Git LFS to meet GitHub repository size limits.
- Strong scripting skills in Python and Shell for automating migration and restructuring tasks.
- Experience in planning and executing source control migrations — defining scope, branch mapping, history retention, and permission translation.
- Familiarity with CI/CD pipeline integration to validate workflows post-migration.
- Understanding of source code management (SCM) best practices, including version history and repository organization in GitHub.
- Excellent communication and collaboration skills for cross-team coordination and migration planning.
- Proven practical experience in repository migration, large file management, and history preservation during Perforce to GitHub transitions.
Skills: Github, Kubernetes, Perforce, Perforce (Helix Core), Devops Tools
Must-Haves
Git/GitHub (advanced), Perforce (Helix Core) (advanced), Python/Shell scripting (strong), P4-Fusion (hands-on experience), Git LFS (proficient)
About Us - Celebal Technologies is a premier software services company in the field of Data Science, Big Data, and Enterprise Cloud. Celebal Technologies helps you discover competitive advantage by employing intelligent data solutions, built on cutting-edge technology, that can bring massive value to your organization. The core offerings are around "Data to Intelligence", wherein we leverage data to extract intelligence and patterns, thereby facilitating smarter and quicker decision-making for clients. Understanding the core value of modern analytics across the enterprise, Celebal Technologies helps businesses improve their business intelligence and become more data-driven in architecting solutions.
Key Responsibilities
• As a part of the DevOps team, you will be responsible for configuration, optimization, documentation, and support of the CI/CD components.
• Creating and managing build and release pipelines with Azure DevOps and Jenkins.
• Assist in planning and reviewing application architecture and design to promote an efficient deployment process.
• Troubleshoot server performance issues & handle the continuous integration system.
• Automate infrastructure provisioning using ARM Templates and Terraform.
• Monitor and support deployments across cloud-based and on-premises infrastructure.
• Diagnose and develop root cause solutions for failures and performance issues in the production environment.
• Deploy and manage Infrastructure for production applications
• Configure security best practices for application and infrastructure
Essential Requirements
• Good hands-on experience with cloud platforms like Azure, AWS & GCP. (Preferably Azure)
• Strong knowledge of CI/CD principles.
• Strong work experience with CI/CD implementation tools like Azure DevOps, TeamCity, Octopus Deploy, AWS CodeDeploy, and Jenkins.
• Experience writing automation scripts with PowerShell, Bash, Python, etc.
• Experience with GitHub, JIRA, Confluence, and continuous integration (CI) systems.
• Understanding of secure DevOps practices
Good to Have -
• Knowledge of scripting languages such as PowerShell, Bash
• Experience with project management and workflow tools such as Agile, Jira, Scrum/Kanban, etc.
• Experience with build technologies and cloud services (Jenkins, TeamCity, Azure DevOps, Bamboo, AWS CodeDeploy).
• Strong communication skills and the ability to explain protocols and processes to the team and management.
• Must be able to handle multiple tasks and adapt to a constantly changing environment.
• Must have a good understanding of SDLC.
• Knowledge of Linux, Windows server, Monitoring tools, and Shell scripting.
• Self-motivated, with a demonstrated ability to deliver across technologies with minimal supervision.
• Organized, flexible, and analytical, with the ability to solve problems creatively.
A.P.T Portfolio is a high-frequency trading firm that specialises in Quantitative Trading & Investment Strategies. Founded in November 2009, it has been a major liquidity provider in global stock markets.
As a manager, you would be in charge of managing the DevOps team, and your remit shall include the following:
- Private Cloud - Design & maintain a high performance and reliable network architecture to support HPC applications
- Scheduling Tool - Implement and maintain an HPC scheduling technology like Kubernetes, Hadoop YARN, Mesos, HTCondor, or Nomad for processing and scheduling analytical jobs. Implement controls which allow analytical jobs to seamlessly utilize ideal capacity on the private cloud.
- Security - Implement security best practices and a data isolation policy between different internal divisions.
- Capacity Sizing - Monitor private cloud usage and share details with different teams. Plan capacity enhancements on a quarterly basis.
- Storage solution - Optimize storage solutions like NetApp, EMC, Quobyte for analytical jobs. Monitor their performance on a daily basis to identify issues early.
- NFS - Implement and optimize latest version of NFS for our use case.
- Public Cloud - Drive AWS/Google-Cloud utilization in the firm for increasing efficiency, improving collaboration and for reducing cost. Maintain the environment for our existing use cases. Further explore potential areas of using public cloud within the firm.
- Backups - Identify and automate backups of all crucial data, binaries, code, etc. in a secure manner, at the frequency warranted by the use case. Ensure that recovery from backup is tested and seamless.
- Access Control - Maintain passwordless access control and improve security over time. Minimize failures of automated jobs due to unsuccessful logins.
- Operating System - Plan, test, and roll out new operating systems for all production, simulation, and desktop environments. Work closely with developers to highlight the performance enhancements and capabilities of new versions.
- Configuration Management - Work closely with the DevOps/development team to freeze configurations and playbooks for various teams and internal applications. Deploy and maintain standard tools such as Ansible, Puppet, Chef, etc. for the same.
- Data Storage & Security Planning - Maintain tight control of root access on various devices. Ensure root access is rolled back as soon as the desired objective is achieved.
- Audit access logs on devices. Use third party tools to put in a monitoring mechanism for early detection of any suspicious activity.
- Maintaining all third-party tools used for development and collaboration - This shall include maintaining a fault-tolerant environment for Git/Perforce, productivity tools such as Slack/Microsoft Teams, build tools like Jenkins/Bamboo, etc.
Qualifications
- Bachelor's or Master's degree, preferably in CSE/IT
- 10+ years of relevant experience in sys-admin function
- Must have strong knowledge of IT Infrastructure, Linux, Networking and grid.
- Must have strong grasp of automation & Data management tools.
- Proficient in scripting languages and Python
Desirables
- Professional attitude, co-operative and mature approach to work, must be focused, structured and well considered, troubleshooting skills.
- Exhibit a high level of individual initiative and ownership, effectively collaborate with other team members.
APT Portfolio is an equal opportunity employer
Our Client is an IT infrastructure services company, focused and specialized in delivering solutions and services on Microsoft products and technologies. They are a Microsoft partner and cloud solution provider. Our Client's objective is to help small, mid-sized as well as global enterprises to transform their business by using innovation in IT, adapting to the latest technologies and using IT as an enabler for business to meet business goals and continuous growth.
With focused and experienced management and a strong team of IT Infrastructure professionals, they are adding value by making IT Infrastructure a robust, agile, secure and cost-effective service to the business. As an independent IT Infrastructure company, they provide their clients with unbiased advice on how to successfully implement and manage technology to complement their business requirements.
- Working closely with other engineers and administrators
- Gaining in-depth knowledge of how best to customize the services available on various cloud platforms to help us become more secure and efficient.
- Assessing client requirements and coming up with costing for the sales team
- Planning and designing client infrastructure on Microsoft Azure and AWS
- Setting up alerts and monitoring the health of cloud resources
- Handling the day-to-day management of clients’ cloud-based solutions
- Implementing security and protecting identities
- Diagnosing and troubleshooting technical issues relating to Microsoft Azure and AWS
- Helping customers successfully deploy and implement cloud computing solutions
- Resolving technical support tickets via telephone, chat, email and sometimes in-person
- Keeping self and team updated with new cloud services offerings from Microsoft, Amazon & Google
- Staying current with industry trends, making recommendations as needed to help the company excel
What you need to have:
- Experience in cloud-based tech
- Excellent written and verbal communication and negotiation skills
- Should have working knowledge of Microsoft Azure Calculator and AWS Calculator
- A clear understanding of core Cloud Computing services
- Knowledge of various compute services on Microsoft Azure and AWS
- Knowledge of various storage services on Microsoft Azure and AWS
- Knowledge of log collecting services available with Microsoft Azure and AWS
- Experience of working with popular operating systems such as Linux & Windows
- Experience of computer networks
- Experience of computer technologies like Active Directory, network protocols & subnetting
- Experience in automating day to day tasks using PowerShell scripting
- Confidence in own abilities
- Knowledgeable within this subject area and a thought leader
- Fast assimilator of information
- Imaginative problem solver
- Structured organizer
- Strong relationship building skills
- Strong analytical & numeracy skills
- Ability to use initiative and work under pressure, prioritizing to meet deadlines
- Driven, leading on initiatives, being committed to the role, and delivering on objectives and deadlines
- Service Orientation, demonstrable commitment to customer service
Technical Experience/Knowledge Needed :
- Cloud-hosted services environment.
- Proven ability to work in a Cloud-based environment.
- Ability to manage and maintain Cloud Infrastructure on AWS
- Must have strong experience in technologies such as Docker, Kubernetes, Functions, etc.
- Knowledge of orchestration tools such as Ansible
- Experience with ELK Stack
- Strong knowledge in Micro Services, Container-based architecture and the corresponding deployment tools and techniques.
- Hands-on knowledge of implementing multi-staged CI / CD with tools like Jenkins and Git.
- Sound knowledge on tools like Kibana, Kafka, Grafana, Instana and so on.
- Proficient in Bash scripting.
- Must have in-depth knowledge of Clustering, Load Balancing, High Availability and Disaster Recovery, Auto Scaling, etc.
- AWS Certified Solutions Architect and/or Linux System Administrator
- Strong ability to work independently on complex issues
- Collaborate efficiently with internal experts to resolve customer issues quickly
- No objection to working night shifts, as the production support team works on a 24x7 basis. Rotational shifts will be assigned weekly so that candidates get an equal opportunity to work day and night shifts. If you find candidates willing to work the night shift only on a need basis, discuss with us.
- Early Joining
- Willingness to work in Delhi NCR




