11+ Clustering Jobs in Chennai | Clustering Job openings in Chennai
Intuitive is the fastest-growing top-tier Cloud Solutions and Services company, supporting global enterprise customers across the Americas, Europe, and the Middle East.
Intuitive is looking for highly talented, hands-on Cloud Infrastructure Architects to help accelerate our growing Professional Services consulting Cloud & DevOps practice. This is an excellent opportunity to join Intuitive’s global, world-class technology teams, working with some of the best and brightest engineers while developing your skills and furthering your career with some of our largest customers.
Job Description :
- Extensive experience with Kubernetes (EKS/GKE) and Kubernetes ecosystem tooling, e.g., Prometheus, ArgoCD, Grafana, Istio
- Extensive AWS/GCP Core Infrastructure skills
- Infrastructure/IaC automation and integration using Terraform
- Kubernetes resources engineering and management
- Experience with DevOps tools, CICD pipelines and release management
- Good at creating documentation (runbooks, design documents, implementation plans)
Linux Experience :
- Namespace
- Virtualization
- Containers
Networking Experience
- Virtual networking
- Overlay networks
- VXLAN, GRE
Kubernetes Experience :
Should have experience bringing up a Kubernetes cluster manually, without using the kubeadm tool.
Observability
Experience in observability is a plus
Cloud automation :
Familiarity with cloud platforms (specifically AWS) and DevOps tools such as Jenkins and Terraform.
Job Title: AWS DevOps Engineer
Experience Level: 5+ Years
Location: Bangalore, Pune, Hyderabad, Chennai and Gurgaon
Summary:
We are looking for a hands-on Platform Engineer with strong execution skills to provision and manage cloud infrastructure. The ideal candidate will have experience with Linux, AWS services, Kubernetes, and Terraform, and should be capable of troubleshooting complex issues in cloud and container environments.
Key Responsibilities:
- Provision AWS infrastructure using Terraform (IaC).
- Manage and troubleshoot container clusters (Kubernetes on EKS, and ECS).
- Work with core AWS services: VPC, EC2, S3, RDS, Lambda, ALB, WAF, and CloudFront.
- Support CI/CD pipelines using Jenkins and GitHub.
- Collaborate with teams to resolve infrastructure and deployment issues.
- Maintain documentation of infrastructure and operational procedures.
Required Skills:
- 3+ years of hands-on experience in AWS infrastructure provisioning using Terraform.
- Strong Linux administration and troubleshooting skills.
- Experience managing Kubernetes clusters.
- Basic experience with CI/CD tools like Jenkins and GitHub.
- Good communication skills and a positive, team-oriented attitude.
Preferred:
- AWS Certification (e.g., Solutions Architect, DevOps Engineer).
- Exposure to Agile and DevOps practices.
- Experience with monitoring and logging tools.
Why this role exists
Our infrastructure footprint is growing faster than our headcount, and we believe most of that
gap should be closed by automation and AI agents — not by hiring more humans to do toil. We
need someone early in their career who treats manual work as a bug, ships scripts and agents
instead of tickets, and wants to grow into deeper ownership over the next two years.
You will not be the most senior person on the team. You will be the one who multiplies the team.
What you'll own
In your first month
• Take ownership of one slice of our CI/CD pipeline and make it measurably
faster, more reliable, or cheaper. We expect a number on a dashboard to move.
• Build at least three internal automations that replace manual ops toil —
using AI agents (Claude Code, agentic CLIs, scripted LLM workflows) as your force
multiplier.
• Be the first responder for a defined set of alerts. Write the runbooks. Drive
the alert volume down.
• Support senior engineers on AI/ML infrastructure (GPU nodes, inference
services, model deployment) — observe, document, and gradually take on contained
changes under review.
By 3 months you should be
• The go-to person for at least two production systems.
• Shipping routine infrastructure changes without needing senior review.
• Treating "manual" as a code smell.
Required (we will reject without these)
• 0–3 years hands-on experience with one major cloud (AWS, GCP, or
Azure — one is fine, depth beats breadth).
• Fluent in Linux command line, bash, and at least one scripting language
(Python or Go preferred).
• Have shipped something to production that real users hit. A side project
counts; a graded coursework lab does not.
• Comfortable with Docker — you can explain what an image vs. a
container is and why it matters.
• Working knowledge of networking fundamentals: DNS, HTTP/HTTPS,
TLS, ports, basic subnets — enough to debug "it works on my machine."
• Git fluency: branches, merges, rebases, conflict resolution.
• CI/CD pipelines — you have authored or substantially modified pipelines
in GitHub Actions, GitLab CI, ArgoCD, Jenkins, or similar. Not just "I clicked Re-run."
• Kubernetes basics — kubectl for real work, can read pod logs,
understand deployments and services, can debug a CrashLoopBackOff without
panicking. You do not need to have run a cluster; you do need to have lived inside one.
• Active user of AI coding agents (Claude Code, Cursor, Copilot, agentic
CLIs, etc.). You should be able to walk us through specific tasks where they made you
faster, and specific tasks where they failed you and how you noticed. "I have tried it" is
not enough.
Bonus (real plus, not required)
• Infrastructure as Code: Terraform, Pulumi, or Ansible.
• Observability: Prometheus/Grafana, Datadog, OpenTelemetry, any APM.
• Have built or extended an LLM-based agent — a custom MCP server, a
scripted multi-step workflow, an internal tool that calls models in a loop. Anything beyond
chat-with-Claude.
• Exposure to GPU workloads, model serving (vLLM, Triton, TGI, etc.), or
ML pipelines.
What we don't care about
• Whether your degree is in CS — or whether you have a degree at all.
• Brand-name companies on your resume.
• Certifications. They are fine. They do not substitute for having shipped.
How we work
• We default to automation. If you do something manually twice, the third
time you script it or hand it to an agent.
• AI agents are part of the workflow, not a novelty. Expect interview
questions about exactly how you use them — and where you have caught them being
wrong.
• Small, reversible changes beat big-bang rollouts.
• Postmortems are blameless and written down.
• We push back on each other. If you only execute, you will be unhappy
here.
How to apply
Send:
• Your resume.
• A short note (≤200 words) describing one infra or automation problem you
solved, and how AI agents factored in — or did not, and why. We read these. Generic
notes get rejected.
Internal note — delete before posting externally
• Comp band, location policy, team name, and reporting line marked
[CONFIRM] need to be filled in before this goes external.
• The Required list is intentionally tight: CI/CD and Kubernetes basics
promoted from bonus. Expect this to filter ~80% of typical junior DevOps applicants. The
remaining pool will skew toward people who have actually shipped infra at a startup, not
bootcamp grads or pure cloud-cert holders.
• IaC, observability, agent-building, and GPU/ML serving stay as bonus.
Promoting any of these to required at 0–3 yrs collapses the pool to near-zero or forces
hiring senior people at junior comp. If you want IaC required, re-level this to mid (3–5
yrs) and raise the band.
• Screening implication: the resume screen should explicitly check for
CI/CD pipeline authorship and any K8s-touching production work. If neither is on the
resume, reject at screen. Do not waste interview slots.
• Pipeline watch: if fewer than ~15 qualified resumes after 2 weeks of
active sourcing, the first thing to relax is the AI-agent-fluency bar (move to bonus and
screen for it in interview instead). Do not relax the "shipped to production" requirement
— that is the load-bearing filter.
We are looking for a passionate DevOps Engineer who can support deployments and monitor the performance of our Production, QE, and Staging environments. Applicants should have a strong understanding of UNIX internals and be able to clearly articulate how they work. Knowledge of shell scripting and security is a must. Any experience with infrastructure as code is a big plus. The key responsibility of the role is to manage deployments, security, and support of business solutions. Experience with technologies such as Postgres, ELK, NodeJS, NextJS, and Ruby on Rails is a huge plus. At VakilSearch, experience doesn't matter; passion to produce change matters.
Responsibilities and Accountabilities:
- As part of the DevOps team, you will be responsible for configuration, optimization, documentation, and support of the infra components of VakilSearch’s product, which are hosted in cloud services and on-prem facilities
- Design and build tools and frameworks that support deploying and managing our platform, and explore new tools, technologies, and processes to improve speed, efficiency, and scalability
- Support and troubleshoot scalability, high availability, performance, monitoring, backup, and restore of different environments
- Manage resources in a cost-effective, innovative manner, including assisting subordinates in the effective use of resources and tools
- Resolve incidents as escalated from Monitoring tools and Business Development Team
- Implement and follow security guidelines, both policy and technology to protect our data
- Identify root causes of issues, develop long-term solutions to fix recurring issues, and document them
- Able to perform production operation activities, including at night if required
- Ability to automate recurring tasks with scripts to increase velocity and quality
- Ability to manage and deliver multiple project phases at the same time
I Qualification(s):
- Experience in working with Linux Server, DevOps tools, and Orchestration tools
- Linux, AWS, GCP, Azure, CompTIA+, and any other certifications are a value-add
II Experience Required in DevOps Aspects:
- Length of Experience: Minimum 1-4 years of experience
- Nature of Experience:
- Experience in cloud deployments, Linux administration [kernel tuning is a value add], Linux clustering, AWS, virtualization, and networking concepts [Azure and GCP are a value add]
- Experience with CI/CD deployment solutions like Jenkins and GitHub Actions [release management is a value add]
- Hands-on experience with any of the configuration management/IaC tools like Chef, Terraform, and CloudFormation [Ansible and Puppet are a value add]
- Administering, configuring, and using monitoring and alerting tools like Prometheus, Grafana, Loki, ELK, Zabbix, Datadog, etc.
- Experience with containerization and orchestration tools like Docker and Kubernetes [Docker Swarm is a value add]
- Good scripting skills in at least one interpreted language: Shell/bash scripting or Ruby/Python/Perl
- Experience in Database applications like PostgreSQL, MongoDB & MySQL [DataOps]
- Proficient with version control and source code management systems like Git and GitHub
- Experience with serverless platforms [AWS Lambda/Google Cloud Functions/Azure Functions]
- Experience with web servers such as Nginx and Apache
- Knowledge of Redis, RabbitMQ, ELK, and REST APIs [MLOps tools are a value add]
- Knowledge of Puma, Unicorn, Gunicorn, and Yarn
- Hands-on experience with VMware ESXi/XenCenter deployments is a value add
- Experience in implementing and troubleshooting TCP/IP networks, VPNs, load balancing, and web application firewalls
- Deploying, configuring, and maintaining Linux server systems on premises and off premises
- Experience with code quality tools like SonarQube is a value-add
- Experience with test automation tools like Selenium, JMeter, and JUnit is a value-add
- Experience with Heroku and OpenStack is a value-add
- Experience in identifying inbound and outbound threats and resolving them
- Knowledge of CVEs and applying patches for the OS, Ruby gems, Node, and Python packages
- Documenting security fixes for future use
- Establish cross-team collaboration with security built into the software development lifecycle
- Forensics and Root Cause Analysis skills are mandatory
- Weekly Sanity Checks of the on-prem and off-prem environment
III Skill Set & Personality Traits required:
- An understanding of programming languages and frameworks such as Ruby, NodeJS, ReactJS, Perl, Java, Python, and PHP
- Good written and verbal communication skills to facilitate efficient and effective interaction with peers, partners, vendors, and customers
IV Age Group: 21 – 36 Years
V Cost to the Company: As per industry standards
- Configure, optimize, document, and support the infrastructure components of software products (which are hosted in colocated facilities and cloud services such as AWS)
- Design and build tools and frameworks that support the deployment and management of platforms
- Design, build, and deliver cloud computing solutions, hosted services, and underlying software infrastructures
- Build core functionality of our cloud-based platform product, deliver secure, reliable services and construct third party integrations
- Assist in coaching application developers on proper DevOps techniques for building scalable applications in the microservices paradigm
- Foster collaboration with software product development and architecture teams to ensure releases are delivered with repeatable and auditable processes
- Support and troubleshoot scalability, high availability, performance, monitoring, backup, and restores of different environments
- Work independently across multiple platforms and applications to understand dependencies
- Evaluate new tools, technologies, and processes to improve speed, efficiency, and scalability of continuous integration environments
- Design and architect solutions for existing client-facing applications as they are moved into cloud environments such as AWS
Competencies
- Full understanding of scripting and automated process management in languages such as Shell, Ruby, and/or Python
- Working knowledge of SCM tools such as Git, GitHub, Bitbucket, etc.
- Working knowledge of Amazon Web Services and related APIs
- Ability to deliver and manage web or cloud-based services
- General familiarity with monitoring tools
- General familiarity with configuration/provisioning tools such as Terraform
Experience
- Experience working within an Agile type environment
- 4+ years of experience with cloud-based provisioning (Azure, AWS, Google), monitoring, troubleshooting, and related DevOps technologies
- 4+ years of experience with containerization/orchestration technologies like Rancher, Docker and Kubernetes
DESIRED SKILLS AND EXPERIENCE
Strong analytical and problem-solving skills
Ability to work independently, learn quickly and be proactive
3-5 years overall and at least 1-2 years of hands-on experience in designing and managing DevOps Cloud infrastructure
Experience must include a combination of:
o Experience working with configuration management tools – Ansible, Chef, Puppet, SaltStack (expertise in at least one tool is a must)
o Ability to write and maintain code in at least one scripting language (Python preferred)
o Practical knowledge of shell scripting
o Cloud knowledge – AWS, VMware vSphere
o Good understanding and familiarity with Linux
o Networking knowledge – Firewalls, VPNs, Load Balancers
o Web/Application servers, Nginx, JVM environments
o Virtualization and containers - Xen, KVM, Qemu, Docker, Kubernetes, etc.
o Familiarity with logging systems - Logstash, Elasticsearch, Kibana
o Git, Jenkins, Jira
About us:
HappyFox is a software-as-a-service (SaaS) support platform. We offer an enterprise-grade help desk ticketing system and intuitively designed live chat software.
We serve over 12,000 companies in 70+ countries. HappyFox is used by companies that span across education, media, e-commerce, retail, information technology, manufacturing, non-profit, government and many other verticals that have an internal or external support function.
To learn more, visit https://www.happyfox.com/
Responsibilities:
- Build and scale production infrastructure in AWS for the HappyFox platform and its products.
- Research, build, and implement systems, services, and tooling to improve the uptime, reliability, and maintainability of our backend infrastructure, and to meet our internal SLOs and customer-facing SLAs.
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts or building infrastructure tools using Python/Ruby/Bash/Golang
- Implement consistent observability, deployment and IaC setups
- Patch production systems to fix security/performance issues
- Actively respond to escalations/incidents in the production environment from customers or the support team
- Mentor other Infrastructure engineers, review their work and continuously ship improvements to production infrastructure.
- Build and manage development infrastructure, and CI/CD pipelines for our teams to ship & test code faster.
- Participate in infrastructure security audits
Requirements:
- At least 5 years of experience in handling/building Production environments in AWS.
- At least 2 years of programming experience in building API/backend services for customer-facing applications in production.
- Demonstrable knowledge of TCP/IP, HTTP and DNS fundamentals.
- Experience in deploying and managing production Python/NodeJS/Golang applications to AWS EC2, ECS or EKS.
- Proficient in containerised environments such as Docker, Docker Compose, Kubernetes
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts using any scripting language such as Python, Ruby, Bash, etc.
- Experience in setting up and managing test/staging environments, and CI/CD pipelines.
- Experience in IaC tools such as Terraform or AWS CDK
- Passion for making systems reliable, maintainable, scalable and secure.
- Excellent verbal and written communication skills to address, escalate and express technical ideas clearly
- Bonus points – if you have experience with Nginx, Postgres, Redis, and Mongo systems in production.
Responsibilities
- Build and scale production infrastructure in AWS for the HappyFox platform and its products.
- Research, build, and implement systems, services, and tooling to improve the uptime, reliability, and maintainability of our backend infrastructure, and to meet our internal SLOs and customer-facing SLAs.
- Implement consistent observability, deployment and IaC setups
- Lead incident management and actively respond to escalations/incidents in the production environment from customers and the support team.
- Hire/Mentor other Infrastructure engineers and review their work to continuously ship improvements to production infrastructure and its tooling.
- Build and manage development infrastructure, and CI/CD pipelines for our teams to ship & test code faster.
- Lead infrastructure security audits
Requirements
- At least 7 years of experience in handling/building Production environments in AWS.
- At least 3 years of programming experience in building API/backend services for customer-facing applications in production.
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts or building infrastructure tools using Python/Ruby/Bash/Golang
- Experience in deploying and managing production Python/NodeJS/Golang applications to AWS EC2, ECS or EKS.
- Experience in security hardening of infrastructure, systems and services.
- Proficient in containerised environments such as Docker, Docker Compose, Kubernetes
- Experience in setting up and managing test/staging environments, and CI/CD pipelines.
- Experience in IaC tools such as Terraform or AWS CDK
- Exposure/Experience in setting up or managing Cloudflare, Qualys and other related tools
- Passion for making systems reliable, maintainable, scalable and secure.
- Excellent verbal and written communication skills to address, escalate and express technical ideas clearly
- Bonus points – Hands-on experience with Nginx, Postgres, Postfix, Redis or Mongo systems.

Technology-based supply chain management
Strong DevOps experience of at least 4 years
Strong experience in Unix/Linux/Python scripting
Strong networking knowledge; knowledge of the vSphere networking stack is desired.
Experience with Docker and Kubernetes
Experience with cloud technologies (AWS/Azure)
Exposure to continuous development tools such as Jenkins or Spinnaker
Exposure to configuration management systems such as Ansible
Knowledge of resource monitoring systems
Ability to scope and estimate
Strong verbal and written communication skills
Advanced knowledge of Docker and Kubernetes.
Exposure to Blockchain as a Service (BaaS) platforms such as Chainstack, IBM Blockchain Platform, Oracle Blockchain Cloud, Rubix, VMware, etc.
Capable of provisioning and maintaining local enterprise blockchain platforms for Development and QA (Hyperledger Fabric/BaaS/Corda/ETH).
Requirements
You will make an ideal candidate if you have:
- Experience building a range of services in a cloud service provider
- Expert understanding of DevOps principles and Infrastructure as Code concepts and techniques
- Strong understanding of CI/CD tools (Jenkins, Ansible, GitHub)
- Managed an infrastructure that involved 50+ hosts/network
- 3+ years of Kubernetes experience & 5+ years of experience in Native services such as Compute (virtual machines), Containers (AKS), Databases, DevOps, Identity, Storage & Security
- Experience in engineering solutions on a cloud foundation platform using Infrastructure as Code methods (e.g., Terraform)
- Security and Compliance, e.g., IAM and cloud compliance/auditing/monitoring tools
- Customer/stakeholder focus. Ability to build strong relationships with Application teams, cross-functional IT, and global/local IT teams
- Good leadership and teamwork skills: works collaboratively in an agile environment
- Operational effectiveness: delivers solutions that align to approved design patterns and security standards
- Excellent skills in at least one of the following: Python, Ruby, Java, JavaScript, Go, Node.JS
- Experienced in full automation and configuration management
- A track record of constantly looking for ways to do things better and an excellent understanding of the mechanisms necessary to successfully implement change
- Set and achieved challenging short-, medium-, and long-term goals which exceeded the standards in their field
- Excellent written and spoken communication skills; an ability to communicate with impact, ensuring complex information is articulated in a meaningful way to wide and varied audiences
- Built effective networks across business areas, developing relationships based on mutual trust and encouraging others to do the same
- A successful track record of delivering complex projects and/or programmes, utilizing appropriate techniques and tools to ensure and measure success
- A comprehensive understanding of risk management and proven experience of ensuring own/others' compliance with relevant regulatory processes
Essential Skills :
- Demonstrable cloud service provider experience: infrastructure build and configuration of a variety of services including compute, DevOps, databases, storage & security
- Demonstrable experience of Linux administration and scripting, preferably Red Hat
- Experience of working with Continuous Integration (CI), Continuous Delivery (CD), and continuous testing tools
- Experience working within an Agile environment
- Programming experience in one or more of the following languages: Python, Ruby, Java, JavaScript, Go, Node.JS
- Server administration (either Linux or Windows)
- Automation scripting (using tools such as Terraform, Ansible, etc.)
- Ability to quickly acquire new skills and tools
Required Skills :
- Linux & Windows Server Certification
What we are looking for
Work closely with product & engineering groups to identify and document infrastructure requirements.
Design infrastructure solutions balancing requirements, operational constraints and architecture guidelines.
Implement infrastructure including network connectivity, virtual machines and monitoring.
Implement and follow security guidelines, both policy and technical to protect our customers.
Resolve incidents as escalated from monitoring solutions and lower tiers.
Identify root cause for issues and develop long term solutions to fix recurring issues.
Ability to automate recurring tasks to increase velocity and quality.
Partner with the engineering team to build software tolerance for infrastructure failure or issues.
Research emerging technologies, trends and methodologies and enhance existing systems and processes.
Qualifications
Master's/Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field, and two years of experience in software/systems or a related area.
5+ years overall experience.
Work experience must have included:
Proven track record in deploying, configuring and maintaining Ubuntu server systems on premises and in the cloud.
Minimum of 4 years' experience designing, implementing and troubleshooting TCP/IP networks, VPNs, load balancers and firewalls.
Minimum 3 years of experience working in public clouds like AWS and Azure.
Hands-on experience in any of the configuration management tools like Ansible, Chef and Puppet.
Strong in performing production operation activities.
Experience with container and container orchestration tools like Kubernetes and Docker Swarm is a plus.
Good at source code management tools like Bitbucket and Git.
Configuring and utilizing monitoring and alerting tools.
Scripting to automate infrastructure and operational processes.
Hands-on work to secure networks and systems.
Sound problem resolution, judgment, negotiating and decision making skills
Ability to manage and deliver multiple project phases at the same time
Strong analytical and organizational skills
Excellent written and verbal communication skills
Interview focus areas
Networks, systems, monitoring
AWS (EC2, S3, VPC)
Problem solving, scripting, network design, systems administration and troubleshooting scenarios
Culture fit, agility, bias for action, ownership, communication



