50+ Terraform Jobs in India

Apply to 50+ Terraform Jobs on CutShort.io. Find your next job, effortlessly. Browse Terraform Jobs and apply today!

Remote only
8 - 12 yrs
Best in industry
Terraform
Artificial Intelligence (AI)
IAC
Amazon Web Services (AWS)
ECS


Senior Platform & Site Reliability Engineer

Location: Remote | Employment Type: Contract

The Role

This role carries full architectural and operational ownership of the platform layer across a growing SaaS portfolio. The Cloud Architect owns AWS infrastructure standards — VPCs, account structures, networking, and compute design. Everything outside that lane is yours: the CI/CD platform, the observability and reliability stack, the event streaming infrastructure, the deployment pipelines, and the incident engineering model.

Architectural decisions are yours to make and defend, standards are yours to define and enforce, and the reliability of 20+ enterprise SaaS products depends on what you and your team build.

This is an AI-native engineering organisation. Where it is practical and safe to do so, you are expected to use automation and AI-assisted tooling to reduce toil — in CI/CD triage, infrastructure provisioning, observability workflows, and acquisition onboarding. The expectation is not to replace engineering judgement with automation, but to free it up for the problems that genuinely require it.

The Scale You Will Operate At

The portfolio consists of 20+ live, enterprise-grade SaaS solutions running concurrently. Each product serves enterprise customers and processes millions to billions of real-time requests. The architecture is serious: event streaming for real-time data pipelines, batch processing workloads running alongside live transaction flows, and multi-tenant enterprise-grade reliability expectations across every product.

You will design and operate the platform infrastructure that underpins all of it — scaling horizontally as each new acquisition joins the portfolio, without proportionally scaling cost, complexity, or headcount.

What You Will Own

Platform Architecture

  • Full architectural ownership of the non-AWS toolchain: CI/CD, observability, event streaming, automation, secrets, and deployment infrastructure
  • Define, build, and enforce platform standards across portfolio products
  • Terraform IaC for all infrastructure — nothing provisioned manually, everything versioned and reviewed
  • Self-service developer platform so product teams ship without waiting on platform

Event Streaming & Pipeline Infrastructure

  • Own the event streaming architecture, operational standards, and health monitoring across all products using real-time pipelines
  • Design and maintain batch processing infrastructure alongside live event flows
  • Ensure pipeline reliability, throughput, and cost are actively managed at scale

CI/CD & Deployment

  • Build and maintain CI/CD pipelines (GitHub Actions) across all portfolio products
  • Automate triage and retry logic for known failure classes — flaky tests, dependency timeouts, OOM kills — so engineers are only paged for genuinely novel failures
  • Deployment standards: release management, rollback mechanisms, canary and blue-green patterns where justified
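The triage-and-retry idea above can be sketched as follows. This is a minimal illustration, not the team's actual implementation: the failure classes mirror the ones named in the list, but the log patterns, the `MAX_RETRIES` limit, and the `triage` function are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical log patterns for the known failure classes named above.
# A real pipeline would tune these against its own job logs.
KNOWN_FAILURES = {
    "flaky_test": re.compile(r"flaky|intermittent", re.IGNORECASE),
    "dependency_timeout": re.compile(r"timed? ?out|ETIMEDOUT", re.IGNORECASE),
    "oom_kill": re.compile(r"out of memory|OOMKilled|exit code 137", re.IGNORECASE),
}

# Assumed retry limit; after this many attempts a human is paged anyway.
MAX_RETRIES = 2


@dataclass
class Decision:
    action: str         # "retry" or "page"
    failure_class: str  # matched class name, or "novel"


def triage(log_text: str, attempt: int) -> Decision:
    """Retry known failure classes up to MAX_RETRIES; page for everything else."""
    for name, pattern in KNOWN_FAILURES.items():
        if pattern.search(log_text):
            if attempt < MAX_RETRIES:
                return Decision("retry", name)
            return Decision("page", name)
    # No known pattern matched: treat as novel and page an engineer.
    return Decision("page", "novel")
```

In a setup like this, a CI job's failure handler would call `triage(log_text, attempt)` and re-run the job only when the action is "retry".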

Observability & Reliability

  • Own the full observability stack: Grafana, Prometheus, and Loki across all products
  • SLOs and error budgets defined per product; reliability tracked consistently
  • Build alerting that correlates signals and surfaces diagnostic context alongside notifications — so on-call engineers arrive at an incident with hypotheses, not a blank screen
  • Incident response: on-call design, escalation playbooks, post-mortem facilitation
  • Automated remediation scoped to safe, idempotent actions — container restarts, ECS task scaling, known rollback patterns; novel or ambiguous failures escalate to a human with full context attached
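For readers less familiar with error budgets, the arithmetic behind an availability SLO can be sketched as below. The 30-day window and the function names are assumptions for illustration; the listing does not state specific SLO targets.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    return (1.0 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43 minutes of downtime, so about 21.6 minutes of downtime would leave half the budget.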

Acquisition Onboarding

  • Platform audit and gap analysis for every new acquisition — assessing CI/CD maturity, IaC coverage, observability gaps, and security posture
  • Migration plan and execution for each portfolio company joining the platform — sequenced to avoid disrupting live operations
  • Target: full platform integration within a defined window per acquisition

A Note on Automation

Where automation is safe and failure modes are well understood — routine provisioning, known CI/CD failure classes, secrets rotation, cost anomaly flagging — aggressive automation is expected. Where automation would act on ambiguous signals or carry significant blast radius, human judgement stays in the loop. The goal is to reduce toil on solved problems, not to automate decisions that require engineering expertise.
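One way to express this boundary in code is an explicit allowlist of safe actions, with everything else escalated. The sketch below is hypothetical: the action names, the `Alert` fields, and the `decide` function are illustrative assumptions, not part of the role's actual tooling.

```python
from dataclasses import dataclass, field

# Assumed allowlist of safe, idempotent actions; names are illustrative only.
SAFE_ACTIONS = {"restart_container", "scale_ecs_tasks", "rollback_known_release"}


@dataclass
class Alert:
    name: str
    proposed_action: str
    signal_is_ambiguous: bool
    context: dict = field(default_factory=dict)  # diagnostics passed to on-call


def decide(alert: Alert) -> str:
    """Automate only allowlisted actions on unambiguous signals; else escalate."""
    if alert.proposed_action in SAFE_ACTIONS and not alert.signal_is_ambiguous:
        return f"auto:{alert.proposed_action}"
    # Ambiguous signal or action outside the allowlist: a human decides.
    return "escalate"
```

Keeping the allowlist explicit means any new automated action has to be added deliberately, which fits the stated goal of automating only well-understood failure modes.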

Platform Stack

Area | Stack / Standard
IaC | Terraform OSS / OpenTofu
CI/CD | GitHub Actions
Event Streaming | Architecture and tooling chosen for the workload
Observability | Grafana, Prometheus, Loki
Log Management | AWS CloudWatch, Grafana Loki
Incident Management | OpsGenie (startup tier) or Better Uptime
Secrets | AWS Secrets Manager / HashiCorp Vault OSS
Containers | ECS (default), EKS only where justified
Cost Monitoring | AWS Cost Explorer with custom dashboards

What We’re Looking For

  • 8–12 years in platform engineering, DevOps, or SRE — with clear evidence of increasing ownership over time
  • Strong Terraform depth across multi-environment, multi-account setups
  • CI/CD ownership across a multi-product environment with GitHub Actions
  • Experience with event streaming infrastructure at production scale — design, operations, reliability, and cost management
  • Hands-on Grafana, Prometheus, and Loki in production
  • AWS operational depth: ECS, EKS, RDS, IAM, VPC, CloudWatch, Cost Explorer
  • SRE fundamentals: SLOs, error budgets, on-call design, post-mortem culture
  • Acquisition or greenfield platform integration experience strongly preferred

How You Work

  • Comfortable operating across multiple products simultaneously — context-switching without dropping standards
  • Cost-efficiency instinct — you optimise spend as a habit, not as a project
  • You treat automation as a tool for eliminating toil, not a substitute for engineering judgement
  • You document decisions, enforce standards through code, and build platforms that other engineers find intuitive to use

Why This Role

The platform function is being built from the ground up. You will have architectural ownership of the entire non-AWS platform layer across a growing portfolio of enterprise SaaS products, with the freedom — and responsibility — to build the reliability and delivery culture of the organisation.

This is not a role that inherits someone else’s decisions and maintains them. Every major architectural choice is still to be made. If you want to build something that lasts and that other engineers depend on, this is the role.

Unico Connect Private Limited
Mumbai
2 - 4 yrs
Best in industry
DevOps
Amazon Web Services (AWS)
Kubernetes
Terraform
CI/CD

DevOps Engineer

AWS Infrastructure, CI/CD & Production Operations

Mumbai (On-site) | Full-time | 2-4 years


About the role:

Unico Connect is an AI-first technology partner that builds custom mobile, web, and AI products for clients across multiple geographies. We are hiring a DevOps Engineer who will own day-to-day cloud infrastructure, deployment automation, and production operations across active customer engagements.

The mandatory requirement for this role is hands-on production experience on AWS, with infrastructure as code, container orchestration, and CI/CD pipelines owned end to end on at least one live customer workload. The role is hands-on. Expect to operate Kubernetes clusters, build CI/CD pipelines, automate environment provisioning, manage TLS and DNS, set up observability, and partner with backend and AI engineers to ship reliably. A typical week includes a Terraform refactor, a deployment pipeline build for a new service, an incident response on a production cluster, and a cost review.


Responsibilities:

  • AWS infrastructure: Design and operate production infrastructure on AWS using EC2, EKS or ECS, S3, RDS, IAM, VPC, CloudFront, and Route53. Own configuration, networking, and cost.
  • Infrastructure as code: Write and maintain Terraform or Pulumi modules. Drive consistency across environments and tenants through IaC rather than manual configuration.
  • Kubernetes and containers: Operate production EKS clusters. Manage Helm charts, Ingress, autoscaling, secrets, and workload isolation.
  • CI/CD pipelines: Build and maintain pipelines using GitHub Actions, GitLab CI, or equivalent. Include automated tests, security scans, and rollback paths.
  • TLS, DNS, and CDN automation: Automate domain provisioning, TLS issuance (Let's Encrypt, cert-manager, ACM), and CDN configuration (CloudFront, Cloudflare).
  • Observability and incident response: Set up monitoring, logging, and alerting using Prometheus, Grafana, ELK, Loki, or CloudWatch. Lead incident response and write postmortems.
  • Secrets and security: Manage secrets through Vault, AWS Secrets Manager, or KMS. Apply least-privilege IAM and review access regularly.
  • Cost monitoring: Track and optimise AWS spend across environments. Surface waste and propose remediations.

Requirements:

  • Hands-on AWS production experience (mandatory). Must have personally operated production workloads on AWS, with responsibility for IaC, deployments, and incident response on at least one live customer or internal-platform deployment. POCs and lab environments do not qualify.
  • 2 to 4 years of hands-on DevOps or infrastructure experience. Candidates with slightly less experience but strong demonstrated ownership are welcome to apply.
  • AWS depth. Hands-on with EC2, S3, IAM, VPC, EKS or ECS, RDS, CloudFront, and Route53. Working knowledge of CloudWatch and AWS cost tooling.
  • Kubernetes in production. Hands-on operation of EKS or equivalent. Comfort with Helm, Ingress controllers, autoscaling, and resource quotas.
  • Infrastructure as code. Strong with Terraform (preferred) or Pulumi. Modular code, state management, and review discipline.
  • CI/CD pipelines. Production experience with GitHub Actions, GitLab CI, or equivalent. Comfort with multi-environment pipelines and release strategies.
  • Scripting and automation. Strong Bash and Python (or Go) for tooling. Linux fluency at the command line.
  • Observability stack. Hands-on with Prometheus, Grafana, ELK or Loki, and at least one APM tool (Datadog, New Relic, or equivalent).
  • Networking, TLS, and security fundamentals. Comfortable with DNS, TLS certificate lifecycle, VPC peering, and security groups.



Nice to have: multi-tenant SaaS infrastructure experience; service mesh (Istio, Linkerd); GitOps (ArgoCD, Flux); sandboxed execution environments (Firecracker, gVisor); exposure to platform engineering or developer-platform teams.


Mudals Technologies

Posted by Ariba Khan
Hyderabad
6 - 8 yrs
Best in industry
Amazon Web Services (AWS)
Windows Azure
GitHub
Terraform

Job Summary:

We are seeking a DevOps Engineer (AWS) with deep expertise in Terraform, Atlantis, and large-scale multi-cloud environments (200–300+ accounts/subscriptions). This is a client-facing consulting role, responsible for designing and governing secure, scalable infrastructure and CI/CD platforms across enterprise environments.


This role goes beyond implementation: you will define standards, governance models, and automation frameworks for large organizations operating at scale.


Key Responsibilities:

Cloud Platforms / AWS Expertise

  • AWS (Expert) – Organizations, IAM, VPC, Security services
  • Strong hands-on experience with AWS Service Catalog to design, publish, and manage standardized, secure infrastructure products across multi-account AWS environments
  • Proven ability to enforce governance, compliance, and cost controls by integrating Service Catalog with IAM, SCPs, and CI/CD pipelines
  • Experience with Azure or GCP (multi-cloud exposure required)

Enterprise DevSecOps Architecture

  • Architect and standardize end-to-end DevSecOps platforms
  • Design secure, scalable CI/CD pipelines using Jenkins & GitHub Actions
  • Embed security gates (SAST, DAST, SCA, container security) into pipelines
  • Define reusable pipeline templates across multiple teams/business units

Terraform + Atlantis (Must-Have, Core Focus)

  • Design and manage large-scale Terraform architecture across 200–300+ cloud accounts
  • Implement Terraform modules, remote backends, and state isolation strategies
  • Build and manage Atlantis workflows for automated Terraform plan/apply
  • Enforce: code reviews & approvals for infra changes; policy-as-code (OPA / Sentinel)
  • Drift detection & remediation

Multi-Cloud Architecture (AWS + Azure/GCP)

  • Architect multi-cloud landing zones and governance frameworks
  • Manage large account structures: AWS Organizations (multi-account strategy); Azure Management Groups / Subscriptions
  • Implement: network segmentation (VPC/VNet design); identity federation (SSO, IAM, RBAC); cross-account access models
  • Ensure high availability, scalability, and cost optimization

Security & Compliance at Scale

  • Implement DevSecOps controls aligned with SOC 2, PCI-DSS, GDPR
  • Integrate tools such as JFrog Xray (SCA), SonarQube (SAST), and Trivy / Prisma / Wiz (container & cloud security)
  • Build policy-as-code frameworks for compliance enforcement
  • Automate evidence collection for audits

Artifact & Dependency Management

  • Architect secure artifact lifecycle using JFrog Artifactory
  • Implement access control, immutability, and vulnerability scanning
  • Standardize dependency management across teams

Observability & Reliability

  • Implement centralized logging/monitoring: CloudWatch, ELK, Prometheus/Grafana
  • Define SLOs/SLIs for platform reliability
  • Reduce MTTR via automation and alerting

Consulting & Leadership

  • Act as a trusted advisor to enterprise clients
  • Lead architecture discussions and DevSecOps transformation programs
  • Mentor teams and enforce engineering best practices
  • Drive platform adoption across multiple business units


Required Skills:

Infrastructure as Code (Core)

  • Terraform (Expert level)
  • Atlantis (Hands-on implementation at scale)
  • Strong experience with multi-account architecture (200–300+ accounts)

DevOps Tooling

  • Jenkins (advanced pipelines)
  • GitHub Actions (enterprise workflows)
  • GitOps practices

Cloud Platforms

  • AWS (Expert) – Organizations, IAM, VPC, Security services
  • Experience with Azure or GCP (multi-cloud exposure required)

Security Stack

  • SAST, DAST, SCA tools integration
  • Container security & Kubernetes security
  • Secrets management (Vault / AWS Secrets Manager)

Programming/Scripting

  • Python / Bash (automation focus)


Preferred Qualifications:

  • Experience with Kubernetes (EKS/AKS/GKE) at scale
  • Knowledge of Zero Trust Architecture
  • Experience with OPA / Sentinel (policy-as-code)
  • Familiarity with platform engineering concepts (Internal Developer Platforms)

Certifications (Good to Have)

  • AWS Solutions Architect – Professional
  • Terraform Associate / Advanced Terraform certifications
  • CISSP / CKS (Kubernetes Security)


What Makes This Role Premium

  • Ownership of large-scale (200–300 account) cloud environments
  • Direct impact on enterprise DevSecOps maturity
  • Client-facing architecture & strategy role (not just execution)
  • Opportunity to define organization-wide standards
A leading data & analytics intelligence technology solutions provider

Agency job
via HyrHub by Neha Koshy
Remote only
3 - 7 yrs
₹10L - ₹27L / yr
DevOps
Windows Azure
CI/CD
Terraform
Linux/Unix

Contract Job: DevOps / Azure DevOps

Contract Term: Max 3-6 months

Looking for immediate joiners only

Remote Opportunity

  • CI/CD, deployment, environment management, and release support
  • This can be a shared capability/position as well


A Digital Product Engineering company

Agency job
via Unique Occupational by Mantasha Naaz
Gurugram
5.5 - 7.5 yrs
₹28L - ₹32L / yr
Kubernetes
ArgoCD
NewRelic
Crossplane
Amazon Web Services (AWS)

Job Details:

  • Role: Staff Engineer, ArgoCD
  • Experience: 5.5-7.5 Years
  • Employment Type: Full-time
  • Work Mode: Gurugram (Hybrid)

Job Description

REQUIREMENTS:

  • Strong hands-on experience with Kubernetes (K8s) administration, deployment, and troubleshooting
  • Expertise in GitOps implementation using ArgoCD
  • Strong experience with Crossplane for infrastructure provisioning and orchestration
  • Hands-on experience with New Relic for monitoring, observability, and performance management
  • Experience building and maintaining CI/CD pipelines and deployment automation
  • Strong knowledge of Infrastructure as Code (IaC) using Terraform
  • Experience working with AWS cloud services and cloud-native architectures
  • Hands-on experience with Docker and containerization technologies
  • Strong Linux administration and scripting skills
  • Experience implementing platform reliability, security, and automation best practices
  • Strong understanding of monitoring, logging, and observability frameworks

RESPONSIBILITIES:

  • Manage, maintain, and optimize Kubernetes-based infrastructure and application deployments
  • Implement and support GitOps workflows using ArgoCD
  • Design and manage infrastructure provisioning using Crossplane
  • Monitor platform performance, reliability, and user experience using New Relic
  • Build, enhance, and maintain CI/CD pipelines for automated software delivery
  • Collaborate with development and platform engineering teams to deliver scalable cloud-native solutions
  • Implement Infrastructure as Code practices using Terraform and automation tools
  • Ensure platform stability, security, scalability, and operational excellence
  • Troubleshoot infrastructure, deployment, and performance-related issues
  • Drive continuous improvement initiatives across DevOps processes, tooling, and automation practices
  • Support cloud infrastructure management and containerized application environments on AWS
  • Promote DevOps best practices, governance, and operational standards across teams

Qualifications

Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field

Credilio Financial Technologies Pvt. Ltd.
Posted by Munjal Dhamecha
Mumbai
4 - 8 yrs
Best in industry
Amazon Web Services (AWS)
Kubernetes
Terraform

DevOps Engineer

We are looking for a hands-on DevOps Engineer to manage and scale our cloud infrastructure, Kubernetes-based microservice deployments, monitoring systems, and data engineering infrastructure.

The person will be responsible for building reliable, secure, scalable, and cost-efficient infrastructure using automation-first practices. This role is important for supporting a high-growth B2C platform where availability, deployment velocity, observability, security, and cost efficiency are critical.


Key Responsibilities

  • Manage and automate cloud infrastructure using Terraform.
  • Deploy, manage, and troubleshoot microservices on Kubernetes.
  • Build and maintain CI/CD pipelines to ensure reliable, controlled deployments.
  • Implement safe release practices, including rolling deployments, rollback, and zero-downtime deployments.
  • Manage monitoring, logging, alerting, dashboards, and production runbooks.
  • Support incident response, production debugging, RCA, and preventive action closure.
  • Ensure infrastructure is scalable, secure, highly available, and cost-optimised.
  • Support data engineering infrastructure, including ClickHouse, PeerDB, Airflow, Kafka, and related platform components.
  • Maintain infra-level security controls, backups, disaster recovery, and access governance.

Required Skills

  • Strong experience with Terraform, Infrastructure as Code, and AWS.
  • Strong experience with Kubernetes, Docker, Helm, ingress, and autoscaling.
  • Experience with CI/CD tools such as GitHub Actions, GitLab CI, Jenkins, ArgoCD, or similar.
  • Experience with monitoring and observability tools such as Prometheus, Grafana, ELK/OpenSearch, New Relic, or similar.
  • Good understanding of cloud networking, DNS, load balancers, VPC/VPN, SSL/TLS, firewalls, and WAF.
  • Experience with Linux administration, shell scripting, and automation.
  • Understanding of cloud security, IAM, secrets management, and access governance.
  • Exposure to databases, queues, caches, and data infrastructure tools such as ClickHouse, PeerDB, Airflow, Kafka, or similar.
  • Strong debugging and problem-solving skills during production incidents.
  • Ability to work closely with engineering teams to improve deployment, monitoring, cost, and reliability. 
Remote only
3.5 - 6 yrs
₹6L - ₹18L / yr
Terraform
Serverless
Amazon Web Services (AWS)
DevSecOps

Mactores is a trusted leader among businesses in providing modern data platform solutions. Since 2008, Mactores has been enabling businesses to accelerate their value through automation by providing End-to-End Data Solutions that are automated, agile, and secure. We collaborate with customers to strategize, navigate, and accelerate an ideal path forward with a digital transformation via assessments, migration, or modernization.


You will be part of the DevOps engineering team, managing large customer deployments including Linux and Windows administration, large enterprise applications, and big data workloads. You will have broad business and technology expertise coupled with a background in professional services and client-facing skills. You are passionate about cloud deployment best practices and about ensuring customer expectations are set and met appropriately. You will help us build scalable, efficient cloud infrastructure.

 

You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices. If you love to solve problems using your skills, then come join Team Mactores. We have a casual and fun office environment that actively steers clear of rigid "corporate" culture, focuses on productivity and creativity, and allows you to be part of a world-class team while still being yourself.


What you will do?

  • Application migration projects from on-premises to AWS.
  • Database (RDBMS, NoSQL, DW, Hadoop) migration projects from on-premises to AWS. 
  • Automate operational and server provisioning workflows on AWS using AWS CloudFormation templates (CFT).
  • Share the responsibility for deploying releases and conducting other operations maintenance.
  • Enhance operations infrastructures such as Jenkins clusters, Bitbucket, monitoring tools (Consul), and metrics tools such as Graphite and Grafana.
  • Provide operational support for the rest of the Engineering team.
  • Help migrate our remaining dedicated hardware infrastructure to the cloud.
  • Establish and maintain operational best practices.


What do you have?

  • 2+ years of experience using Terraform for IaC.
  • 2+ years of configuration management and engineering for large scale customers, ideally supporting an Agile development process.
  • 2+ years of Linux Administration experience.
  • Deep understanding of version control systems (git), including branching and merging strategies.
  • Experience working with cloud platforms (AWS: EC2, ECS, RDS, CloudFormation, CloudWatch, etc.) and cloud automation tools (Ansible, Chef).
  • Experience with software build tools (Maven, Gradle) and continuous integration tools (Jenkins).
  • Must have supported Java-based applications in a production environment.
  • Experience with Linux environments and scripting languages: Bash, Python, Groovy.
  • Experience in supporting Node.js in production is a plus.
  • Knowledge of service discovery tools such as Consul is a plus.
  • Comfortable working late evening hours, which is when most patching occurs.
  • You are extremely proactive at identifying ways to improve things and make them more reliable.


You will be preferred if

  • You are AWS DevOps Pro or AWS SA Pro Certified


CLOUDSUFI
Bengaluru (Bangalore)
9 - 15 yrs
₹50L - ₹65L / yr
Architecture
Google Cloud Platform (GCP)
Google BigQuery
Terraform
Technical Architecture

About Us

CLOUDSUFI, a Google Cloud Premier Partner, is a leading global provider of data-driven digital transformation across cloud-based enterprises. With a global presence and focus on Software & Platforms, Life Sciences and Healthcare, Retail, CPG, Financial Services, and Supply Chain, CLOUDSUFI is positioned to meet customers where they are in their data monetization journey.


Our Values

We are a passionate and empathetic team that prioritizes human values. Our purpose is to elevate the quality of life for our family, customers, partners and the community.


Equal Opportunity Statement

CLOUDSUFI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified candidates receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, and national origin status. We provide equal opportunities in employment, advancement, and all other areas of our workplace. Please explore more at https://www.cloudsufi.com/


Summary

The Tech Lead is the technical spine of the engagement. They sit across all four delivery areas — Data Engineering, Frontend, ML/AI, and DevOps/Infrastructure — providing architectural direction, resolving cross-track dependencies, and ensuring the quality and coherence of everything we ship. Equally critical is the ability to represent CLOUDSUFI in a credible, articulate, and collaborative manner to Google counterparts at every level.


What You Will Do

Technical Leadership & Architecture

• Own the end-to-end technical architecture across all program tracks, ensuring alignment with Google Data Commons standards, APIs, and roadmap.

• Define and enforce engineering best practices, coding standards, and review processes across Data Engineering, Frontend, ML/AI, and DevOps/Infra workstreams.

• Make high-stakes design decisions on data modeling, pipeline architecture, schema design, and system integration — with clear documentation and traceability.

• Proactively identify technical risks, cross-track conflicts, and blockers; drive resolution before they impact delivery.

• Lead or participate in architecture review sessions with Google's core engineering team.


Cross-Track Coordination

• Act as the single technical point of contact horizontally across all tracks, breaking down silos and ensuring consistent patterns and interfaces.

• Facilitate technical syncs across track leads; surface dependencies early and coordinate sequencing of deliverables.

• Define shared standards for data contracts, API interfaces, and infrastructure configuration used across tracks.


Google Partnership & Communication

• Build and maintain a strong working relationship with the Google Data Commons core team — attending joint ceremonies, reviews, and design discussions.

• Represent CLOUDSUFI’s delivery quality and technical credibility in all interactions with Google stakeholders.

• Translate complex technical discussions into clear summaries for both technical and non-technical audiences across both organizations.

• Proactively communicate progress, blockers, and decisions to CLOUDSUFI leadership and Google program managers.


Delivery Quality & Engineering Excellence

• Drive code reviews, set quality gates, and ensure CI/CD pipelines, test coverage, and observability standards are met across all tracks.

• Champion performance, reliability, and scalability from design through production.

• Mentor and technically grow senior engineers across tracks without carrying formal HR responsibility.

• Contribute hands-on to critical technical work when needed — this is a hybrid IC and leadership role.


Technology Stack & Domain Knowledge


Core / Must-Have

• Google Data Commons – deep familiarity with the Data Commons knowledge graph, DCID conventions, import pipelines, and the Statistical Variable hierarchy.

• Google Cloud Spanner – schema design, distributed transactions, interleaved tables, and performance tuning at scale.

• Google BigQuery – data modeling, partitioning/clustering strategies, query optimization, and integration with downstream consumers.

• Data pipelines – Apache Beam / Dataflow, or equivalent GCP-native ETL tooling.

• Infrastructure as Code – Terraform on GCP; Cloud Build, Artifact Registry, GKE or Cloud Run.


Strong Advantage

• Python (primary language for Data Commons import tooling and ML pipelines).

• TypeScript / React for the Data Commons web frontend and visualization layers.

• Vertex AI, BigQuery ML, or equivalent ML lifecycle tooling.

• Knowledge graph principles, RDF/SPARQL, or statistical data modeling.

• Data Commons Python / REST APIs and the DCID import automation tools.

• Experience with Google's internal engineering culture, tools (e.g. Buganizer, Critique, Cider), or prior delivery inside a Google product or partnership engagement.


Experience & Qualifications Required

• 10+ years of software engineering experience, with at least 3 years in a formal or informal tech lead capacity overseeing multiple workstreams.

• Demonstrable experience delivering production-grade systems on Google Cloud Platform.

• Prior experience working with or for Google — as a Googler, through a Google partnership program, or as a contractor embedded in a Google team — is strongly preferred.

• Exceptional verbal and written English communication skills; able to engage confidently with senior Google engineers and program managers.

• Proven ability to operate across ambiguous, fast-moving programs with multiple parallel tracks.

• Based in Bangalore, India, and able to work on-site at Google's Bangalore office on a regular basis.


Preferred

• Experience in the data and AI consulting or professional services space, particularly within the Google Cloud partner ecosystem.

• Familiarity with open data, public statistics, or knowledge graph use cases (e.g. UN SDG data, census data, health/economic indicators).

• Prior contribution to open-source projects, particularly Google Data Commons or related tooling.


What CLOUDSUFI Offers

• A flagship engagement at the intersection of Google's open data infrastructure and real-world AI/data impact.

• Daily collaboration with Google's core engineering team — a rare opportunity to work inside one of Google's strategic open data initiatives.

• Competitive compensation benchmarked at Staff / Principal level in the Bangalore market.

• CLOUDSUFI’s full benefits package.

• A culture that values technical excellence, client trust, and a people-first approach to delivery.

Reliable Group

Posted by Nilesh Gend
Pune
5 - 12 yrs
₹15L - ₹35L / yr
Google Cloud Platform (GCP)
Ansible
Terraform

Job Title: GCP Cloud Engineer/Lead


Location: Pune, Balewadi

Shift / Time Zone: 1:30 PM – 10:30 PM IST (3:00 AM – 12:00 PM EST, 3–4 hours overlap with US Eastern Time)


Role Summary

We are seeking an experienced GCP Cloud Engineer to join our team supporting CVS. The ideal candidate will have a strong background in Google Cloud Platform (GCP) architecture, automation, microservices, and Kubernetes, along with the ability to translate business strategy into actionable technical initiatives. This role requires a blend of hands-on technical expertise, cross-functional collaboration, and customer engagement to ensure scalable and secure cloud solutions.


Key Responsibilities

  • Design, implement, and manage cloud infrastructure on Google Cloud Platform (GCP) leveraging best practices for scalability, performance, and cost efficiency.
  • Develop and maintain microservices-based architectures and containerized deployments using Kubernetes and related technologies.
  • Evaluate and recommend new tools, services, and architectures that align with enterprise cloud strategies.
  • Collaborate closely with Infrastructure Engineering Leadership to translate long-term customer strategies into actionable enablement plans, onboarding frameworks, and proactive support programs.
  • Act as a bridge between customers, Product Management, and Engineering teams, translating business needs into technical requirements and providing strategic feedback to influence product direction.
  • Identify and mitigate technical risks and roadblocks in collaboration with executive stakeholders and engineering teams.
  • Advocate for customer needs within the engineering organization to enhance adoption, performance, and cost optimization.
  • Contribute to the development of Customer Success methodologies and mentor other engineers in best practices.


Must-Have Skills

  • 8+ years of total experience, with 5+ years specifically as a GCP Cloud Engineer.
  • Deep expertise in Google Cloud Platform (GCP) — including Compute Engine, Cloud Storage, Networking, IAM, and Cloud Functions.
  • Strong experience in microservices-based architecture and Kubernetes container orchestration.
  • Hands-on experience with infrastructure automation tools (Terraform, Ansible, or similar).
  • Proven ability to design, automate, and optimize CI/CD pipelines for cloud workloads.
  • Excellent problem-solving, communication, and collaboration skills.
  • GCP Professional Certification (Cloud Architect / DevOps Engineer / Cloud Engineer) preferred or in progress.
  • Ability to multitask effectively in a fast-paced, dynamic environment with shifting priorities.


Good-to-Have Skills

  • Experience with Cloud Monitoring, Logging, and Security best practices in GCP.
  • Exposure to DevOps tools (Jenkins, GitHub Actions, ArgoCD, or similar).
  • Familiarity with multi-cloud or hybrid-cloud environments.
  • Knowledge of Python, Go, or Shell scripting for automation and infrastructure management.
  • Understanding of network design, VPC architecture, and service mesh (Istio/Anthos).
  • Experience working with enterprise-scale customers and cross-functional product teams.
  • Strong presentation and stakeholder communication skills, particularly with executive audiences.


Searce Inc
Mumbai
4 - 8 yrs
Best in industry
Google Cloud Platform (GCP)
Terraform
Kubernetes
GKE
Site reliability
+3 more

Senior Cloud Security & Reliability Engineer (GCP)

Searce Inc | Mumbai, Maharashtra | On-site | 4–7 Years

About Searce

Searce is a global, AI-native, engineering-led technology consultancy and a Premier Google Cloud Partner — recognized as the Google Cloud Workplace AI Transformation Partner of the Year, APAC (2026). With 20+ years of experience and 3,000+ clients across 10+ countries, we help businesses stay ahead of the cloud curve.

The Role

We're looking for a Senior Cloud Security & Reliability Engineer with deep GCP expertise to join our Mumbai MSP team. You'll own reliability, security, and optimization of enterprise client GCP environments — 24/7.


What You'll Do

Own Reliability — Lead 24x7 GCP cloud operations and incident management. Define and enforce SLOs.

Engineer the Blueprint — Build scalable, secure GCP architectures and maintain IaC modules and playbooks.

Automate Everything — Embed AI-driven automation and AIOps to eliminate toil and preempt incidents.

Drive FinOps — Own GCP cost optimization for clients with quantified impact.

Be the Expert — Represent deep GCP expertise in client conversations and documentation.

Mentor & Elevate — Coach junior engineers through code reviews and problem-solving.


What We're Looking For

Experience

4–7 years total with 4+ years on GCP cloud infrastructure

Background in Cloud Managed Services / MSP environments

2+ years in a client-facing technical role


Technical Skills (Must-Have)

GCP: GKE, IAM, VPC, Cloud Monitoring, Stackdriver — in work experience

Kubernetes: GKE — demonstrated in production

IaC: Terraform — module-level, demonstrated in work experience

Observability: Prometheus + Grafana minimum

Security: GCP IAM, VPC controls, Security Command Center

Scripting: Python


Nice to Have

GCP Professional Cloud Architect / Pro DevOps Engineer certification

BFSI or enterprise domain experience

Thanos, Vault, Istio, ArgoCD

FinOps Foundation Certification


Why Searce?

🏆 Google Cloud Partner of the Year — APAC 2026

🌍 Enterprise clients across US, APAC, and India

🤖 AI-first, engineering-led culture

📈 Fast-growing MSP team with real career ownership

🤝 HAPPIER values — Humble, Adaptable, Positive, Passionate, Innovative, Excellence, Responsible

📧 Interested? Share your profile and let's connect.

Appiness Interactive Pvt. Ltd.
Posted by S Suriya Kumar
Bengaluru (Bangalore)
2.5 - 5.5 yrs
₹3L - ₹10L / yr
Terraform
Docker
DevOps
CI/CD
Amazon Web Services (AWS)

Appiness is a Bengaluru-based technology and product engineering company. We work with enterprises and high growth startups across the globe, building digital products that scale. We build enterprise AI, agentic and conversational AI, and cloud-native platforms across web, mobile, and cloud. We invest deeply in modern tech, with a strong bias for clean architecture, automation, and engineering craft across everything we build. With 50+ international design awards and recognition among India's fastest-growing companies, we're proud of the bar we set on both technology and experience. We love what we build, and we have fun building it.


What you'll do


  • Design, provision, and manage AWS infrastructure across dev, staging, and production using Terraform modules and remote state.
  • Build and maintain CI/CD pipelines for fast, reliable, repeatable deployments across multiple services.
  • Containerize applications with Docker and orchestrate workloads on ECS or Kubernetes with sensible scaling and rollout strategies.
  • Set up monitoring, alerting, and centralized logging so issues are caught and triaged before customers feel them.
  • Own cloud security end to end: IAM hygiene, secrets management, network segmentation, encryption in transit and at rest, and patching across client accounts.
  • Harden infrastructure against common attack surfaces, run vulnerability scans on containers and dependencies, and stay aligned with industry security standards.
  • Partner with engineering teams to streamline release cycles, reduce lead time, and remove friction from day-to-day workflows.
  • Document architecture, runbooks, and post-incident learnings so the platform and the team scale together.


What we're looking for

  • 3+ years of hands-on DevOps or Cloud Engineering experience in production environments.
  • Strong AWS fundamentals: VPC, EC2, ECS/EKS, RDS, S3, IAM, CloudWatch, Route 53, ALB/NLB.
  • Production experience with Terraform, including modules, remote state, drift management, and multi-environment patterns.
  • Comfort with Docker, container registries (ECR), and image security and patching practices.
  • CI/CD experience with Jenkins or GitHub Actions, including secret handling and approval gates.
  • Solid grasp of cloud security: IAM, secrets management, network policies, TLS, and least-privilege design.
  • Exposure to vulnerability scanning, container hardening, and security best practices in CI/CD pipelines.
  • Scripting in Python or Bash for automation, glue, and small internal tools.
  • Solid Linux system administration and networking fundamentals.
Unico Connect Private Limited
Mumbai
4 - 8 yrs
Best in industry
MLOps
LangGraph
OpenTelemetry
LLMOps
Amazon Web Services (AWS)
+6 more

Senior MLOps Engineer

LLM Operations, Observability & Eval Infrastructure

📍 Mumbai (On-site) | Full-time | 5-7 years


About the Role:

Unico Connect is an AI-first technology partner that builds custom mobile, web, and AI products for clients across multiple geographies.

We are hiring a Senior MLOps Engineer for a dedicated client engagement focused on building an AI-powered application builder platform. The platform consumes LLMs at scale through provider APIs.

This role owns the operational discipline around production LLM consumption - increasingly called LLMOps - covering observability, evaluation infrastructure, model lifecycle, cost operations, prompt deployment, and agent run reliability.


The mandatory requirement is hands-on production experience operating LLM-backed systems, with a strong DevOps or SRE foundation. This is not a model training or ML science role.

The work is making the system around the AI engineer's designs observable, controlled, reliable, and economically accountable. You will pair daily with the Senior AI Engineer, who designs prompts, evals, and agent behaviour - you operationalise those systems for production.

A typical week includes a tracing audit on a degraded agent run, an eval pipeline build for a new model release, a cost attribution review, and a staged prompt rollout.


Responsibilities:


Observability and Tracing

Build and own end-to-end tracing for agent runs: every prompt, response, tool call, token count, latency, and cost, linked to user session and project.

Stand up and operate LLM observability tooling (Langfuse, LangSmith, Braintrust, or Arize Phoenix).

Make debugging a single bad agent run among thousands a routine workflow through searchable traces, failure taxonomies, and dashboards segmented by task type.


Evaluation Infrastructure as a Production System

Operationalise the eval suite designed by the Senior AI Engineer: automated execution in CI on every prompt or model change, with results stored and trended over time.

Implement regression gates that block quality-degrading changes from shipping.

Build production sampling to continuously score a sample of real agent runs and catch quality drift that offline evals miss.
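
The regression gate described above can be sketched roughly as follows. This is a minimal illustration, not the client's pipeline: the metric names, the `max_drop` tolerance, and the function name `regression_gate` are all assumptions made for the example.

```python
def regression_gate(
    baseline: dict[str, float],
    candidate: dict[str, float],
    max_drop: float = 0.02,
) -> list[str]:
    """Return a list of human-readable failures; an empty list means the gate passes.

    baseline/candidate map a (hypothetical) eval metric name to a score where
    higher is better. max_drop is the largest tolerated decrease per metric.
    """
    failures: list[str] = []
    for metric, base in sorted(baseline.items()):
        new = candidate.get(metric)
        if new is None:
            # A metric that disappeared is treated as a failure, not a pass.
            failures.append(f"{metric}: missing from candidate results")
        elif base - new > max_drop:
            failures.append(f"{metric}: {base:.3f} -> {new:.3f}")
    return failures
```

In a CI job, a step could call this with stored baseline scores and the scores from the current change, then exit non-zero when the returned list is non-empty so the change cannot merge.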


Model Lifecycle Management

Pin model versions, never "latest".

Own the upgrade process: run the eval suite against new model releases and manage eval-gated migrations.

Maintain fallback chains across providers for graceful degradation or queueing during outages.

Track provider deprecation schedules and plan migrations ahead of forced cutoffs.
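
A minimal sketch of version pinning combined with a fallback chain, under stated assumptions: the provider labels, the model version strings, and the caller-supplied `call` function are hypothetical and stand in for whatever provider SDKs the platform actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ModelTarget:
    provider: str  # hypothetical provider label
    model: str     # explicit, pinned version string (never a floating alias)


class AllProvidersFailed(RuntimeError):
    """Raised when every target in the fallback chain has failed."""


def validate_pinned(chain: Sequence[ModelTarget]) -> None:
    # Reject floating aliases so an upstream release cannot change behaviour silently.
    for target in chain:
        if target.model.endswith("latest"):
            raise ValueError(f"unpinned model for {target.provider}: {target.model}")


def complete_with_fallback(
    prompt: str,
    chain: Sequence[ModelTarget],
    call: Callable[[ModelTarget, str], str],
) -> tuple[ModelTarget, str]:
    """Try each pinned target in order and return the first successful response."""
    validate_pinned(chain)
    errors: list[str] = []
    for target in chain:
        try:
            return target, call(target, prompt)
        except Exception as exc:  # sketch only: real code would catch provider-specific errors
            errors.append(f"{target.provider}/{target.model}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the target alongside the text keeps a record of which pinned model actually served the request, which is useful for tracing and audit.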


Cost Operations

Implement per-user and per-task cost attribution - token spend is the platform's largest variable cost and requires the same rigour as cloud cost management.

Set up budget alerts and anomaly detection so a single user or bug cannot burn significant spend overnight.

Monitor prompt cache hit rates and quantify savings.

Manage capacity planning around provider rate limits, including quota negotiation and throughput tiering.
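
Per-user and per-task cost attribution with a simple budget check might look like the sketch below. The prices in `PRICE_PER_1K` are invented placeholders (real rates vary by provider and model), and the record shape is an assumption for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    user_id: str
    task: str
    input_tokens: int
    output_tokens: int


# Hypothetical prices in USD per 1,000 tokens; not taken from any provider's price list.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}


def cost_usd(u: Usage) -> float:
    """Cost of one usage record under the placeholder prices above."""
    return (u.input_tokens * PRICE_PER_1K["input"]
            + u.output_tokens * PRICE_PER_1K["output"]) / 1000


def attribute(records: list[Usage]) -> dict[tuple[str, str], float]:
    """Sum spend per (user, task) pair."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for r in records:
        totals[(r.user_id, r.task)] += cost_usd(r)
    return dict(totals)


def over_budget(totals: dict[tuple[str, str], float], daily_budget_usd: float) -> list[str]:
    """Return users whose combined spend across all tasks exceeds the budget."""
    per_user: dict[str, float] = defaultdict(float)
    for (user, _task), spend in totals.items():
        per_user[user] += spend
    return sorted(u for u, s in per_user.items() if s > daily_budget_usd)
```

A scheduled job could run `over_budget` on the day's records and raise an alert for each returned user; anomaly detection would need a baseline per user, which this sketch does not attempt.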


Prompt and Configuration Deployment

Treat prompts as production artifacts: version control for prompts and agent configurations, staged rollout infrastructure (deploy a prompt change to a percentage of traffic before full rollout), A/B testing infrastructure, instant rollback, and audit history covering which prompt version served which user and when.
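
One common way to deploy a prompt change to a percentage of traffic is deterministic hash bucketing, sketched below. The function names, the experiment key, and the version labels are assumptions for illustration; the posting does not specify a mechanism.

```python
import hashlib


def bucket(user_id: str, experiment: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def select_prompt_version(
    user_id: str,
    experiment: str,
    stable: str,
    candidate: str,
    rollout_percent: int,
) -> str:
    """Serve the candidate version to roughly rollout_percent of users."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return candidate if bucket(user_id, experiment) < rollout_percent else stable
```

Because the bucket depends only on the user and the experiment key, a user keeps the same version across requests, raising the percentage only adds users to the candidate group, and setting it to 0 acts as an instant rollback. Logging the returned version with the user and timestamp would supply the audit history.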


Reliability Engineering for Agent Runs

Agent runs are long, stateful, and failure-prone.

Own retry and resume semantics so a run that fails mid-way does not restart from scratch.

Implement timeouts and circuit breakers on provider calls, dead-letter handling for failed runs, and queue and concurrency management for agent workloads.
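
A circuit breaker around provider calls can be sketched as below. This is an illustrative minimal version with assumed thresholds and class names; it expects the per-call timeout to be enforced inside the wrapped function (for example by the HTTP client), and it does not cover dead-letter handling or queueing.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when calls are rejected without contacting the provider."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("provider circuit is open")
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The injectable `clock` is there so the cool-down can be tested without sleeping; production code would leave the default.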


SLO Ownership and Incident Response

Define and track SLOs for agent run latency and completion rates.

Lead incident response when SLOs are breached.

Write postmortems.

Surface reliability risks proactively before they reach users.


Safety and Compliance Operations

Run the moderation pipeline (prompt and output classification) in production.

Monitor for abuse patterns and own incident response when the agent misbehaves at scale.

Maintain audit logs and implement data retention and residency policies for prompts and generated code as enterprise requirements emerge.


AI-Assisted Engineering Discipline

Use Claude, Cursor, and similar tools day to day for infrastructure code, scripts, and pipelines.

Set the team standard for safe use, review, and validation of AI-generated infrastructure before it ships.


Requirements:


Hands-on production ownership of LLM-backed systems in operation (mandatory).

Must have personally shipped and operated at least one LLM-powered system in production, with operational responsibility including oncall, incident response, and reliability ownership.

Alternatively: strong DevOps or SRE background with demonstrated hands-on familiarity with LLMOps tooling (Langfuse, LangSmith, Braintrust, Arize, or equivalent).

POCs and lab work do not qualify.


5+ years of overall engineering experience

With at least 2 years in DevOps, SRE, platform engineering, or LLM operations roles.

This is not an ML science role.

A DevOps or SRE background with a substantive pivot into LLMOps is a strong qualification.


Observability and Tracing Depth

Production experience with LLM observability tooling - Langfuse, LangSmith, Braintrust, or Arize Phoenix.

Comfortable instrumenting with OpenTelemetry, Prometheus, and Grafana.

Able to build and search trace pipelines, define failure taxonomies, and surface quality signals from production traffic.


CI/CD and Quality Gate Experience

Strong with GitHub Actions or GitLab CI.

Experience building automated quality gates: eval-gated pipelines, regression enforcement, or coverage gates that block degrading changes from shipping.


Cost Management and Attribution for Usage-Based Services

Experience owning cost attribution for cloud API spend or equivalent.

Comfortable with budget alerts, anomaly detection, and per-user or per-task cost breakdowns.


Reliability Engineering for Long-Running, Stateful Workloads

Experience with queues, retry patterns, idempotency, and failure recovery on asynchronous or multi-step workloads.

Comfortable defining SLOs and being accountable for them on production systems.


Multi-Provider API Management

Familiarity with LLM provider rate limits, version pinning, fallback chains, and quota management across OpenAI, Anthropic, Google, or equivalent.


Infrastructure as Code and Deployment Automation

Hands-on with Terraform or Pulumi and Docker.

AWS working knowledge (EC2, S3, IAM, EKS or ECS).

Strong with CI/CD for deploying services and configuration changes safely.


Nice to Have

  • Experience with prompt A/B testing or staged rollout infrastructure
  • Workflow orchestration (BullMQ, Temporal, Celery)
  • Content moderation pipeline experience
  • Data residency and compliance requirements for AI systems
  • Kubernetes (EKS) in production
  • AWS certifications
Remote only
5 - 9 yrs
₹15L - ₹20L / yr
React.js
NodeJS (Node.js)
Python
Grafana
Amazon Web Services (AWS)
+5 more

Overview

We are seeking a versatile Full-Stack Cloud Developer to build modern front-end Web UIs for client data presentations. In this role, you will lead the development of our internal customer interface while also functioning as a billable resource for diverse client projects. You must be agile and capable of translating complex application data logs and cloud metrics into clean, actionable dashboards. This is a client-facing position that requires strong English and presentation skills. You will be expected to interface directly with clients on special projects, presentations, and requirements gathering.


Key Responsibilities

Build modern front-end Web UIs: Design client data presentations using APIs and application data logs for both internal and external clients.

Data Visualization: Integrate observability tools like Grafana to deliver high-fidelity metrics for cloud and NOC performance tracking.

Full-Stack Development: Develop secure, multi-tenant application logic hosted in AWS and Azure environments.

Enforce Version Control: Maintain strict discipline using GitHub for all code and Terraform for infrastructure deployments.

Client Engagement: Lead presentations regarding custom-built data solutions and portal features to stakeholders.

Rigorous Documentation: Maintain detailed records of architecture, API schemas, and codebase standards for long-term maintainability.

Collaborative Execution: Work within an engineering team to ensure technical goals are met and operational friction is reduced.


Required Qualifications

Full-Stack Proficiency: Expert knowledge of modern frameworks (React, Node.js, or Python) to build data-driven applications.

Observability & Metrics: Strong experience with Grafana integration, embedding dashboards, and visualizing FinOps/NOC data.

Documentation Discipline: Proven ability to create clear technical guides for both team members and clients.

Communication: Exceptional verbal and written English skills for high-level client presentations and engagements.

Cloud Foundations: Hands-on experience with AWS and Azure services, particularly serverless (Lambda/Azure Functions).

Tooling: Proficient with GitHub and Terraform for version control and infrastructure management.


Professional Growth & Career Path

Technical Leadership: Opportunity to own the full lifecycle of mission-critical products and grow into Lead Architect roles.

Certification Support: We encourage and support growing into advanced professional-level certifications to stay ahead of the curve.

Team Culture: Participate in a culture that values collective problem-solving, mentorship, and shared technical goals.

Improving
Posted by Leena Lahari
Bengaluru (Bangalore)
4 - 8 yrs
₹12L - ₹30L / yr
Kubernetes
Google Cloud Platform (GCP)
Terraform
Helm
ArgoCD
+4 more

Site Reliability Engineer

Experience - 4 - 8 Years

Location - Bangalore (Hybrid) 


We are seeking a highly skilled Site Reliability Engineer (SRE) to design, build, and operate scalable, reliable, and secure cloud-native platforms. The ideal candidate will have strong experience with Kubernetes ecosystems, cloud infrastructure, automation, observability, and GitOps practices.


Key Responsibilities

  • Manage and optimize Kubernetes-based platforms, including Cilium, Istio, Ingress Controllers, and related ecosystem components.
  • Design, deploy, and maintain infrastructure on Google Cloud Platform (GCP).
  • Automate infrastructure provisioning and lifecycle management using Terraform.
  • Implement and manage GitOps workflows using ArgoCD and GitLab.
  • Deploy and maintain Helm charts for Kubernetes applications.
  • Manage secrets, service discovery, and distributed systems using Vault and Consul.
  • Build and maintain monitoring, logging, and observability platforms using Prometheus Operator and the Grafana Stack (Grafana, Mimir, Loki, Alloy, Tempo, and Pyroscope).
  • Collaborate with development teams to improve platform reliability, performance, scalability, and operational excellence.
  • Develop CI/CD pipelines and automation to support modern cloud-native deployments.


Required Skills

  • Strong hands-on experience with Kubernetes (K8s) and cloud-native technologies.
  • Experience with GCP, Terraform, Helm, and ArgoCD.
  • Knowledge of Service Mesh technologies, particularly Istio and Cilium.
  • Experience with Vault, Consul, and infrastructure security best practices.
  • Strong expertise in observability tools including Prometheus and the Grafana ecosystem.
  • Proficiency with GitOps, GitLab, CI/CD pipelines, and automation.
  • Good understanding of Linux systems, networking, and troubleshooting in distributed environments.


Preferred Qualifications

  • Experience operating large-scale production environments.
  • Knowledge of SRE principles, incident management, capacity planning, and reliability engineering.
  • Relevant cloud-native certifications (CKA, GCP, Terraform, etc.) are a plus.


VDart Technology
Bengaluru (Bangalore)
7 - 10 yrs
₹25L - ₹35L / yr
Azure
DevOps
Terraform
Docker

Role: Azure DevOps Consultant

Location: Bangalore

Full-time

Work Mode: 5 Days Work From Office.


Responsibilities:

· Design and architect CI/CD pipelines and development automation solutions.

· Lead implementation of build versioning, packaging, and dependency management strategies

· Define standards for code quality, security scanning, and automated testing integration

· Create reference architectures for development workflows and automation patterns

· Collaborate with security teams to implement "shift-left" security practices

· Guide implementation of A/B testing and feature flagging architectures


Skills:

· Strong expertise in Azure DevOps, PaaS, Git, and CI/CD practices

· Deep understanding of SDLC and automation

· Containerization (Docker, Kubernetes)

· Security scanning tools (Veracode, SonarQube)

· Build tools and artifact management

· IaC (Terraform, ARM templates)


Searce Inc

Posted by Karthika Senthilkumar
Pune
7 - 12 yrs
Best in industry
Google Cloud Platform (GCP)
Terraform
Kubernetes
GKE
Reliability engineering
+1 more

About Searce

Searce is a global, AI-native, engineering-led technology consultancy and a Premier Google Cloud Partner — recognized as the Google Cloud Workplace AI Transformation Partner of the Year, APAC (2026). With 20+ years of experience and 3,000+ clients across 10+ countries, we help businesses stay ahead of the cloud curve.


The Role

We're looking for a Lead Cloud Security & Reliability Engineer with deep GCP expertise to own end-to-end cloud reliability and security for APAC enterprise clients. As Lead, you'll set the architectural direction, mentor your squad, and drive measurable client outcomes across multi-cloud environments.


What You'll Do

Own Client Delivery — Lead 24x7 GCP cloud operations for APAC clients. Define SLO frameworks and ensure adherence.


Architect Solutions — Design scalable, secure GCP-primary architectures with multi-cloud awareness.


Drive Reliability — Lead incident response, RCA, and long-term remediation across production systems.


Mentor & Elevate — Coach and grow a squad of Senior CSREs.


Drive FinOps — Own cloud cost governance and optimization with quantified impact.


Be the Expert — Represent Searce's technical depth in global client conversations.


What We're Looking For

Experience

7–12 years total with 5+ years on GCP cloud infrastructure

Strong background in Cloud Managed Services / MSP environments

Proven experience leading a team in client-facing delivery

Multi-cloud exposure (AWS/Azure secondary) preferred


Technical Skills (Must-Have)

  • GCP: GKE, IAM, VPC, Cloud Monitoring, Stackdriver, KMS — demonstrated in work experience
  • Kubernetes: GKE — production cluster management, Helm
  • IaC: Terraform — module-level, reusable frameworks
  • Observability: Prometheus, Grafana, Thanos or equivalent
  • Security: IAM, Zero-trust, DevSecOps, CSPM tools
  • Scripting: Python or Go
  • FinOps: GCP cost governance demonstrated


Nice to Have

  • GCP Professional Cloud Architect / Pro DevOps Engineer certification
  • AWS / Azure secondary experience
  • CKA (Certified Kubernetes Administrator)
  • ITIL / change management awareness
  • APAC client delivery experience


Why Searce?

🏆 Google Cloud Partner of the Year — APAC 2026

🌍 Work with APAC enterprise clients across multiple industries

🤖 AI-first, engineering-led culture

📈 Lead-level ownership with real career growth

🤝 HAPPIER values — Humble, Adaptable, Positive, Passionate, Innovative, Excellence, Responsible

Searce Inc

Posted by Mohammed Rabidheen
Gurugram
5 - 10 yrs
Best in industry
Amazon Web Services (AWS)
Amazon EKS
Terraform
Amazon VPC
AWS IAM

About the Company

Searce is a global, AI-native, engineering-led technology consultancy and a Premier Google Cloud Partner — recognized as the Google Cloud Workplace AI Transformation Partner of the Year, APAC (2026). With 20+ years of experience and 3,000+ clients across 10+ countries, we help businesses stay ahead of the cloud curve.


About the Role

We're looking for a Lead Cloud Security & Reliability Engineer with deep AWS expertise to own end-to-end cloud reliability and security for enterprise clients. As Lead, you'll drive architecture decisions, mentor your squad, and deliver measurable client outcomes.


Responsibilities

  • Own Client Delivery — Lead 24x7 AWS cloud operations. Define SLO frameworks and ensure adherence.
  • Architect Solutions — Design scalable, secure AWS architectures for enterprise clients.
  • Drive Reliability — Lead incident response, RCA, and long-term remediation across production systems.
  • Mentor & Elevate — Coach and grow a squad of Senior CSREs.
  • Drive FinOps — Own AWS cost governance and optimization with quantified impact.
  • Be the Expert — Represent Searce's technical depth in client conversations.


Qualifications

Experience: 6–10 years total with 5+ years on AWS cloud infrastructure.


Required Skills

  • Strong background in Cloud Managed Services / MSP environments.
  • Proven experience leading a team in client-facing delivery.


Technical Skills (Must-Have):

  • AWS: EKS, EC2, VPC, IAM, Lambda, CloudWatch, GuardDuty, Security Hub — demonstrated in work experience.
  • Kubernetes: EKS — production cluster management.
  • IaC: Terraform or CloudFormation — module level.
  • Observability: CloudWatch + Prometheus/Grafana.
  • Security: IAM, GuardDuty, WAF, Secrets Manager, DevSecOps.
  • Scripting: Python or Bash.
  • FinOps: AWS Cost Management, Reserved Instances, Savings Plans.


Preferred Skills

  • AWS Solutions Architect Professional / DevOps Engineer certification.
  • Multi-cloud exposure.
  • CKA (Certified Kubernetes Administrator).
  • ITIL / change management awareness.
VDart Digital
Posted by Niharikhaa R
Bengaluru (Bangalore)
10 - 20 yrs
₹40L - ₹50L / yr
Agentic AI
DevOps
Microsoft Windows Azure
Azure OpenAI
Terraform

We are seeking a highly experienced Azure AI, AIOps & MLOps Architect to lead enterprise-scale AI platform engineering, cloud modernization, DevSecOps transformation, and intelligent automation initiatives.


The ideal candidate should possess deep expertise in Microsoft Azure, Azure AI Foundry, Azure OpenAI, Azure Machine Learning, Kubernetes, Terraform, Azure DevOps, and enterprise observability platforms. The role will focus on designing scalable AI platforms, implementing MLOps and AIOps capabilities, enabling Agentic AI architectures, and driving cloud-native engineering practices across the organization.


Key Responsibilities


Cloud Architecture & Engineering


• Design and implement scalable, secure, and highly available solutions on Microsoft Azure.

• Define cloud architecture standards, reference architectures, and best practices.

• Lead cloud migration and modernisation initiatives across enterprise workloads.

• Implement multi-region disaster recovery and business continuity strategies.

• Oversee Azure networking, identity, security, and governance frameworks.


DevOps & CI/CD


• Architect and implement end-to-end CI/CD pipelines using Azure DevOps or GitHub Actions.

• Drive DevSecOps culture — embedding security scanning, quality gates, and compliance into the delivery pipeline.

• Champion Infrastructure-as-Code (IaC) practices using Terraform, Bicep, or ARM templates.

• Establish branching strategies, release management, and environment promotion standards.

• Define and enforce platform engineering standards and internal developer tooling.


AI & Machine Learning Integration


• Architect AI/ML solutions leveraging Azure AI services — Azure OpenAI, Azure Machine Learning, Azure AI Foundry, and Cognitive Services.

• Design intelligent automation and agentic workflows integrated into enterprise DevOps processes.

• Implement AI-powered capabilities such as code review assistance, anomaly detection, predictive analytics, and natural language automation.

• Define AI governance frameworks: model evaluation, prompt management, responsible AI, and cost controls.

• Design and implement enterprise MLOps frameworks.

• Build automated model training, validation, deployment, and monitoring pipelines.

• Establish model governance and lifecycle management.


Generative AI & Agentic AI


• Design enterprise GenAI solutions using Azure OpenAI.

• Build AI Agents using Azure AI Foundry.

• Develop Agent-to-Agent communication patterns.

• Implement Retrieval Augmented Generation (RAG) architectures.

• Build enterprise Knowledge Management and AI Skill Registry platforms.

• Design multi-agent orchestration frameworks.


Leadership & Stakeholder Engagement


• Serve as the technical authority and subject matter expert for Azure AI and DevOps practices.

• Mentor and guide junior architects, developers, and DevOps engineers.

• Collaborate with business stakeholders, product owners, and vendors to translate requirements into technical solutions.

• Produce architecture documentation, decision records (ADRs), and roadmaps.

• Represent the technology function in enterprise architecture forums and governance boards.


Required Qualifications


• Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

• 10+ years of experience in cloud engineering and architecture.

• 5+ years of hands-on experience with Microsoft Azure across compute, networking, storage, identity, and data services.

• Proven experience designing and implementing enterprise-grade CI/CD pipelines.

• Strong hands-on expertise with Infrastructure-as-Code (Terraform, Bicep, or ARM).

• Demonstrated experience architecting and deploying AI/ML solutions on Azure (Azure OpenAI, Azure ML, AI Foundry).

• Deep knowledge of DevSecOps principles, tools, and practices.

• Experience with containerisation and orchestration: Docker, Kubernetes (AKS).

• Proficiency in scripting and development: Python, PowerShell, Bash.

• Excellent communication and stakeholder management skills.


Preferred Qualifications


• Microsoft Certified: Azure Solutions Architect Expert.

• Microsoft Certified: DevOps Engineer Expert.

• Microsoft Certified: Azure AI Engineer Associate.

• Experience with Azure API Management (APIM), Event Grid, and Azure Functions.

• Familiarity with Datadog, Prometheus, or equivalent observability platforms.

• Experience in the real estate, retail, or enterprise industry sector.

• Knowledge of agentic AI frameworks and LLM orchestration patterns (LangChain, Semantic Kernel, MCP).

• Background in building Internal Developer Platforms (IDP).

Pune
4 - 7 yrs
Best in industry
DevSecOps
Amazon Web Services (AWS)
DevOps
GitHub Actions
SonarQube
+18 more

About NonStop io Technologies

NonStop io Technologies is a value-driven company with a strong focus on process-oriented software engineering. We specialize in Product Development and have a decade's worth of experience in building web and mobile applications across various domains. NonStop io Technologies follows core principles that guide its operations and believes in staying invested in a product's vision for the long term. We are a small but proud group of individuals who believe in the 'givers gain' philosophy and strive to provide value in order to seek value. We are committed to and specialize in building cutting-edge technology products and serving as trusted technology partners for startups and enterprises. We pride ourselves on fostering innovation, learning, and community engagement. Join us to work on impactful projects in a collaborative and vibrant environment.


Brief Description

We are looking for a skilled DevSecOps Engineer who can help design, automate, and secure cloud-native platforms for healthcare and life sciences clients. The ideal candidate will have hands-on experience with cloud security, infrastructure automation, CI/CD pipelines, compliance controls, and platform operations in regulated environments.


You will work closely with engineering teams, architects, security stakeholders, and client representatives to build secure-by-design systems that meet healthcare security and compliance requirements. Experience supporting AI/ML platforms, healthcare data platforms, or regulated workloads is highly desirable.


Roles and Responsibilities

  • Design and implement security controls aligned with healthcare regulations, including HIPAA, HITRUST, and industry security best practices
  • Ensure secure handling of Protected Health Information (PHI), Personally Identifiable Information (PII), and sensitive healthcare datasets
  • Support client security reviews, vendor assessments, penetration testing remediation, and compliance audits
  • Partner with engineering teams to establish secure SDLC practices and shift-left security initiatives
  • Implement cloud governance policies, security baselines, and compliance automation across multiple client environments
  • Build and maintain audit-ready logging, monitoring, and evidence collection mechanisms
  • Support disaster recovery, business continuity, and security incident response processes
  • Collaborate with healthcare product teams working on FHIR APIs, healthcare integrations, clinical applications, genomics platforms, or AI-enabled healthcare solutions
  • Experience working with healthcare, life sciences, biotech, genomics, digital health, or regulated SaaS platforms is strongly preferred
  • Understanding of PHI, PII, healthcare security controls, and healthcare compliance requirements
  • Familiarity with healthcare interoperability standards such as FHIR, HL7, SMART on FHIR, or healthcare APIs is a plus
  • Experience securing healthcare data platforms, analytics environments, AI/ML workloads, or regulated cloud environments is highly desirable
  • Ability to work directly with client stakeholders and communicate security risks, recommendations, and remediation plans
  • Experience participating in security assessments, audits, compliance reviews, and client-facing technical discussions
  • Strong documentation and security governance skills


Requirements

  • 4–7 years of experience in DevOps, DevSecOps, SRE, or Platform Engineering
  • Strong experience with AWS, Azure, or GCP and cloud security best practices
  • Hands-on experience with CI/CD tools such as Jenkins, GitHub Actions, GitLab CI, or Azure DevOps
  • Experience with security tools, including SonarQube, Snyk, Checkmarx, Fortify, Veracode, or similar platforms
  • Strong understanding of vulnerability management, IAM, threat detection, and security scanning
  • Experience implementing compliance controls aligned with one or more of the following frameworks:
    • HIPAA
    • HITRUST
    • SOC 2
    • ISO 27001
    • NIST Cybersecurity Framework
    • PCI-DSS (where applicable)
    • FDA-regulated software environments (preferred)
  • Proficiency with Terraform, CloudFormation, ARM, Docker, Kubernetes, Linux, and shell scripting
  • Experience with monitoring and observability tools such as Prometheus, Grafana, ELK, or Datadog
  • Exposure to MLOps/AI platforms, model deployment, or AI workload management is desirable
  • Strong troubleshooting, automation, networking, and cloud security skills


Why Join Us?

  • Opportunity to work on a cutting-edge healthcare product
  • A collaborative and learning-driven environment
  • Exposure to AI and software engineering innovations
  • Excellent work ethic and culture

If you're passionate about technology and want to work on impactful projects, we'd love to hear from you!


Whitefield Careers
Whitefield Team
Posted by Whitefield Team
Bengaluru (Bangalore)
6 - 12 yrs
₹15L - ₹25L / yr
DevOps
Amazon Web Services (AWS)
grafana
prometheus
Terraform
+2 more

Required Skills

● Experience: Minimum of 5 years of professional experience in a DevOps Engineer role.

● Cloud Proficiency: Proven experience with at least one major cloud provider (AWS, Azure, or GCP).

● Scripting & Programming: Strong scripting skills in languages such as Bash, Python, or Go.

● IaC Tools: Hands-on experience with Terraform.

● Container Technology: Expertise in Docker and Kubernetes.

● CI/CD Tools: Proficient with CI/CD platforms like Jenkins, GitLab CI, or Travis CI.

● Configuration Management: Experience with configuration management tools like Ansible, Chef, or Puppet.

● Version Control: Strong knowledge of Git and branching strategies.

● Problem-Solving: Excellent problem-solving abilities and a commitment to automation and continuous improvement.

NeoGenCode Technologies Pvt Ltd
Akshay Patil
Posted by Akshay Patil
Remote only
6 - 10 yrs
₹5L - ₹11L / yr
AWS Aurora PostgreSQL
PostgreSQL
AWS RDS
SQL
Performance tuning
+21 more

Job Title : DBA – AWS Aurora PostgreSQL

Experience : 6+ Years

Work Mode : Remote

Laptop Pickup : One-time pickup from Noida, Bengaluru, Pune, Hyderabad, or Chennai

Contract Duration : 6 Months


Job Summary :

We are seeking an experienced AWS Aurora PostgreSQL DBA to manage, optimize, and support enterprise database environments hosted on AWS.

The ideal candidate should have strong expertise in PostgreSQL, AWS RDS/Aurora PostgreSQL, database performance tuning, high availability, disaster recovery, security, and production support.


Key Skills Required :

AWS Aurora PostgreSQL, PostgreSQL DBA, AWS RDS, SQL Performance Tuning, Query Optimization, Indexing, Execution Plan Analysis, High Availability (HA), Disaster Recovery (DR), Multi-AZ, Read Replicas, CloudWatch, Aurora Performance Insights, Backup & Recovery, Database Security, IAM, Audit Logging, Production Support, RCA, Python/Shell/PowerShell Scripting, Terraform/CloudFormation, AWS DMS, CI/CD, Git.


Mandatory Skills :

  • 4 to 5+ years of experience in PostgreSQL Database Administration.
  • 3 to 5+ years of hands-on experience with AWS RDS and Aurora PostgreSQL.
  • Strong SQL expertise, including query optimization, indexing, and execution plan analysis.
  • Experience with Aurora clustering, Multi-AZ deployments, read replicas, failover management, backup, and recovery.
  • Hands-on experience with AWS CloudWatch and Aurora Performance Insights.
  • Strong understanding of database security, encryption, IAM controls, audit logging, and compliance practices.
  • Experience in production support, incident management, troubleshooting, and root cause analysis (RCA).
  • Scripting/automation experience using Python, Shell Script, or PowerShell.
  • Experience with Infrastructure as Code (Terraform or CloudFormation).
  • Knowledge of AWS DMS, DevOps practices, CI/CD pipelines, and Git.


Preferred Skills :

  • Experience supporting Master Data Governance (MDG) or enterprise data platforms.
  • Experience working in healthcare or other highly regulated environments.
  • Exposure to AWS services such as Redshift, DynamoDB, and S3.
  • AWS Certified Database Specialty or AWS Solutions Architect certification is preferred.

Key Responsibilities :

  • Manage and support AWS Aurora PostgreSQL environments, including provisioning, configuration, monitoring, scaling, and maintenance.
  • Optimize database performance through query tuning, indexing, and proactive monitoring.
  • Implement and maintain high availability, backup, recovery, and disaster recovery solutions.
  • Troubleshoot production issues, perform root cause analysis, and ensure database reliability and availability.
  • Collaborate with application, cloud, and data engineering teams to support deployments and production releases.
  • Automate routine DBA activities and improve operational efficiency using scripting and AWS-native tools.

Qualifications :

  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.
  • Strong analytical, troubleshooting, and communication skills.
  • Ability to work independently and manage multiple priorities in a fast-paced environment.
The supreme consultancy
Pramila Ranjane
Posted by Pramila Ranjane
Remote only
7 - 9 yrs
₹27L - ₹45L / yr
Fullstack Developer
Python
React.js
Javascript
TypeScript
+13 more

Role & Responsibilities


Responsibilities


• Business: Immerse in operations until you think like an insider. Rapidly acquire domain expertise through direct observation, translate between business and engineering seamlessly, and mentor engineers in your area on immersion. Influence senior stakeholders effectively, manage complex stakeholder landscapes with competing agendas, and build trust rapidly with new stakeholders.


• Delivery: Lead rapid delivery initiatives across teams in your area, coach on prototype-first approaches, and establish trust through consistent fast delivery. Build complete applications rapidly across any technology stack, select the right tools for each problem, and define clear criteria for prototype-to-production transitions.


• Generative AI: Architect RAG systems for complex use cases across teams, implement advanced techniques (hybrid search, reranking, query expansion), mentor engineers on RAG best practices, and establish RAG standards. Lead evaluation strategy across teams, establishing annotation guidelines, training human-calibrated LLM judges, and building evaluation pipelines that connect tracing to datasets to experiments.


• People: Build high-performing teams across your area, navigate complex interpersonal dynamics, foster psychological safety, and create environments where diverse perspectives are valued. Influence through communication at all levels — from frontline to executive. Handle difficult conversations skilfully and train engineers in your area on effective communication.


• AI-Augmented Development: Optimise AI tool usage across teams in your area, train engineers on AI-augmented and agentic engineering workflows, evaluate new AI development tools, and establish practices that balance AI speed with verification rigour.


• Scale: Design complex multi-component systems end-to-end, evaluate architectural options for large initiatives across teams, guide technical decisions for your area, and mentor engineers on architecture. Create debt reduction strategies across teams, influence roadmap decisions to include debt work, and teach engineers when to accept debt for speed versus when to invest in quality.


• Documentation: Define documentation standards across teams in your area, create documentation systems and templates, train engineers on spec-driven development, and ensure documentation quality across projects. Lead pattern generalization initiatives, defining criteria for when to generalize versus keep custom.


• Reliability: Define reliability standards across teams in your area, drive post-incident improvements systematically, design capacity planning processes, and mentor engineers on SRE practices.


Ideal Candidate


  • Strong Staff Software Engineer / FDE profile (full-stack + production GenAI, multi-team technical leadership)
  • Mandatory (Experience 1) – Must have 7+ years of relevant professional software engineering experience, with demonstrated full-stack delivery across backend and frontend.
  • Mandatory (Experience 2) – Must have deep production experience with Python AND JavaScript/TypeScript, working comfortably across the full stack.
  • Mandatory (Experience 3) – Must have 2+ years of experience in generative AI application development: LLM integrations, vector databases, RAG systems, and evaluation pipelines
  • Mandatory (Experience 4) – Must have strong experience with modern frontend frameworks (Next.js / React) and backend API development.
  • Mandatory (Experience 5) – Must have extensive experience with cloud platforms (AWS preferred; Azure/GCP valued), including infrastructure-as-code (CloudFormation / Terraform).
  • Mandatory (Experience 6) – Must have working knowledge of multiple database paradigms — relational (PostgreSQL), document, and key-value (Redis) — with ability to select the right storage per problem.
  • Mandatory (Experience 7) – Must have strong experience with CI/CD pipelines (e.g. GitHub Actions), containerization, and production deployment strategies.
  • Mandatory (Experience 8) – Must have demonstrable fluency with AI coding tools (Claude Code, Cursor, GitHub Copilot, or similar) and proven ability to design agentic engineering workflows and train teams on them
  • Preferred (Experience) – Advanced RAG techniques — hybrid search, reranking, query expansion — and establishing RAG standards across teams


Maropost
Rishika Mehra
Posted by Rishika Mehra
Chandigarh, Mohali
5 - 15 yrs
₹25L - ₹32L / yr
Google Cloud Platform (GCP)
Python
Bash
Go Programming (Golang)
Kubernetes
+3 more

Everything we do is for our customers!


Featured on Deloitte's Technology Fast 500 list and G2's leaderboard, Maropost offers a unified commerce experience that our customers need, transforming ecommerce, retail, marketing automation, merchandising, helpdesk and AI operations with one platform designed to scale for fast-growing businesses. With a relentless focus on our customers’ success, we are motivated by customer obsession, extreme urgency, excellence and resourcefulness to power 5,000+ global brands while we head to 100,000+.


Driven by the same customer-centric mentality as above, we empower businesses to achieve their goals and grow alongside us. If you're a driver and not a passenger, and are ready to make a significant impact and be part of our transformative journey, Maropost is the place for you.


The Opportunity:


Thrive on change and grow beyond limits! We are looking for a bold thinker who sees a chance to learn and define what's possible with every challenge! Ready to make an impact? Welcome to Maropost, where you can turn ideas into action!


We are seeking an experienced Senior Platform Engineer to join our growing team responsible for building, scaling, and maintaining the core infrastructure of our SaaS product on Google Cloud Platform (GCP). You will architect and implement robust, secure, and scalable systems using Kubernetes for orchestration and Terraform for infrastructure-as-code.


What you'll be responsible for:


  • Design, implement, and optimize cloud-native infrastructure on GCP to support our SaaS product. 
  • Lead the deployment and management of Kubernetes clusters, ensuring high availability and security. 
  • Develop and maintain Terraform scripts for automated provisioning of cloud resources and environments. 
  • Collaborate with development and product teams to deliver scalable and reliable solutions. 
  • Establish CI/CD pipelines and automate infrastructure deployment and application delivery workflows. 
  • Set up and monitor observability solutions (logging, monitoring, alerting) to ensure operational excellence. 
  • Identify and resolve performance bottlenecks, contributing to continuous platform optimization. 
  • Drive adoption of DevOps practices and mentor junior engineers in Kubernetes, GCP, and Terraform best practices. 
  • Participate in on-call rotations and support incident response as a senior technical resource in a 24/7 rotational environment.


What you'll bring to Maropost:


  • 5+ years of experience in platform engineering, SRE, or DevOps, with a focus on cloud-native SaaS environments. 
  • Strong hands-on expertise with GCP services (Compute Engine, GKE, Cloud SQL, IAM, etc.). 
  • Advanced skills in Kubernetes cluster management, workload orchestration, and troubleshooting. 
  • Deep proficiency in Terraform and infrastructure-as-code concepts for cloud automation. 
  • Experience designing and maintaining CI/CD pipelines (Bitbucket, Buildkite, Argo CD, or similar).
  • Proficient in scripting/programming (Python, Go, or Bash). 
  • Solid understanding of networking, security, and compliance in cloud environments. 
  • Familiarity with monitoring and observability tools (Prometheus, Grafana, Stackdriver, etc.).
  • Excellent problem-solving, communication, and team collaboration skills. 
  • Exposure to service mesh (Istio, Linkerd), cloud cost optimization, or FinOps. 
  • Experience with other public clouds (AWS, Azure). 
  • Certifications in GCP, Kubernetes, or Terraform. 
  • You exemplify Maropost’s Values:
    • Customer Obsessed
    • Extreme Urgency
    • Excellence
    • Resourceful


Message from the Founders: Maropost is looking for builders - people who want to drive our business forward at all costs in order to achieve the goals we have, both short and long term, for the results and outcomes that will bring to us all.

 

If that isn't for you, that’s OK; for those of you for whom it is, please get in touch with us!

Bengaluru (Bangalore)
5 - 12 yrs
₹20L - ₹45L / yr
Javascript
Java
NodeJS (Node.js)
Python
Go Programming (Golang)
+12 more

Job Title : Generalist Fullstack Engineer (Java / Node.js / React / AI-Driven Engineering)

Experience : 5 to 12 Years

Location : Bengaluru (Koramangala) – Hybrid (3 Days WFO)

Employment Model : C2H

Joining : Immediate to 20 Days Notice Only

Open Positions : 4


About the Role :

We are hiring a hands-on Generalist Fullstack Engineer for a high-impact engineering team. This is a 90% coding role focused on building scalable products across frontend, backend, cloud, and modern AI-enabled engineering practices.


Mandatory Skills :

Full Stack Development (Frontend + Backend), JavaScript / Java / Node.js / Python / Go, ReactJS or modern UI frameworks, Microservices, AWS, Docker, Kubernetes, SQL/NoSQL, CI/CD, Terraform (IaC), Automated Testing (TDD), AI-enabled software delivery, and strong hands-on coding experience.


Key Requirements :

  • 5 to 12 years of software development experience.
  • Strong expertise in JavaScript, Java, Node.js, Python, or Go.
  • Hands-on experience with ReactJS / modern frontend frameworks.
  • Strong backend depth with microservices & event-driven architecture.
  • Experience with AWS, Docker, Kubernetes.
  • Exposure to AI tools/frameworks in software delivery.
  • Strong understanding of SQL/NoSQL databases & data modelling.
  • Experience with CI/CD, automated testing, Terraform (IaC).
  • Working knowledge of TDD, pair programming, continuous integration, security best practices.


Roles & Responsibilities :

  • Develop and own production-grade full-stack applications.
  • Build AI-first engineering workflows to improve delivery efficiency.
  • Drive architecture decisions and engineering best practices.
  • Deliver scalable cloud-native solutions.
  • Collaborate across product, QA, design, and architecture teams.
  • Improve system reliability through monitoring, automation, and testing.

Ideal Candidate :

✔ Strong end-to-end ownership (Frontend + Backend).

✔ Deep hands-on coding mindset (not people-management focused).

✔ Strong AWS + SQL exposure.

✔ Experience building and managing UI state and frontend components.

✔ Stable career progression (avoid frequent job changes).


Interview Process :

  1. Technical Round – GT (Assemble)
  2. Technical Round – Client
  3. Technical Round – Client
  4. HR Discussion & Offer
Digital Product Engineering company

Agency job
via Unique Occupational by Mantasha Naaz
Bengaluru (Bangalore)
9 - 11 yrs
₹32L - ₹35L / yr
Debian Linux
Edge technology
Proxmox
Linux administration
DevOps
+6 more

Associate Principal Engineer, Linux Administrator

Location: Bengaluru, India (Hybrid)

Employment Type: Full-time

Experience: 9-11 years



Job description

REQUIREMENTS:


  • Strong experience in DevOps, Platform Engineering, and Infrastructure Automation
  • Deep hands-on expertise in Linux Administration (RHEL, CentOS, Ubuntu) – OS hardening, security, patching, and performance management (Must Have)
  • Strong experience with Cloud Technologies – Public & Private Cloud environments (Must Have)
  • Hands-on experience with Infrastructure as Code (IaC) using Terraform (Must Have)
  • Strong automation expertise using Ansible for configuration management and infrastructure provisioning (Must Have)
  • Experience building and managing CI/CD pipelines and end-to-end deployment automation
  • Strong experience with Kubernetes administration, orchestration, and cluster management (Must Have)
  • Hands-on experience with Docker containerization and Helm package management
  • Experience managing large-scale development and infrastructure environments
  • Strong understanding of Networking concepts, connectivity, design, troubleshooting, and network automation
  • Experience with Observability & Monitoring tools and best practices
  • Experience with Proxmox virtualization platform administration and management
  • Knowledge of Edge Technologies and distributed infrastructure environments
  • Basic understanding and administration of Active Directory (AD)
  • Experience implementing AI-driven Automation solutions and operational efficiencies
  • Strong understanding of infrastructure security, compliance, and governance
  • Experience working in Agile/Scrum environments
  • Strong troubleshooting, analytical, and problem-solving skills
  • Excellent communication and stakeholder management skills

RESPONSIBILITIES:

  • Design, build, and manage scalable infrastructure platforms across cloud and on-premise environments
  • Administer and maintain Linux servers including security hardening, patching, performance tuning, and troubleshooting
  • Develop and manage Infrastructure as Code (IaC) solutions using Terraform
  • Automate infrastructure provisioning, configuration management, and operational tasks using Ansible
  • Design, implement, and maintain CI/CD pipelines for application and infrastructure deployments
  • Deploy, manage, and optimize Kubernetes clusters and containerized workloads
  • Manage Docker environments and Helm-based application deployments
  • Design and implement network solutions ensuring security, reliability, and scalability
  • Monitor infrastructure health, performance, and availability using observability and monitoring tools
  • Manage and support Proxmox virtualization environments
  • Implement AI-driven automation initiatives to improve operational efficiency and reduce manual effort
  • Support edge infrastructure deployments and distributed computing environments
  • Collaborate with development, security, and operations teams to deliver reliable platform services
  • Troubleshoot production incidents and perform root cause analysis
  • Define infrastructure standards, automation frameworks, and operational best practices
  • Ensure high availability, scalability, security, and reliability of infrastructure platforms
  • Mentor junior engineers and provide technical leadership on DevOps and platform engineering initiatives
  • Participate in Agile ceremonies and contribute to continuous improvement initiatives
  • Work closely with stakeholders to understand infrastructure requirements and deliver optimal solutions

Qualifications

Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field

Wissen Technology
Shakthi M
Posted by Shakthi M
Bengaluru (Bangalore)
3 - 9 yrs
Best in industry
Azure
Python
Terraform
SRE
Windows Azure


  • Strong hands-on experience in Microsoft Azure Cloud.
  • Good understanding of Azure services such as Compute, Storage, Event Hub, Event Subscription, Storage Queue, and PaaS services.
  • Basic understanding of Azure AI Foundry and AI-related Azure service setup.
  • Good Azure networking basics: VNet, subnet, routing, and basic troubleshooting.
  • Strong knowledge of Terraform, especially:
    • Terraform state
    • plan / apply
    • troubleshooting failures
    • migration risks
    • Terraform Enterprise concepts
  • Strong Python coding capability, not just basic scripting.
  • Experience using Python for API integration, automation, JSON/YAML handling, and internal tooling.
  • Good understanding of CI/CD pipelines.
  • Ability to troubleshoot pipeline failures.
  • Comfortable with YAML and JSON.
  • Ability to troubleshoot Azure infrastructure/platform issues.
  • Ability to collect logs/evidence and coordinate with network/app/Microsoft support teams.
  • Basic awareness of agentic AI / LLM concepts.
  • Awareness of security and cost best practices.

Good to Have Skills

  • Hands-on experience with Harness.
  • Hands-on experience with Terraform Enterprise.
  • Exposure to LangGraph / LangChain.
  • Exposure to agentic AI workflows or skill creation.
  • Exposure to Claude or enterprise LLM integrations.
  • Knowledge of Azure ML Workspace, model registry, and managed endpoints.
  • MLOps / LLMOps knowledge.
  • FinOps / Azure cost optimization experience.
  • Azure certifications: AZ-104, AZ-305, AZ-400, AZ-500.

 

Fonada
Karandeep Singh
Posted by Karandeep Singh
Noida
5 - 8 yrs
₹15L - ₹20L / yr
DevOps
Amazon Web Services (AWS)
Microsoft Windows Azure
Google Cloud Platform (GCP)
VMware vSphere
+8 more


About the Role 

We are looking for a Senior DevOps Engineer to lead the design, automation, and scaling of our hybrid cloud infrastructure spanning public cloud and private/on-premises environments. You will partner closely with software engineering, security, and product teams to build reliable, secure, and high-performance systems that support rapid product delivery. This is a hands-on role with significant influence over our infrastructure strategy, deployment workflows, and engineering culture. 


Key Responsibilities 

  • Architect, deploy, and maintain scalable, highly available infrastructure across both public cloud (AWS, Azure, GCP) and private cloud platforms (OpenStack, VMware vSphere/Tanzu, Nutanix, or similar). 
  • Operate and maintain on-premises infrastructure: hypervisors, compute, storage (Ceph, NetApp, SAN/NAS), networking (SDN, VLANs, BGP, MPLS), and hardware capacity planning, alongside their public cloud equivalents. 
  • Design and own CI/CD pipelines that deploy seamlessly across public and private environments. 
  • Implement and manage Infrastructure as Code (Terraform, Ansible, Pulumi) with strong version control and review practices, using providers for both public and private cloud platforms. 
  • Manage container orchestration (Kubernetes, ECS, OpenShift, Rancher) across managed cloud services and self-managed/bare-metal clusters, including upgrades, autoscaling, and workload reliability. 
  • Build observability into all systems through logging, metrics, tracing, and alerting (Prometheus, Grafana, Datadog, ELK, or similar) with unified visibility across hybrid environments. 
  • Champion security best practices: secrets management, IAM hardening, network segmentation, vulnerability scanning, and compliance (SOC 2, ISO 27001, HIPAA, or data-sovereignty requirements). 
  • Lead incident response, root-cause analysis, and post-mortems; drive long-term reliability improvements and SLO/SLA adherence. 
  • Optimize cost, capacity, and resource utilization across public cloud spend and on-premises hardware without compromising performance or availability. 
  • Partner with data center operations and network providers on hardware provisioning, firmware management, MPLS circuit management, and lifecycle planning. 
  • Mentor junior DevOps and software engineers; promote DevOps culture, automation-first thinking, and shared ownership of production. 
  • Evaluate and introduce new tools, platforms, and processes that improve developer productivity and system reliability. 

Required Qualifications 

  • 5+ years of experience in DevOps, SRE, or Platform Engineering roles, with at least 2 years at a senior level. 
  • Deep expertise with at least one major public cloud provider (AWS, Azure, or GCP) in production. 
  • Hands-on experience operating private cloud or virtualization platforms (OpenStack, VMware, Nutanix, or equivalent) in production. 
  • Strong experience with virtualization, storage systems, and enterprise networking in on-premises environments. 
  • Strong hands-on experience with Kubernetes in production, including both managed cloud and self-managed/bare-metal clusters. 
  • Proficiency in Infrastructure as Code (Terraform and Ansible strongly preferred). 
  • Solid scripting and programming skills in Python, Go, Bash, or similar. 
  • Experience designing and operating CI/CD pipelines using tools such as GitHub Actions, GitLab CI, Jenkins, CircleCI, or ArgoCD. 
  • Strong Linux systems administration and networking fundamentals (TCP/IP, DNS, load balancing, VPNs, firewalls, routing, MPLS). 
  • Experience with monitoring and observability stacks (Prometheus, Grafana, Datadog, New Relic, ELK, or OpenTelemetry). 
  • Proven track record of leading incident response and improving system reliability. 
  • Excellent communication skills and the ability to collaborate across engineering, security, infrastructure, and product teams. 

Preferred Qualifications 

  • Experience designing hybrid and multi-cloud architectures, including secure connectivity (Direct Connect, ExpressRoute, MPLS, VPN, SD-WAN) between public and private environments. 
  • Familiarity with service meshes (Istio, Linkerd), API gateways, and GitOps workflows (ArgoCD, Flux). 
  • Background in security-focused or regulated environments and exposure to compliance frameworks. 
  • Experience with database administration (PostgreSQL, MySQL, Redis, MongoDB) in cloud-managed and self-hosted setups. 
  • Contributions to open-source DevOps or cloud infrastructure tooling. 
  • Relevant certifications (AWS Solutions Architect / DevOps Engineer, Azure Administrator, CKA, CKAD, RHCE, VMware VCP, OpenStack Certified Administrator, HashiCorp Terraform Associate). 


Bengaluru (Bangalore)
4 - 8 yrs
₹12L - ₹17L / yr
Azure
grafana
Scripting
prometheus
CI/CD
+4 more

Location: Bangalore 

Experience: 4-8 years

Interview Process: Two rounds. The first round is virtual; the second round is face to face in Bangalore.


Key Skills Required

☁️ Cloud & Infrastructure

  • Strong hands-on experience with AWS Cloud Services
  • Proficiency in Terraform for Infrastructure as Code (IaC)
  • Experience in managing scalable cloud environments

⚙️ Containerization & Orchestration

  • Solid experience in Kubernetes (K8s) for container orchestration
  • Understanding of microservices architecture

🔄 CI/CD & DevOps

  • Hands-on experience with Azure DevOps (CI/CD pipelines)
  • Experience in build, release, and deployment automation

📊 Observability & Monitoring

  • Strong experience with Prometheus & Grafana
  • Expertise in setting up alerts, dashboards, and monitoring system health

🔐 API Gateway & Security

  • Experience with Kong or equivalent API Gateway
  • Understanding of API security controls (authentication, rate limiting, policies)

🧠 Core Technical Competencies

  • Strong Linux troubleshooting and system debugging skills
  • Proficiency in scripting (Bash / Python / Shell)
  • Understanding of networking concepts: TCP/IP, HTTP, DNS, Load Balancing
  • Experience with system architecture and distributed systems

🚨 SRE Responsibilities

  • Monitor system performance, reliability, and availability
  • Handle incidents, perform troubleshooting, and conduct RCA
  • Automate operational tasks to improve efficiency
  • Build and maintain scalable, resilient infrastructure
  • Collaborate with development and DevOps teams for system improvements

🧪 Good to Have

  • Experience with Finacle operations
  • Exposure to API/load testing tools like JMeter or Gatling
  • Familiarity with logging tools like Loki

🤝 Soft Skills

  • Strong communication and collaboration skills
  • Ability to document processes and technical workflows clearly

🎯 Ideal Candidate

A hands-on SRE/DevOps Engineer with strong exposure to:

  • AWS + Terraform + Kubernetes
  • CI/CD (Azure DevOps)
  • Monitoring + API Gateway security 


Wissen Technology

4 recruiters
Posted by Shakthi M
Bengaluru (Bangalore)
3 - 14 yrs
Best in industry
Python
Azure
DevOps
Terraform
Windows Azure
+1 more
  • Strong hands-on experience in Microsoft Azure Cloud.
  • Good understanding of Azure services such as Compute, Storage, Event Hub, Event Subscription, Storage Queue, and PaaS services.
  • Basic understanding of Azure AI Foundry and AI-related Azure service setup.
  • Good Azure networking basics: VNet, subnet, routing, and basic troubleshooting.
  • Strong knowledge of Terraform, especially:
      • Terraform state
      • plan / apply
      • troubleshooting failures
      • migration risks
      • Terraform Enterprise concepts
  • Strong Python coding capability, not just basic scripting.
  • Experience using Python for API integration, automation, JSON/YAML handling, and internal tooling.
  • Good understanding of CI/CD pipelines.
  • Ability to troubleshoot pipeline failures.
  • Comfortable with YAML and JSON.
  • Ability to troubleshoot Azure infrastructure/platform issues.
  • Ability to collect logs/evidence and coordinate with network/app/Microsoft support teams.
  • Basic awareness of agentic AI / LLM concepts.
  • Awareness of security and cost best practices.

Good to Have Skills

  • Hands-on experience with Harness.
  • Hands-on experience with Terraform Enterprise.
  • Exposure to LangGraph / LangChain.
  • Exposure to agentic AI workflows or skill creation.
  • Exposure to Claude or enterprise LLM integrations.
  • Knowledge of Azure ML Workspace, model registry, and managed endpoints.
  • MLOps / LLMOps knowledge.
  • FinOps / Azure cost optimization experience.
  • Azure certifications: AZ-104, AZ-305, AZ-400, AZ-500.

 

Screening Priority:

Azure Cloud + Terraform + Python Coding + CI/CD Troubleshooting + YAML/JSON + Basic Agentic AI Awareness

 

OpsTree Global
Posted by Pragati Srivastava
Bengaluru (Bangalore), Mumbai
4 - 9 yrs
₹20L - ₹25L / yr
Google Cloud Platform (GCP)
Kubernetes
Terraform
Monitoring
Prometheus
+3 more

Immediate Hiring: GCP DevOps Engineer | Mumbai & Bengaluru (On-site)


OpsTree Global is urgently hiring a GCP DevOps Engineer with 4–9 years of experience for immediate requirements in Mumbai and Bengaluru.


Key Skills

  • Google Cloud Platform (GCP)
  • Terraform / Infrastructure as Code (IaC)
  • Kubernetes & Helm Charts
  • CI/CD – Jenkins, GitLab CI, GitHub Actions
  • Linux Administration
  • Scripting – Python / Go / Java

Role Responsibilities

  • Build and manage scalable cloud infrastructure on GCP
  • Automate deployments and infrastructure provisioning
  • Ensure system reliability, monitoring, and performance optimization
  • Collaborate with development and operations teams for seamless delivery


📍 Locations: Mumbai & Bengaluru (On-site)

⚡ Immediate Joiners Preferred

💼 Experience: 4–9 Years

Searce Inc

3 recruiters
Posted by Mohammed Rabidheen
Coimbatore
5 - 10 yrs
Best in industry
Microsoft Windows Azure
Kubernetes
Terraform
Observability
Reliability engineering

About Searce

Searce (pronounced 'search') is a global, AI-native, and engineering-led modern technology consultancy. Founded in 2004 with a vision to "solve for better," we partner with organizations to "futurify" their businesses by leveraging the full power of Cloud, AI, and Data Engineering.

With a presence across 10+ countries—including the US, India, Singapore, and Australia—Searce has evolved over two decades into a trusted technology partner for over 3,000 clients. We are not just a service provider; we are a group of "solvers-at-heart" who thrive on complex technical challenges.

Why Join the "Solvers" Brigade?

  • Award-Winning Excellence: In 2026, Searce was recognized as the Google Cloud Workplace AI Transformation Partner of the Year (APAC). We are a Premier Google Cloud Partner and a top-tier Managed Services Provider (MSP).
  • AI-First Mindset: We specialize in Applied AI (Generative & Conventional), Cloud Modernization, and Location Intelligence, helping industries from FinServ and Healthcare to Retail and Manufacturing reinvent themselves.
  • The "Futurify" DNA: We don't just maintain; we improve. We use our proprietary EVLOS business innovation framework to ensure our clients aren't just moving to the cloud, but are staying ahead of the curve.

Our Culture: The HAPPIER Values

We look for individuals who live and breathe our HAPPIER values:

  • Humble: We learn from everyone.
  • Adaptable: We embrace change as the only constant.
  • Positive: We focus on solutions, not just problems.
  • Passionate: We are obsessed with engineering excellence.
  • Innovative: We challenge the status quo.
  • Excellence: We deliver impactful, futuristic outcomes.
  • Responsible: We take ownership of our work and its impact.


Your Mission: The Role

solving for better.

You are a reliability-owning, hands-on solver. Not just a "break-fix engineer."

As a DRI (directly responsible individual) for our clients' most critical systems, you’ll be the go-to expert within the squad that ensures their environments are secure, reliable, and optimized 24/7. You will deliver measurable impact – improved uptime, faster response times, and real cost savings. Not just closed tickets. Not just alerts. Real outcomes you engineer yourself.

You will lead the charge on technical execution, from complex troubleshooting and root cause analysis to engineering proactive, automated solutions. This role is about building the future of reliable cloud operations and shipping it into today's production environments.


Your Responsibilities

what you will wake up to solve.

This isn’t a “manage tickets” role. You are the architect, the executor and the DRI for our Cloud Managed Services GTM, deploying solutions that turn operational noise into hardened outcomes. Here’s how you’ll make your mark:

  • Own Service Reliability: You will be the go-to technical expert for 24/7 cloud operations and incident management. You'll ensure strict adherence to SLOs by getting your hands dirty, leading high-stakes troubleshooting to deliver a superior client experience.
  • Engineer the Blueprint: You'll translate client needs into scalable, automated, and secure cloud architectures. You will write and maintain the operational playbooks and Infrastructure as Code (IaC) that your squad uses every day.
  • Automate with Intelligence: You'll lead the charge from the keyboard to futurify our operations. You'll embed AI-driven automation, predictive monitoring, and AIOps into core processes to eliminate toil and preempt incidents.
  • Drive FinOps & Impact: You'll own the technical execution of the FinOps framework. You will continuously analyze, configure, and optimize cloud spend for clients through hands-on engineering.
  • Be the Expert in the Room: You'll share your knowledge through internal demos, documentation, and technical deep dives, representing the deep expertise that turns operational complexity into business resilience.
  • Mentor & Elevate: You will be a technical mentor for your peers. Through code reviews and collaborative problem-solving, you'll help build a high-performing squad that lives the “Always Hardened” mindset.


Experience & Relevance

We are looking for future technology leaders, not just coders. We value raw intelligence, analytical rigor, and an obsessive passion for technology over any prior experience.

  • Cloud Operations Pedigree: 5+ years of experience in Azure cloud infrastructure, with a significant portion in cloud managed services. Hands-on experience in Kubernetes is mandatory.
  • Commercial Acumen: Proven track record of building and scaling a net-new managed services business.
  • Client-Facing Tech Acumen: 2+ years of experience in a client-facing technical role, acting as the trusted advisor for cloud operations, security, and reliability.


Functional Skills:

  • Service Delivery Mindset: A deep understanding of MSP business models, SLAs, and the importance of client satisfaction in an operational context.
  • Client Engagement: Ability to ask appropriate questions to get to the heart of an operational issue and win trust with stakeholders.
  • Cross-Functional Catalyst: Thrive in multi-disciplinary teams, bringing together operations, security, and development teams.
  • Repository Builder: Creates reusable frameworks, IaC modules, and operational playbooks for scale.

Virtana

3 candid answers
2 recruiters
Posted by Krutika Devadiga
Pune
5 - 9 yrs
Best in industry
Google Cloud Platform (GCP)
DevOps
Shell Scripting
Python
Kubernetes
+11 more

Role Overview:

Virtana is looking for a Senior DevOps Engineer to join our R&D Infrastructure team. In this role, you won't just follow conventions — you'll help redefine them. You will own the architecture, build, and day-to-day operations of the GCP-based cloud platform that powers Virtana's SaaS products and the AI-driven observability experience our Global 2000 customers depend on. This is a hands-on senior individual contributor role with meaningful technical leadership scope, working alongside engineers and architects on a unified observability platform.


Work Location: Pune


Job Type: Hybrid


Role Responsibilities:

  • GCP Cloud Operations: Develop, deploy, operate, and support production cloud infrastructure primarily on GCP — leveraging GKE, BigTable, BigQuery, Dataflow, Cloud Storage, IAM, and core networking services.
  • Reliability & SLAs: Ensure production systems are running at all times with multiple levels of redundancy to meet committed SLAs; lead incident response, root cause analysis, and post-incident reviews.
  • Build & Release Automation: Design, implement, and continuously improve scalable CI/CD pipelines and test frameworks leveraged by QA and development teams across the company.
  • Infrastructure as Code: Manage large-scale, repeatable deployments using Terraform, Ansible, Puppet, or SaltStack; champion Git-based workflows and version control standards for distributed engineering teams.
  • Security & Availability: Own the ongoing maintenance, security, patching, and availability of services in line with tight operations, security, and procedural models.
  • Monitoring & Alerting: Plan and deliver high-value monitoring and alerting features to support operations, support, and customer-facing reliability — eating our own dog food with the Virtana Platform wherever possible.
  • Capacity & Cost: Forecast capacity, plan upgrades, patches, and migrations, and drive cloud cost efficiency across hybrid and multi-cloud environments.
  • Cross-Functional Partnership: Work with development, operations, and support personnel to identify, isolate, and diagnose issues; handle support escalations and drive permanent fixes.


Required Qualifications:

  • Bachelor's degree in Computer Science / Engineering or equivalent relevant experience.
  • 5–7 years of professional hands-on DevOps / SRE experience supporting production cloud environments.
  • Strong, demonstrable production experience on GCP — including GKE, BigTable, BigQuery, Dataflow, IAM, and core GCP networking services.
  • Deep, hands-on expertise with container orchestration (Kubernetes) and Docker in production.
  • Advanced proficiency with at least one infrastructure-as-code / configuration management tool: Terraform, Ansible, Puppet, or SaltStack.
  • Solid understanding of networking, firewalls, load balancers, DNS, and database operations.
  • Strong working knowledge of Git-based workflows and version control standards for distributed engineering teams.
  • Comfort operating hybrid environments that include both Linux and Windows ecosystems.
  • Excellent verbal and written communication skills, with the ability to explain highly technical topics to both technical and non-technical audiences.
  • Self-motivated, detail-oriented, and able to work both independently and within a globally distributed team.


Good to Have:

  • Strong scripting skills and a demonstrated ability to automate operational toil — Python preferred; Bash, Go, or Groovy a plus.
  • Hands-on experience designing and operating CI/CD pipelines with Jenkins (Spinnaker, GitHub Actions, or GitLab CI also welcome).
  • Exposure to AWS or other public clouds in addition to GCP.
  • Experience operating SaaS platforms built on microservices architectures.

Blitzy

2 candid answers
1 product
Posted by Bisman Gill
Pune
5yrs+
₹65L - ₹90L / yr
Amazon Web Services (AWS)
Kubernetes
Terraform
Python
Distributed Systems

The Role

As a Senior Site Reliability Engineer at Blitzy's Pune headquarters, you will be the backbone of our platform's reliability, scalability, and operational excellence. You'll work at the intersection of software engineering and infrastructure, ensuring our AI-powered development platform remains highly available and performant as we scale rapidly. This is a high-impact, hands-on role for an engineer who thrives in a fast-moving environment and takes deep ownership of the systems they build.


What Success Looks Like

  • In 30 days: You have a deep understanding of Blitzy's infrastructure architecture, have identified key reliability risks, and are actively contributing to on-call rotations.
  • In 90 days: You have shipped meaningful improvements to observability, incident response workflows, and deployment pipelines that measurably reduce MTTR and increase system uptime.
  • In 6 months: You have driven at least one major reliability initiative from inception to production, established SLO/SLA frameworks for critical services, and are a trusted technical voice shaping our infrastructure roadmap.


Areas of Ownership

  • Design, build, and operate scalable, fault-tolerant infrastructure across cloud environments (AWS, GCP, or Azure).
  • Define and enforce SLOs, SLAs, and error budgets; lead blameless postmortems and drive systemic improvements.
  • Build and maintain robust CI/CD pipelines, release automation, and deployment infrastructure.
  • Own observability: design and maintain logging, metrics, tracing, and alerting stacks (e.g., Prometheus, Grafana, Datadog, OpenTelemetry).
  • Partner closely with software engineering teams to embed reliability practices into the development lifecycle.
  • Drive capacity planning, performance benchmarking, and cost optimization across our infrastructure.
  • Champion security best practices within the infrastructure and deployment layers.


Required Experience

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles.
  • Strong proficiency in at least one major cloud platform (AWS preferred); experience with Kubernetes and container orchestration at scale.
  • Hands-on experience with infrastructure-as-code tools (Terraform, Pulumi, or equivalent).
  • Proven track record designing and maintaining high-availability, distributed systems.
  • Deep expertise in observability tooling, incident management, and on-call practices.
  • Strong scripting and automation skills (Python, Go, Bash, or similar).
  • Excellent communication skills with the ability to collaborate across engineering teams and present technical findings to leadership.


What Makes You Stand Out

  • Experience supporting AI/ML workloads or GPU-accelerated infrastructure.
  • Prior experience in a high-growth startup environment where you wore multiple hats.
  • Familiarity with eBPF, service mesh technologies (Istio, Linkerd), or advanced networking.
  • Contributions to open-source SRE/DevOps tooling or communities.
  • Experience building global, multi-region infrastructure with strict latency and availability requirements.


What Makes This Role Different

You won't be maintaining legacy systems or fighting fires in a sprawling monolith. At Blitzy, you're building reliability into a greenfield AI platform that is redefining how the world creates software. You'll have direct influence over architectural decisions, work side-by-side with world-class engineers, and see the tangible impact of your work as we scale to serve Fortune 500 customers. As a founding member of the Pune SRE team, you'll help shape the culture and technical standards of a team that will grow with the company.

Blitzy

2 candid answers
1 product
Posted by Bisman Gill
Pune
5yrs+
Up to ₹85L / yr (Varies)
Kubernetes
Google Cloud Platform (GCP)
Linux/Unix
Docker
Terraform
+1 more

The Role

As a DevOps Engineer at Blitzy's Pune headquarters, you'll build and operate the infrastructure that powers our AI agents and the applications they produce. You'll work at the intersection of cloud infrastructure, developer tooling, and AI-native systems — designing the pipelines, clusters, and automation that allow Blitzy to ship production-ready software at machine speed. This is a hands-on, high-ownership role for an engineer who moves fast, automates everything, and cares deeply about developer experience and system reliability.


What Success Looks Like

  • Kubernetes clusters are running reliably at scale, with clear deployment standards, Helm-managed releases, and minimal manual intervention required from engineering teams.
  • CI/CD pipelines are fast, consistent, and trusted — developers ship confidently knowing the automation handles the rest.
  • Observability is comprehensive: alerts are actionable, dashboards are meaningful, and incidents are resolved faster because the right data is always available.
  • Infrastructure provisioning is fully automated — no snowflake environments, no manual setup, everything reproducible through code.
  • AI agent orchestration infrastructure is stable and scalable, directly enabling Blitzy's core product to deliver for enterprise customers.
  • Engineering teams notice the difference — developer productivity is measurably higher and infrastructure is no longer a bottleneck to shipping.


Areas of Ownership

  • Build and manage Kubernetes clusters supporting AI agent workloads and application deployment at scale.
  • Design, implement, and maintain CI/CD pipelines for application and AI service delivery — ensuring speed, reliability, and repeatability.
  • Automate infrastructure provisioning and dynamic scaling using Python scripts and Terraform IaC.
  • Deploy and manage applications using Helm charts; own packaging standards and release automation.
  • Build and maintain comprehensive observability stacks — alerting, distributed tracing, metrics, and logging (e.g., Prometheus, Grafana, Datadog, OpenTelemetry).
  • Monitor and maintain production services and APIs; own incident response and drive blameless postmortems.
  • Build dedicated infrastructure for AI agent orchestration and management, enabling Blitzy's core autonomous development capabilities.
  • Collaborate with engineering teams on deployment strategies and continuously improve developer experience through tooling and automation.


Required Experience

  • 5–8 years of DevOps, infrastructure, or platform engineering experience.
  • Python proficiency for scripting, automation, and infrastructure tooling.
  • Deep Kubernetes expertise — cluster management, workload deployment, scaling, and troubleshooting.
  • Hands-on Helm experience for application packaging and release management.
  • Proven ability to design and implement CI/CD pipelines across complex, multi-service environments.
  • Practical experience with at least one major cloud platform (AWS, GCP, or Azure).
  • Terraform proficiency for infrastructure-as-code provisioning and state management.
  • Strong Linux administration and containerization fundamentals (Docker, OCI).


What Makes You Stand Out

  • CKA (Certified Kubernetes Administrator) certification.
  • Familiarity with MLOps tooling such as MLflow, Kubeflow, or similar platforms for AI/ML workload management.
  • Experience with microservices architecture and distributed systems design.
  • Knowledge of API gateways and service mesh technologies (Istio, Linkerd, or equivalent).
  • Prior experience in a high-growth AI or software startup where you moved fast and owned broadly.
  • Track record of meaningfully improving developer productivity through platform and tooling investments.


What Makes This Role Different

Most DevOps roles have you maintaining existing systems. At Blitzy, you're building the infrastructure layer for a platform that autonomously writes enterprise software — a genuinely new category of product. You'll work on AI agent orchestration, Kubernetes at scale, and developer tooling that is directly responsible for how fast Blitzy delivers value to Fortune 500 customers. As an early member of the Pune engineering team, you'll have outsized influence over our infrastructure culture and technical direction. High performers are eligible for company equity — giving you real ownership in what you build.

appscrip

2 recruiters
Posted by Nilam Surati
Surat
0.6 - 1.5 yrs
₹3L - ₹5L / yr
DevOps
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Windows Azure
Docker
+2 more

We are currently hiring for a Junior DevOps Developer.

Please see the job description for the post below.

 

Job Description: Junior DevOps Developer (0.6 – 1.5 Years Experience)

Job Title: Junior DevOps Developer

Experience: 6 months to 1.5 years

Employment Type: Full-time

About the Role:

We are looking for a motivated Junior DevOps Developer to support our development and operations teams. You will assist in managing cloud infrastructure, improving deployment processes, and maintaining system reliability.

Key Responsibilities:

  • Assist in managing and maintaining cloud infrastructure (AWS/GCP/Azure)
  • Support CI/CD pipeline setup and maintenance
  • Help automate deployment processes and routine tasks
  • Monitor system performance and troubleshoot issues
  • Assist in containerization using Docker and Kubernetes
  • Perform root cause analysis for production issues
  • Collaborate with developers to improve system performance and scalability
  • Maintain documentation for infrastructure and processes
  • Work with additional cloud platforms and infrastructure, including Hetzner

Required Skills:

  • Basic understanding of DevOps concepts and workflows
  • Knowledge of cloud platforms like AWS, GCP, or Azure
  • Familiarity with Docker and Kubernetes
  • Basic understanding of Infrastructure as Code tools (Terraform is a plus)
  • Knowledge of Git and version control systems
  • Basic scripting knowledge (Bash/Python preferred)

Good to Have:

  • Exposure to CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD)
  • Understanding of monitoring tools (Grafana, Prometheus) 


You can contact me on this WhatsApp number: Nine three one six one two zero one three two

Pune, Bengaluru (Bangalore)
8 - 12 yrs
₹19L - ₹22L / yr
Solution Architect
AI/ML tools
OpenStack
HPC
AWS
+2 more

Role & Responsibilities

We are seeking a seasoned Solution Architect to design and lead AI infrastructure and private cloud initiatives. This role focuses on building scalable, high-performance environments to support AI/ML workloads, data platforms, and enterprise applications. The ideal candidate will have deep expertise in private cloud architectures, GPU-based computing, and modern data center technologies, along with the ability to align infrastructure strategy with business and AI innovation goals.

Key Responsibilities

  • Architect and design AI-ready infrastructure platforms, including GPU clusters, high-performance computing (HPC), and storage systems
  • Define and implement private cloud solutions using technologies such as OpenStack and VMware
  • Design scalable environments for AI/ML workloads, including training and inference pipelines
  • Collaborate with data scientists, platform engineers, and infrastructure teams to translate AI requirements into infrastructure solutions
  • Drive infrastructure modernization initiatives, including containerization and orchestration using Kubernetes
  • Ensure high availability, performance, scalability, and security of AI platforms
  • Design storage solutions optimized for AI workloads (e.g., distributed file systems, object storage)
  • Implement networking architectures for high-throughput, low-latency data transfer
  • Define automation strategies using Infrastructure as Code (IaC) and configuration management tools
  • Establish governance, standards, and best practices for AI infrastructure and private cloud environments
  • Evaluate emerging technologies and recommend solutions aligned with enterprise strategy
  • Provide technical leadership and guidance across architecture, design, and implementation phases

Ideal Candidate

  • Strong Solution Architect – AI Infrastructure & Private Cloud profiles
  • Mandatory (Experience 1) – Must have 8+ years of experience in IT infrastructure, cloud, or data center architecture roles
  • Mandatory (Experience 2) – Must have strong expertise in private cloud and virtualization (OpenStack, VMware vSphere) along with solid knowledge of Linux, networking, and storage architectures.
  • Mandatory (Experience 3) – Must have hands-on experience designing AI/ML infrastructure, including GPU-based systems (e.g., NVIDIA platforms), HPC, and AI-optimized storage
  • Mandatory (Experience 4) – Must have strong experience with containerization and orchestration (Docker, Kubernetes) and IaC/automation tools (Terraform, Ansible)
  • Mandatory (Experience 5) – Must have experience designing scalable AI/ML environments for training/inference pipelines, with high-throughput, low-latency networking and distributed storage
  • Mandatory (Experience 6) – Must have familiarity with hybrid cloud integration (AWS, Azure, or GCP) and proven ability to lead architecture design with strong stakeholder management.
  • Mandatory (Skill) – Must have familiarity with hybrid cloud integration involving AWS, Azure, or GCP.
  • Preferred (Skill 1) – Certifications in cloud (AWS/Azure/GCP), Kubernetes, or VMware/OpenStack, along with experience in MLOps platforms and AI lifecycle management
  • Preferred (Skill 2) – Knowledge of high-performance networking (InfiniBand, RDMA) and exposure to data lake architectures and big data platforms
  • Preferred (Skill 3) – Experience in large-scale enterprise or hyperscale environments.


FrontM Limited
Posted by Pradeep Chandkiran
Bengaluru (Bangalore)
3 - 5 yrs
₹8L - ₹14L / yr
Kubernetes
Terraform
Amazon Web Services (AWS)

Location: Bangalore preferred / Hybrid as applicable

Experience: 3+ years

Education: B.E/B.Tech in Computer Science, Engineering or a related technical discipline

Salary: Above market standards, flexible for the right candidate

Career growth: Long-term opportunity with potential to lead DevOps architecture and cloud platform operations


About FrontM

FrontM builds software platforms for frontline workforces operating in remote and low-connectivity environments, with a strong focus on the maritime industry. The platform supports communication, collaboration, healthcare, learning, welfare and operational workflows across mobile, web, kiosk and connected device environments.

The platform runs across cloud infrastructure, constrained networks and specialised customer environments, requiring reliable DevOps practices, strong observability, secure architecture and careful operational discipline.


Role Summary

As a Senior DevOps Engineer, you will take ownership of FrontM’s AWS cloud infrastructure, CI/CD pipelines, platform reliability and technical operations. You will work closely with the VP of Delivery, CTO and CEO to maintain secure, scalable and high-availability infrastructure for FrontM’s production systems.

This role requires strong hands-on DevOps experience, broad AWS knowledge, Kubernetes experience and the ability to troubleshoot complex networking and production issues across multi-domain SaaS environments.


Key Responsibilities

Cloud Infrastructure & DevOps Architecture (≈45%)

· Own, maintain and improve AWS cloud infrastructure for FrontM platforms

· Create and maintain Terraform scripts for infrastructure deployment and management

· Manage Kubernetes workloads deployed within AWS EKS

· Support multi-zone AWS infrastructure design for availability, resilience and scale

· Maintain AWS services including Route 53, EC2, API Gateway, VPC, VPN, AWS Cognito, ElastiCache, DynamoDB and Lambda

· Contribute to DevOps architecture planning in line with FrontM’s platform roadmap

CI/CD, Operations & Platform Reliability (≈35%)

· Build, maintain and improve CI/CD pipelines for backend and platform services

· Oversee technical operations with hands-on administration, monitoring and release support

· Ensure continuous server uptime, stability, performance and maintainability

· Debug, respond to and restore system outages in production and staging environments

· Improve observability across infrastructure and applications, including migration from Elastic stack to logz.io

· Support backend stability, scale and performance across Node.js, Java and related services

Security, Networking & Production Support (≈20%)

· Maintain AWS security configurations, access controls and monitoring practices

· Support complex networking requirements across multi-domain SaaS implementations

· Troubleshoot network, infrastructure and access issues with internal teams and customer-side users

· Work with backend teams to support API integrations and infrastructure abstractions for complex requirements

· Document operational procedures, incident findings and technical support steps clearly


Required Technical Skills

Cloud Infrastructure & AWS

· Strong hands-on experience with AWS infrastructure and cloud operations

· Experience with Route 53, EC2, API Gateway, VPC, VPN, AWS Cognito, ElastiCache, DynamoDB and Lambda

· Experience with AWS security setup, monitoring and multi-zone infrastructure

· Ability to manage infrastructure using Terraform

Kubernetes, CI/CD & Observability

· Strong experience with Kubernetes, preferably AWS EKS

· Extensive CI/CD and DevOps experience

· Experience with infrastructure observability and application monitoring tools

· Ability to diagnose production bottlenecks, server failures and performance issues

Backend, Networking & SaaS Operations

· Experience supporting Node.js, Java and backend system procedures for stability and scale

· Good understanding of APIs, integrations and backend service dependencies

· Experience with complex networking and multi-domain SaaS implementations

· Ability to troubleshoot technical issues with non-technical end users

Nice to Have

· Experience with MongoDB clusters in MongoDB Atlas

Personal Attributes

· Strong ownership mindset for uptime, reliability and production stability

· Practical problem-solving approach with the ability to act quickly during incidents

· Clear written and spoken communication in English

· Ability to work independently and coordinate with senior management when required

· Comfortable working in fast-moving engineering teams

· Attention to detail in security, monitoring, documentation and operational processes


Why join FrontM?

Long-Term Career Growth

Opportunity to work on cloud infrastructure used by global maritime and remote workforce customers, with scope to grow into DevOps architecture and platform leadership roles.

Engineering Challenges That Matter

Work on infrastructure that supports applications used in remote, low-bandwidth and operationally demanding environments.

Broad Technical Ownership

Take responsibility across cloud infrastructure, Kubernetes, CI/CD, observability, networking, security and production reliability.


Apply now

Join a team focused on building reliable software infrastructure for real-world use cases and contribute to systems used across the global maritime workforce.

Rootflo
Remote only
1 - 2 yrs
₹4L - ₹5L / yr
Amazon Web Services (AWS)
Kubernetes
Microsoft Azure
CI/CD
Terraform

Company Description

Rootflo is an innovative AI-driven company transforming finance back offices through advanced revenue intelligence and workflow automation solutions. Specializing in empowering financial institutions, Rootflo partners with leading banks, lending platforms, and insurance companies to revolutionize their operations. Leveraging the power of artificial intelligence, Rootflo provides efficient and intelligent systems that drive business outcomes. The company is dedicated to delivering impactful advancements for modern financial processes.


About the Role

We are looking for a motivated and technically sound Junior Cloud / Infrastructure Engineer with approximately 1 year of hands-on experience in designing, deploying, and managing cloud infrastructure on AWS, GCP, or Azure using Kubernetes. The candidate will work closely with senior engineers to support and maintain scalable, secure, and highly available cloud environments while gaining exposure across the full cloud stack.

Key Responsibilities

Cloud Infrastructure

  • Assist in provisioning and managing cloud resources (VMs, storage, networking, databases) on AWS / GCP / Azure, including Kubernetes environments
  • Support infrastructure-as-code (IaC) implementation using Terraform
  • Monitor cloud resource usage, costs, and performance; raise alerts for anomalies
  • Set up log monitoring and other tracking
  • Implement cost optimizations and efficient resource utilisation

Networking & Security

  • Configure VPCs, subnets, security groups, IAM roles, and access policies
  • Assist in implementing firewalls, SSL/TLS certificates, and VPN connectivity
  • Support compliance and security best practices (CIS benchmarks, least privilege)

CI/CD & DevOps Support

  • Work with CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, or similar)
  • Assist in containerization and orchestration using Docker and Kubernetes (EKS / GKE / AKS)
  • Support deployment automation and version control workflows

Monitoring & Incident Response

  • Use monitoring tools such as CloudWatch, Stackdriver, Azure Monitor, Grafana, or Datadog
  • Respond to infrastructure alerts and assist in root cause analysis (RCA)
  • Participate in on-call rotations and incident triage as required

Collaboration

  • Work closely with development, QA, and product teams to support release cycles
  • Maintain documentation for infrastructure, runbooks, and architecture diagrams
  • Participate in code reviews for infrastructure scripts and configurations


Required Skills & Qualifications

Cloud Platforms (Any One Primary)

  • AWS: EC2, S3, RDS, Lambda, VPC, IAM, ECS/EKS, CloudWatch, Route 53
  • GCP: Compute Engine, GCS, Cloud SQL, GKE, Cloud Functions, VPC, IAM, Stackdriver
  • Azure: VMs, Azure Storage, Azure SQL, AKS, Azure Functions, VNet, AAD, Azure Monitor

Infrastructure & Automation

  • Hands-on with Terraform and/or CloudFormation / ARM Templates / GCP Deployment Manager
  • Proficiency in Linux/Unix system administration
  • Scripting skills in Bash, Python, or PowerShell

Containers & DevOps

  • Working knowledge of Docker (build, push, run, Compose)
  • Basic understanding of Kubernetes concepts (pods, deployments, services, ingress)
  • Familiarity with at least one CI/CD tool

Networking Fundamentals

  • Understanding of DNS, HTTP/HTTPS, TCP/IP, load balancers, CDNs
  • Experience with cloud-native networking (VPCs, peering, NAT gateways)

Soft Skills

  • Strong problem-solving ability and attention to detail
  • Good written and verbal communication
  • Eagerness to learn and adapt in a fast-paced environment

Good to Have

  • Cloud certification: AWS Solutions Architect Associate / GCP ACE / Azure AZ-900 or AZ-104
  • Experience with GitOps tools like ArgoCD or Flux
  • Exposure to service mesh (Istio, Linkerd)
  • Knowledge of Ansible or Chef/Puppet for configuration management
  • Familiarity with observability tools: Prometheus, Grafana, ELK Stack
  • Experience with cost optimization and FinOps practices


Remote, Noida, Gurugram, Pune, Nagpur, Jaipur, Gandhinagar
8 - 14 yrs
₹12L - ₹18L / yr
Python
SQL
PySpark
Databricks
Snowflake schema
+6 more

Senior Data Engineer (Databricks, BigQuery, Snowflake)

Experience: 8+ Years in Data Engineering

Location: Remote | Onsite (Noida, Gurgaon, Pune, Nagpur, Jaipur, Gandhinagar)

Budget: Open / Competitive


Job Summary:

We are seeking a highly skilled Senior Data Engineer to design, build, and optimize scalable data solutions that support advanced analytics and machine learning initiatives. You will lead the development of reliable, high-performance data systems and collaborate closely with data scientists to enable data-driven decision-making.

In this role, we expect a forward-thinking professional who utilizes AI-augmented development tools (such as Cursor, Windsurf, or GitHub Copilot) to increase engineering velocity and maintain high code standards in a modern enterprise environment.


Key Responsibilities:

  • Scalable Pipelines: Design, develop, and optimize end-to-end data pipelines using SQL, Python, and PySpark.
  • ETL/ELT Workflows: Build and maintain workflows to transform raw data into structured, analytics-ready datasets.
  • ML Integration: Partner with data scientists to deploy and integrate machine learning models into production environments.
  • Cloud Infrastructure: Manage and scale data infrastructure within AWS and Azure ecosystems.
  • Data Warehousing: Utilize Databricks and Snowflake for big data processing and enterprise warehousing.
  • Automation & IaC: Implement workflow orchestration using Apache Airflow and manage infrastructure as code via Terraform.
  • Performance Tuning: Optimize data storage, retrieval, and system performance across data warehouse platforms.
  • Governance & Compliance: Ensure data quality and security using tools like Unity Catalog or Hive Metastore.
  • AI-Augmented Development: Integrate AI tools and LLM APIs into data pipelines and use AI IDEs to streamline debugging and documentation.


Technical Requirements:

  • Experience: 8+ years of core Data Engineering experience in large-scale enterprise or consulting environments.
  • Languages: Expert proficiency in SQL and Python for complex data processing.
  • Big Data: Hands-on experience with PySpark and large-scale distributed computing.
  • Architecture: Strong understanding of ETL frameworks, data pipeline architecture, and data warehousing best practices.
  • Cloud Platforms: Deep working knowledge of AWS and Azure.
  • Modern Tooling: Proven experience with Databricks, Snowflake, and Apache Airflow.
  • Infrastructure: Experience with Terraform or similar IaC tools for scalable deployments.
  • AI Competency: Proficiency in using AI IDEs (Cursor/Windsurf) and integrating AI/ML models into production data flows.


Preferred Qualifications:

  • Exposure to data governance and cataloging tools (e.g., Unity Catalog).
  • Knowledge of performance tuning for massive-scale big data systems.
  • Familiarity with real-time data processing frameworks.
  • Experience in digital transformation and sustainability-focused data projects.
Bootlabs Technologies Private Limited

Posted by PS Mohan
Mumbai
8 - 14 yrs
Best in industry
Azure
Landing Zone
Terraform
CI/CD
Azure DevOps

About BootLabs:

BootLabs is a boutique tech consulting and digital engineering company that partners with leading enterprises to build and support scalable, cloud-native, and data-driven platforms. We specialise in delivering high-impact solutions across cloud, AI/GenAI, platform engineering, and enterprise application support. Our teams work closely with global clients across highly regulated industries, including BFSI, Healthcare, E-commerce, and other domains, ensuring reliability, security, and operational excellence for mission-critical systems.

 

JD:

Extensive experience in deploying and supporting enterprise workloads across Azure IaaS, PaaS, and SaaS environments.

Experience managing enterprise Landing Zones & Azure Policies.

Exposure to Azure Data Lake and Azure Databricks (basic to intermediate).

Knowledge of Terraform and any DevOps CI/CD pipelines.

Hybrid connectivity experience (VPN Gateway / ExpressRoute).

Strong networking fundamentals (NSG, UDR, Peering).

OS-level troubleshooting (Windows & Linux).

Experience with enterprise security tools such as Prisma / Cortex or similar vulnerability & endpoint security tools

Experience in production support, governance, and security guardrails.

Experience handling production incidents, RCA preparation, and performance tuning. Change management & CAB coordination.

Hands-on experience with native Azure security services such as Azure Firewall and Azure Application Gateway with WAF.

Cost optimization, monitoring and alerting implementation.

NeoGenCode Technologies Pvt Ltd
Posted by Akshay Patil
Bengaluru (Bangalore)
3 - 8 yrs
₹15L - ₹18L / yr
DevOps
Google Cloud Platform (GCP)
Kubernetes
Helm
Terraform
+5 more

Job Title : DevOps Engineer

Experience : 3+ Years

Location : Indiranagar, Bengaluru (Work From Office – 5 Days)

Employment Type : Full-Time

Work Timings : 11:00 AM to 7:00 PM IST

Notice Period : Immediate Joiners Preferred


Role Overview :

We are seeking a skilled DevOps Engineer with 3+ years of experience in building and managing scalable cloud-native infrastructure.

The ideal candidate will have strong expertise in Kubernetes and Helm, along with hands-on experience in deploying and maintaining production-grade systems on cloud platforms.

This role offers an opportunity to work in a high-growth startup environment, contributing to both existing systems and new infrastructure development.


Key Responsibilities :

  • Design, deploy, and manage scalable infrastructure using Kubernetes.
  • Build and maintain CI/CD pipelines for efficient and automated deployments.
  • Manage and optimize cloud environments (preferably GCP).
  • Implement Infrastructure as Code using Helm/Terraform.
  • Monitor system performance and ensure high availability and reliability.
  • Handle bug fixes, system improvements, and performance optimization.
  • Collaborate with engineering teams to design scalable microservices architecture.
  • Implement logging, monitoring, and alerting solutions.
  • Ensure security best practices including IAM, secrets management, and network policies.


Mandatory Skills :

  • Strong hands-on experience with Kubernetes.
  • Expertise in Helm Charts.
  • Experience with Google Cloud Platform (GCP).
  • Hands-on experience with ArgoCD or similar CI/CD tools.
  • Knowledge of CI/CD tools like Jenkins, GitHub Actions, GitLab CI.
  • Experience in database hosting and scaling.


Nice to Have :

  • Exposure to other cloud platforms (AWS/Azure).
  • Experience with modern DevOps and automation tools.
  • Ability to quickly learn and adapt to new technologies.


Team & Work Scope :

  • No dedicated DevOps team currently – high ownership role.
  • Work on both existing systems (maintenance & improvements) and new system builds (greenfield projects).
  • Opportunity to shape DevOps practices and infrastructure from scratch.


Preferred Candidate Profile :

  • 3+ years of relevant DevOps experience.
  • Strong problem-solving and debugging skills.
  • Experience working in fast-paced startup environments.
  • Understanding of scalability, security, and performance optimization.
  • Good communication and collaboration skills.

Hiring Process :

  1. Profile Screening
  2. GT Assessment
  3. Technical Interview – Round 1
  4. Technical Interview – Round 2
  5. Final Round (if required, with the US team)
Bell Techlogix
Posted by Pemmraju VenkatVandita
Hyderabad
5 - 10 yrs
₹15L - ₹20L / yr
CI/CD
Terraform
MLOps
Machine Learning (ML)
PowerShell

The DevOps Engineer will play a critical role in operationalizing artificial intelligence across Bell Techlogix client environments. This role focuses on building and supporting cloud infrastructure, CI/CD pipelines, and automation frameworks that power AI and machine learning workloads. The ideal candidate has experience supporting AI platforms such as Azure AI, Azure Machine Learning, Azure OpenAI, and ServiceNow or conversational AI platforms, and understands the operational requirements of production AI systems, including reliability, scalability, and security. 

 

Key Responsibilities 

•Design, build, and operate cloud infrastructure and platform services that support AI and machine learning workloads in production, SLA-driven managed services environments 

•Implement CI/CD and MLOps pipelines to enable automated training, testing, deployment, and rollback of AI and ML models 

•Develop and maintain Infrastructure as Code to provision AI-ready environments consistently across dev/test/prod 

•Support AI platform operations including monitoring model health, pipeline execution, compute utilization, and data dependencies 

•Partner with Machine Learning Engineers and Data Engineers to standardize deployment patterns for AI services and LLM-based solutions 

•Enable secure and scalable AI integrations using APIs, messaging, and event-driven architectures 

•Implement observability solutions for AI platforms, including logging, metrics, alerting, and drift detection integrations 

•Troubleshoot AI platform incidents, perform root cause analysis, and implement remediation to improve reliability and automation coverage 

•Apply security best practices for AI environments including secrets management, identity and access controls, network isolation, and policy enforcement 

•Support AI-driven automation use cases across platforms such as Microsoft Copilot, ServiceNow, and conversational AI tools 

•Collaborate with service desk, security, and architecture teams to continuously improve AI service delivery and operational maturity 

 

Required Qualifications 

•Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience 

•5+ years of experience in DevOps, cloud engineering, or platform operations, with exposure to AI or data workloads 

•Hands-on experience with Microsoft Azure, including compute, networking, storage, and monitoring services 

•Experience building CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools 

•Working knowledge of Infrastructure as Code (Terraform and/or Bicep/ARM) 

•Scripting experience using PowerShell and/or Python 

•Experience supporting production platforms with incident management, change control, and root cause analysis 

•Understanding of cloud security fundamentals and enterprise governance requirements 

 

Preferred Qualifications 

•Experience with Azure Machine Learning, Azure AI Services, Azure OpenAI, or MLOps frameworks 

•Exposure to containerization and orchestration technologies (Docker, Kubernetes, AKS) 

•Experience supporting data pipelines or feature stores used by machine learning systems 

•Familiarity with ServiceNow, AI-driven ITSM workflows, or automation platforms 

•Experience with observability tools 

•Knowledge of Responsible AI, data governance, and compliance considerations for AI systems 

•Relevant certifications (Microsoft Azure Administrator, Azure DevOps Engineer, Azure AI Engineer)

Blitzy

Posted by Bisman Gill
Pune
5yrs+
Up to ₹90L / yr (Varies)
Python
Kubernetes
Terraform
Google Cloud Platform (GCP)
Amazon Web Services (AWS)
+2 more

About the role

We are looking for talented Senior Backend Engineers (5+ years of experience) to join our team and take ownership of different parts of our stack. You will be working alongside a team of Engineers locally and directly with the U.S. Engineering team on all aspects of product/application development. You will leverage your experiences and abilities to inform decisions across product development and technology. You will help us build the foundation of our 2nd Headquarters in Pune: its culture, its processes, and its practices. There are a ton of interesting problems to solve, so come hungry. If your colleagues describe you as curious, driven, kind, and creative you are a culture fit.

What Success Looks Like

  • You write, review and ship code in production. Your employer or client's success depends on the software you build
  • You use Generative AI tools on a daily basis to enhance the quality and efficacy of your software and non-software deliverables
  • You are a self-starter and enjoy working with minimal supervision
  • You evaluate and make technical architecture decisions with a long-term view, optimizing for speed, quality, and safety
  • You take pride in the product you create and the code that you write
  • Your team can rely on you to get them out of a sticky situation in production
  • You can work well on a team of sales executives, designers and engineers in an in-person environment
  • You are passionate about the enterprise software development lifecycle and feel strongly about improving it
  • You are a first principles engineer who exercises curiosity about the technologies you work with
  • You can learn quickly about technologies, software and code that you are not familiar with, often from rudimentary documentation
  • You take ownership of the code that you write, and you help the team operate with everything that you build, throughout its lifecycle
  • You communicate openly and solicit feedback on important decisions, keeping the team aligned on your rationale
  • You exercise an optimistic mindset and are willing to go the extra mile to make things work

Areas of Ownership

Our hiring process is designed for you to demonstrate a generalist set of capabilities, with a specialization in Backend Technologies.

Required Technical Experience (MUST HAVE):

  • Expertise in Python
  • Deep hands-on experience with Terraform
  • Proficiency in Kubernetes
  • Experience with cloud platforms (GCP strongly preferred, AWS/Azure acceptable)

Additional experience with some of the following:

  • Backend Frameworks and Technologies (Node.js, NuxtJS, Express.js)
  • Programming languages (JavaScript, TypeScript, Java, C++, Go)
  • APIs and RPCs (REST, gRPC or GraphQL)
  • Databases (SQL, NoSQL, Postgres, MongoDB, or Firebase)
  • CI/CD (Jenkins, CircleCI, GitLab or similar)
  • Source code versioning tools such as Git or Perforce
  • Microservices architecture

Ways to stand out

  • Familiarity with AI Platforms
  • Extensive experience with building enterprise-scale applications with >99% SLAs
  • Deep expertise across the full required stack: Python, Terraform, Kubernetes, and GCP

You'll Get...

  • Competitive Salary
  • Medical Insurance Benefits
  • Employer Provident Fund contributions with Gratuity after 5 years of service
  • Company-sponsored US onsite trips for high performers, based on business requirements
  • Potential international transfer support for top performers, based on business requirements
  • Technology equipment and/or allowance (hardware, software, training, etc.)
  • The opportunity to re-shape an entire industry
  • Beautiful office environment
  • Meal allowance and/or food provision on site

Culture

Who we are: Our Co-Founder and CTO is a serial Gen AI inventor who grew up in Pune, India, is a BITS Pilani graduate, and worked at NVIDIA's Pune office for 6 years. There, he was promoted 5 times in 6 years and was transferred to the NVIDIA headquarters in Santa Clara, California. After making significant contributions to NVIDIA, he went on to attend Harvard for his dual Master's in Engineering and MBA from HBS. Our other Co-Founder and CEO is a successful serial entrepreneur who has built multiple companies. As a team, we work very hard, have a curious mindset, and believe in a low-ego, high-output approach.

One2n

Posted by Abhishek Ghorpade
Pune
3 - 7 yrs
₹15L - ₹25L / yr
DevOps
Kubernetes
Docker
Grafana
Prometheus
+5 more

Virtual Hiring Drive: Site Reliability Engineer (SRE)


Date: 25th April 2026, Saturday (Single-Day Drive)

Mode: 100% Virtual - All interview rounds on the same day

Experience: 3 to 7 Years


Note: We are looking for quick joiners who can join us within 30 days.


About the Role

We are looking for a Site Reliability Engineer who understands the realities of running production systems at scale. If building reliable, scalable, and observable systems excites you, you'll enjoy working with us.

At One2N, we solve One-to-N problems where proof of concept is already built and the real challenge lies in scalability, maintainability, performance, and reliability.

You will work closely with startups and mid-sized clients, helping them architect production-grade infrastructure and observability systems.


Key Responsibilities

  • Design and build platform engineering solutions with a self-serve model
  • Architect and optimize observability systems (metrics, logs, traces)
  • Implement monitoring, logging, alerting & dashboards
  • Build and optimize CI/CD pipelines
  • Automate repetitive operational and infrastructure tasks (IaC-first approach)
  • Improve Developer Experience (DX)
  • Guide teams on SRE best practices & on-call processes
  • Participate in code reviews and mentor engineers
  • Contribute to cloud-native and platform engineering initiatives


Must-Have Skills

  • 3 - 7 years of experience in DevOps / SRE / Platform Engineering
  • Strong hands-on experience with Kubernetes on AWS
  • Expertise in observability tools like Datadog / Honeycomb / ELK / Grafana / Prometheus
  • Experience with Docker & Microservices architecture
  • Infrastructure as Code using Terraform / Pulumi
  • Strong Linux troubleshooting skills
  • Programming knowledge in Golang / Python / Java
  • Automation & scripting expertise
NeoGenCode Technologies Pvt Ltd
Posted by Akshay Patil
Bengaluru (Bangalore)
4 - 10 yrs
₹10L - ₹30L / yr
Python
SQL
Spark
Amazon Web Services (AWS)
Amazon S3
+13 more

Job Title : AWS Data Engineer

Experience : 4+ Years

Location : Bengaluru (HSR – Hybrid, 3 Days WFO)

Notice Period : Immediate Joiner


💡 Role Overview :

We are looking for a skilled AWS Data Engineer to design, build, and scale modern data platforms. The role involves working with AWS-native services, Python, Spark, and DBT to deliver secure, scalable, and high-performance data solutions in an Agile environment.


🔥 Mandatory Skills :

Python, SQL, Spark, AWS (S3, Glue, EMR, Redshift, Athena, Lambda), DBT, ETL/ELT pipeline development, Airflow/Step Functions, Data Lake (Parquet/ORC/Iceberg), Terraform & CI/CD, Data Governance & Security


🚀 Key Responsibilities :

  • Design, build, and optimize ETL/ELT pipelines using Python, DBT, and AWS services
  • Develop and manage scalable data lakes on S3 using formats like Parquet, ORC, and Iceberg
  • Build end-to-end data solutions using Glue, EMR, Lambda, Redshift, and Athena
  • Implement data governance, security, and metadata management using Glue Data Catalog, Lake Formation, IAM, and KMS
  • Orchestrate workflows using Airflow, Step Functions, or AWS-native tools
  • Ensure reliability and automation via CloudWatch, CloudTrail, CodePipeline, and Terraform
  • Collaborate with data analysts and data scientists to deliver actionable insights
  • Work in an Agile environment to deliver high-quality data solutions

✅ Mandatory Skills :

  • Strong Python (including AWS SDKs), SQL, Spark
  • Hands-on experience with AWS data stack (S3, Glue, EMR, Redshift, Athena, Lambda)
  • Experience with DBT and ETL/ELT pipeline development
  • Workflow orchestration using Airflow / Step Functions
  • Knowledge of data lake formats (Parquet, ORC, Iceberg)
  • Exposure to DevOps practices (Terraform, CI/CD)
  • Strong understanding of data governance and security best practices
  • 4–7 years of experience in Data Engineering (3+ years on AWS)

➕ Good to Have :

  • Understanding of Data Mesh architecture
  • Experience with platforms like Data.World
  • Exposure to Hadoop / HDFS ecosystems

🤝 What We’re Looking For :

  • Strong problem-solving and analytical skills
  • Ability to work in a collaborative, cross-functional environment
  • Good communication and stakeholder management skills
  • Self-driven and adaptable to fast-paced environments

📝 Interview Process :

  1. Online Assessment
  2. Technical Interview
  3. Fitment Round
  4. Client Round
Bengaluru (Bangalore)
8 - 15 yrs
₹25L - ₹40L / yr
Java
Spring Boot
NodeJS (Node.js)
Microservices
Architecture
+20 more

Job Title: Consultant – Enterprise Application Development

Location: Bengaluru (Hybrid / On-site)

Engagement: Full-Time

Experience: 10 – 15 years preferred


About Us: Introducing VTT, a comprehensive mobility service provider catering to diverse multinational sectors like IT/ITES, KPO/BPO, Financial, Pharma, and more across Indian cities. Our “Managed Mobility Program” includes Fleet Management, Technology, Resource Management, Car Rentals, Logistics, and Special Services (Ambulance and PWD vehicles). Trusted by Fortune companies such as Cisco, Morgan Stanley, Wells Fargo, Google, PWC, and others, we pride ourselves on leveraging expertise and cutting-edge technology for safe, efficient, and uninterrupted service delivery. With a commitment to excellence, we ensure best-in-class standards for all our clients. Trips to school are now timely, comfy and secure! Our well-maintained fleet is here to enrich your child’s commute, keeping students punctual and safe thanks to GPS tracking paired with well-trained drivers. Our routes are carefully planned, our drivers attentive, and everything hassle-free.


Role Overview

We are looking for a seasoned Consultant with comprehensive expertise in enterprise-level application development across backend, frontend, mobile, DevOps, and cloud. The role demands a strong architectural mindset combined with hands-on execution. The Consultant will also play a critical role in understanding the current system architecture end-to-end, driving technical improvements, building the tech team foundation, and establishing structured technical documentation.


Key Responsibilities

• Understand the complete architecture of the existing systems, including web, mobile, backend services, and cloud environment.

• Provide hands-on leadership across backend, frontend, mobile, DevOps, and cloud infrastructure.

• Architect and optimize enterprise-grade applications for scalability, security, performance, and reliability.

• Conduct technical due diligence on current systems and propose improvements or refactoring plans.

• Build the foundation for the internal engineering team including hiring support, role definitions, and best-practice processes.

• Drive engineering workflows including coding standards, branching strategy, CI/CD, monitoring, and release management.

• Create comprehensive technical documentation covering system architecture, API specs, deployment playbooks, and SOPs.

• Review code and provide mentorship to engineering resources.

• Coordinate with product and business teams to translate requirements into technical design and actionable development roadmap.

• Troubleshoot and resolve deep-stack issues during development or production.


Technical Expertise Required


Backend


• Java / Spring Boot

• Node.js

• Microservices architecture

• REST / GraphQL


Frontend


• React.js

• Responsive UI, component-based architecture, state management


Mobile


• Flutter

• React Native


Cloud & DevOps


• AWS (ECS / EKS / EC2 / RDS / Lambda / S3 / IAM / CloudWatch etc.)

• CI/CD pipelines (GitHub Actions / Jenkins / GitLab CI or equivalent)

• Docker / Kubernetes

• Infrastructure-as-code (Terraform / CloudFormation)


Database


• MongoDB

• Knowledge of PostgreSQL / MySQL is an added advantage


Professional Attributes


• Strong architectural thinking with the ability to simplify complex systems.

• Excellent communication and stakeholder management skills.

• Ability to work independently without constant supervision.

• Capability to mentor, lead, and build an engineering team from scratch.

• Process-driven mindset with a focus on best practices and documentation.


Deliverables


• Architectural understanding and documentation of current systems.

• Recommendations and implementation plan for system upgrades or restructuring.

• Establishment of core engineering processes and standards.

• Hiring support and technical evaluation of developers.

EaseMyTrip.com

Posted by Zainab Siddiqui
Noida
2 - 3 yrs
₹3L - ₹5L / yr
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Python
NodeJS (Node.js)
GitHub
+5 more

Key Responsibilities:

  • ☁️ Manage cloud infrastructure and automation on AWS, Google Cloud (GCP), and Azure.
  • 🖥️ Deploy and maintain Windows Server environments, including Internet Information Services (IIS).
  • 🐧 Administer Linux servers and ensure their security and performance.
  • 🚀 Deploy .NET applications (ASP.NET, MVC, Web API, WCF, etc.) using Jenkins CI/CD pipelines.
  • 🔗 Manage source code repositories using GitLab or GitHub.
  • 📊 Monitor and troubleshoot cloud and on-premises server performance and availability.
  • 🤝 Collaborate with development teams to support application deployments and maintenance.
  • 🔒 Implement security best practices across cloud and server environments.



Required Skills:

  • ☁️ Hands-on experience with AWS, Google Cloud (GCP), and Azure cloud services.
  • 🖥️ Strong understanding of Windows Server administration and IIS.
  • 🐧 Proficiency in Linux server management.
  • 🚀 Experience in deploying .NET applications and working with Jenkins for CI/CD automation.
  • 🔗 Knowledge of version control systems such as GitLab or GitHub.
  • 🛠️ Good troubleshooting skills and ability to resolve system issues efficiently.
  • 📝 Strong documentation and communication skills.



Preferred Skills:

  • 🖥️ Experience with scripting languages (PowerShell, Bash, or Python) for automation.
  • 📦 Knowledge of containerization technologies (Docker, Kubernetes) is a plus.
  • 🔒 Understanding of networking concepts, firewalls, and security best practices.


Recruiting Bond

Posted by Pavan Kumar
Bengaluru (Bangalore), Mumbai
10 - 16 yrs
₹75L - ₹130L / yr
Distributed Systems
Microservices
Enterprise architecture
System Design & Architecture
Event-Driven Architecture
+29 more

🚨 We’re Building a “Top 1% Engineering Org”


We’re building a high-talent-density, AI-first R&D organization from scratch — inside a publicly listed company undergoing a full-scale transformation.

Think:

→ Rewriting legacy systems into AI-native architectures

→ Embedding LLMs + Agentic AI into core workflows

→ Reimagining platforms, infra, and data systems for the next decade

This is the kind of shift you’d expect from Google, Microsoft, or Meta —

Except you get to build it from day 0 → scale it globally.


About the Role / Team

We are building a next-generation AI-first R&D organization in Bengaluru, focused on solving complex problems across LLMs, Agentic AI systems, distributed computing, and enterprise-scale architectures.


This initiative is part of a publicly listed global company investing heavily in AI-driven transformation, re-architecting its platforms into intelligent, autonomous systems powered by large language models, workflows, and decision engines.


You will be working on:

  • Agentic AI systems & LLM-powered workflows
  • Distributed, scalable backend systems
  • Enterprise-grade AI platforms
  • Automation-first engineering environments


🚀 The Mandate

Own and evolve the technical backbone of an AI-first enterprise platform.


You will define architecture across LLM-powered systems, distributed services, and data platforms — and lead critical transformations from legacy → AI-native systems.


🧩 What You’ll Do

  • Architect large-scale distributed systems powering AI-driven workflows
  • Lead 0→1 and 1→N platform builds (LLM integrations, agentic systems, orchestration layers)
  • Redesign legacy systems into scalable, modular, AI-native architectures
  • Drive system design excellence across teams (APIs, infra, observability, reliability)
  • Make high-stakes decisions on trade-offs (latency, cost, scalability, model performance)
  • Mentor senior engineers and influence engineering culture/org standards
  • Partner with product, data, and leadership on long-term technical strategy


🧠 What We’re Looking For

  • Proven track record building high-scale backend or platform systems
  • Deep expertise in distributed systems, microservices, cloud (AWS/GCP/Azure)
  • Strong exposure to data systems, data infrastructure, and real-time architectures
  • Experience or strong interest in LLMs, GenAI, or AI system design
  • Exceptional system design, abstraction, and problem-solving ability
  • High ownership mindset — you think in terms of systems, not tickets
  • Strong coding skills in Python / Java / Go / Node.js
  • Solid understanding of data structures, system design basics, and backend architecture
  • Experience building scalable APIs and services
  • Familiarity or curiosity around AI/LLMs, async systems, or event-driven design
  • Strong debugging, problem-solving, and ownership mindset
  • Ability to solve hard system problems (latency, scale, reliability)
  • Ability to drive cross-team technical decisions and standards
  • Experience mentoring senior engineers and influencing org-wide architecture
  • Experience designing large-scale distributed systems and backend platforms
  • Mentorship and technical leadership
  • Expertise in system design, scalability, and performance optimization


Nice to Have

  • Experience integrating LLMs, vector databases, or AI pipelines
  • Contributions to architecture at scale
  • Experience with Agentic AI / LLM orchestration frameworks
  • Background in product engineering or platform companies
  • Exposure to global-scale systems (millions of users / high throughput)


🔥 What Sets You Apart

  • Built platforms used by millions of users / high-throughput systems
  • Experience with event-driven systems, stream processing, or infra platforms
  • Prior work on AI/ML platforms, model serving, or intelligent systems
SAAS Industry

Agency job
via Peak Hire Solutions by Dharati Thakkar
Bengaluru (Bangalore)
5 - 8 yrs
₹20L - ₹25L / yr
Amazon Web Services (AWS)
NodeJS (Node.js)
RESTful APIs
NOSQL Databases
Systems design

Job Details

Job Title: Senior Backend Engineer

Industry: SAAS

Function: Information Technology

Experience Required: 5-8 years

Working Days: 6 days a week (5 days in office, Saturdays WFH)

Employment Type: Full Time

Job Location: Bangalore

CTC Range: Best in Industry

 

Preferred Skills: AWS, NodeJS, RESTful APIs, NoSQL

 

Criteria

· Minimum 5 years in backend engineering with strong system design expertise

· Experience building scalable systems from scratch

· Expert-level proficiency in Node.js

· Deep understanding of distributed systems

· Strong NoSQL design skills

· Hands-on AWS cloud experience

· Proven leadership and mentoring capability

· Candidates from SaaS, software, or IT services startups or scale-ups preferred

 

Job Description

The Role:

What You’ll Build:

1. System Architecture & Design

● Architect highly scalable backend systems from the ground up

● Define technology choices: frameworks, databases, queues, caching layers

● Evaluate microservices vs monoliths based on product stage

● Design REST, GraphQL, and real-time WebSocket APIs

● Build event-driven systems for asynchronous processing

● Architect multi-tenant systems with strict data isolation

● Maintain architectural documentation and technical specs

2. Core Backend Services

● Build high-performance APIs for 3D content, XR experiences, analytics, and user interactions

● Create 3D asset processing pipelines for uploads, conversions, and optimization

● Develop distributed job workers for CPU/GPU-intensive tasks

● Build authentication/authorization systems (RBAC)

● Implement billing, subscription, and usage metering

● Build secure webhook systems and third-party integration APIs

● Create real-time collaboration features via WebSockets/SSE

3. Data Architecture & Databases

● Design scalable schemas for 3D metadata, XR sessions, and analytics

● Model complex product catalogs with variants and hierarchies

● Implement Redis-based caching strategies

● Build search and indexing systems (Elasticsearch/Algolia)

● Architect ETL pipelines and data warehouses

● Implement sharding, partitioning, and replication strategies

● Design backup, restore, and disaster recovery workflows

4. Scalability & Performance

● Build systems designed for 10x–100x traffic growth

● Implement load balancing, autoscaling, and distributed processing

● Optimize API response times and database performance

● Implement global CDN delivery for heavy 3D assets

● Build rate limiting, throttling, and backpressure mechanisms

● Optimize storage and retrieval of large 3D files

● Profile and improve CPU, memory, and network performance

5. Infrastructure & DevOps

● Architect AWS infrastructure (EC2, S3, Lambda, RDS, ElastiCache)

● Build CI/CD pipelines for automated deployments and rollbacks

● Use IaC tools (Terraform/CloudFormation) for infra provisioning

● Set up monitoring, logging, and alerting systems

● Use Docker + Kubernetes for container orchestration

● Implement security best practices for data, networks, and secrets

● Define disaster recovery and business continuity plans

6. Integration & APIs

● Build integrations with Shopify, WooCommerce, Magento

● Design webhook systems for real-time events

● Build SDKs, client libraries, and developer tools

● Integrate payment gateways (Stripe, Razorpay)

● Implement SSO and OAuth for enterprise customers

● Define API versioning and lifecycle/deprecation strategies

7. Data Processing & Analytics

● Build analytics pipelines for engagement, conversions, and XR performance

● Process high-volume event streams at scale

● Build data warehouses for BI and reporting

● Develop real-time dashboards and insights systems

● Implement analytics export pipelines and platform integrations

● Enable A/B testing and experimentation frameworks

● Build personalization and recommendation systems

 

Technical Stack:

1. Backend Languages & Frameworks 

● Primary: Node.js (Express, NestJS), Python (FastAPI, Django)

● Secondary: Go, Java/Kotlin (Spring)

● APIs: REST, GraphQL, gRPC


2. Databases & Storage

● SQL: PostgreSQL, MySQL

● NoSQL: MongoDB, DynamoDB

● Caching: Redis, Memcached

● Search: Elasticsearch, Algolia

● Storage/CDN: AWS S3, CloudFront

● Queues: Kafka, RabbitMQ, AWS SQS

 

3. Cloud & Infrastructure

● Cloud: AWS (primary), GCP/Azure (nice to have)

● Compute: EC2, Lambda, ECS, EKS

● Infrastructure: Terraform, CloudFormation

● CI/CD: GitHub Actions, Jenkins, CircleCI

● Containers: Docker, Kubernetes

 

4. Monitoring & Operations 

● Monitoring: Datadog, New Relic, CloudWatch

● Logging: ELK Stack, CloudWatch Logs

● Error Tracking: Sentry, Rollbar

● APM tools

 

5. Security & Auth

● Auth: JWT, OAuth 2.0, SAML

● Secrets: AWS Secrets Manager, Vault

● Security: Encryption (at rest/in transit), TLS/SSL, IAM

 


What We’re Looking For:

1. Must-Haves

● 5+ years in backend engineering with strong system design expertise

● Experience building scalable systems from scratch

● Expert-level proficiency in at least one backend stack (Node, Python, Go, Java)

● Deep understanding of distributed systems and microservices

● Strong SQL/NoSQL design skills with performance optimization

● Hands-on AWS cloud experience

● Ability to write high-quality production code daily

● Experience building and scaling RESTful APIs

● Strong understanding of caching, sharding, horizontal scaling

● Solid security and best-practice implementation experience

● Proven leadership and mentoring capability


2. Highly Desirable

● Experience with large file processing (3D, video, images)

● Background in SaaS, multi-tenancy, or e-commerce

● Experience with real-time systems (WebSockets, streams)

● Knowledge of ML/AI infrastructure

● Experience with HA systems, DR planning

● Familiarity with GraphQL, gRPC, event-driven systems

● DevOps/infrastructure engineering background

● Experience with XR/AR/VR backend systems

● Open-source contributions or technical writing

● Prior senior technical leadership experience

 

Technical Challenges You’ll Solve:

● Designing large-scale 3D asset processing pipelines

● Serving XR content globally with ultra-low latency

● Scaling from thousands to millions of daily requests

● Efficiently handling CPU/GPU-heavy workloads

● Architecting multi-tenancy with complete data isolation

● Managing billions of analytics events at scale

● Building future-proof APIs with backward compatibility

 

Why Join the Company:

● Architectural Ownership: Build foundational systems from scratch

● Deep Technical Work: Solve distributed systems and scaling challenges

● Hands-On Impact: Design and code mission-critical infrastructure

● Diverse Problems: APIs, infra, data, ML, XR, asset processing

● Massive Scale Opportunity: Build systems for exponential growth

● Modern Stack and best practices

● Product Impact: Your architecture directly powers millions of users

● Leadership Opportunity: Shape engineering culture and direction

● Learning Environment: Stay at the forefront of backend engineering

● Backed by AWS, Microsoft, Google

 

Location & Work Culture:

● Location: Bengaluru

● Schedule: 6 days a week (5 days in office, Saturdays WFH)

● Culture: Builder mindset, strong ownership, technical excellence

● Team: Small, highly skilled backend and infra team

● Resources: AWS credits, latest tooling, learning budget

 

Timble Technologies

Posted by Shefali Gupta
Delhi, Gurugram, Noida, Ghaziabad, Faridabad
1 - 4 yrs
₹2L - ₹5L / yr
Advanced Linux Admin
Ansible
Terraform
Docker
Jenkins

Job Title: DevOps Engineer

Location: Delhi, Arjan Garh

Job Type: Full-Time

IMMEDIATE JOINERS REQUIRED

 

About Us:

Timble is a forward-thinking organization dedicated to leveraging cutting-edge technology to solve real-world problems. Our mission is to drive innovation and create impactful solutions through artificial intelligence and machine learning.


About the Role

We are looking for a high-ownership Senior DevOps Engineer to architect and maintain the mission-critical infrastructure supporting our global algorithmic trading operations. You will be the bridge between development and live trading, ensuring zero-latency performance and 100% system availability.

Key Responsibilities

  • Infrastructure Architecture: Design scalable, fault-tolerant systems for high-frequency trading environments.
  • Performance Optimization: Tune Linux servers and Python environments for maximum speed and efficiency.
  • Incident Management: Lead real-time response for live trading systems, performing RCA and preventive fixes.
  • Automation & CI/CD: Build and enhance robust pipelines using Docker, Jenkins, and Ansible.
  • Proactive Monitoring: Implement advanced logging and alerting (Prometheus/Grafana) to ensure high uptime.
  • Database Admin: Manage relational databases and write optimized SQL for operational reporting.
  • Mentorship: Guide junior DevOps members and maintain rigorous system documentation.

Technical Requirements

  • OS/Scripting: Advanced Linux Admin and expert-level Python scripting.
  • IaC & Tools: Hands-on experience with Ansible, Terraform, and Docker.
  • CI/CD: Proficiency in Jenkins or GitLab CI.
  • Data: Strong SQL skills with experience in performance tuning.
  • Education: B.Tech/M.Tech in Computer Science or related engineering field.