50+ Kubernetes Jobs in India
Why this role exists
Our infrastructure footprint is growing faster than our headcount, and we believe most of that
gap should be closed by automation and AI agents — not by hiring more humans to do toil. We
need someone early in their career who treats manual work as a bug, ships scripts and agents
instead of tickets, and wants to grow into deeper ownership over the next two years.
You will not be the most senior person on the team. You will be the one who multiplies the team.
What you'll own
In your first month
• Take ownership of one slice of our CI/CD pipeline and make it measurably
faster, more reliable, or cheaper. We expect a number on a dashboard to move.
• Build at least three internal automations that replace manual ops toil —
using AI agents (Claude Code, agentic CLIs, scripted LLM workflows) as your force
multiplier.
• Be the first responder for a defined set of alerts. Write the runbooks. Drive
the alert volume down.
• Support senior engineers on AI/ML infrastructure (GPU nodes, inference
services, model deployment) — observe, document, and gradually take on contained
changes under review.
By 3 months you should be
• The go-to person for at least two production systems.
• Shipping routine infrastructure changes without needing senior review.
• Treating "manual" as a code smell.
Required (we will reject without these)
• 0–3 years hands-on experience with one major cloud (AWS, GCP, or
Azure — one is fine, depth beats breadth).
• Fluent in Linux command line, bash, and at least one scripting language
(Python or Go preferred).
• Have shipped something to production that real users hit. A side project
counts; a graded coursework lab does not.
• Comfortable with Docker — you can explain what an image vs. a
container is and why it matters.
• Working knowledge of networking fundamentals: DNS, HTTP/HTTPS,
TLS, ports, basic subnets — enough to debug "it works on my machine."
• Git fluency: branches, merges, rebases, conflict resolution.
• CI/CD pipelines — you have authored or substantially modified pipelines
in GitHub Actions, GitLab CI, ArgoCD, Jenkins, or similar. Not just "I clicked Re-run."
• Kubernetes basics — kubectl for real work, can read pod logs,
understand deployments and services, can debug a CrashLoopBackOff without
panicking. You do not need to have run a cluster; you do need to have lived inside one.
• Active user of AI coding agents (Claude Code, Cursor, Copilot, agentic
CLIs, etc.). You should be able to walk us through specific tasks where they made you
faster, and specific tasks where they failed you and how you noticed. "I have tried it" is
not enough.
Bonus (real plus, not required)
• Infrastructure as Code: Terraform, Pulumi, or Ansible.
• Observability: Prometheus/Grafana, Datadog, OpenTelemetry, any APM.
• Have built or extended an LLM-based agent — a custom MCP server, a
scripted multi-step workflow, an internal tool that calls models in a loop. Anything beyond
chat-with-Claude.
• Exposure to GPU workloads, model serving (vLLM, Triton, TGI, etc.), or
ML pipelines.
What we don't care about
• Whether your degree is in CS — or whether you have a degree at all.
• Brand-name companies on your resume.
• Certifications. They are fine. They do not substitute for having shipped.
How we work
• We default to automation. If you do something manually twice, the third
time you script it or hand it to an agent.
• AI agents are part of the workflow, not a novelty. Expect interview
questions about exactly how you use them — and where you have caught them being
wrong.
• Small, reversible changes beat big-bang rollouts.
• Postmortems are blameless and written down.
• We push back on each other. If you only execute, you will be unhappy
here.
How to apply
Send:
• Your resume.
• A short note (≤200 words) describing one infra or automation problem you
solved, and how AI agents factored in — or did not, and why. We read these. Generic
notes get rejected.
Internal note — delete before posting externally
• Comp band, location policy, team name, and reporting line marked
[CONFIRM] need to be filled in before this goes external.
• The Required list is intentionally tight: CI/CD and Kubernetes basics
promoted from bonus. Expect this to filter ~80% of typical junior DevOps applicants. The
remaining pool will skew toward people who have actually shipped infra at a startup, not
bootcamp grads or pure cloud-cert holders.
• IaC, observability, agent-building, and GPU/ML serving stay as bonus.
Promoting any of these to required at 0–3 yrs collapses the pool to near-zero or forces
hiring senior people at junior comp. If you want IaC required, re-level this to mid (3–5
yrs) and raise the band.
• Screening implication: the resume screen should explicitly check for
CI/CD pipeline authorship and any K8s-touching production work. If neither is on the
resume, reject at screen. Do not waste interview slots.
• Pipeline watch: if fewer than ~15 qualified resumes after 2 weeks of
active sourcing, the first thing to relax is the AI-agent-fluency bar (move to bonus and
screen for it in interview instead). Do not relax the "shipped to production" requirement
— that is the load-bearing filter.
Job Description: Java Developer for Trade Processing system
Project Duration: 2 years (with possible extension)
Overview
“Client” Japan is launching a 2+ year project to modernize its settlement processes. The work requires designing and implementing new Java-based interfaces with upstream and downstream systems, together with ETL processes that support trade-flow data across the Global Markets business. The project will create a modern, API-first, micro-services architecture, integrated with the legacy system, while maintaining BAU coverage for existing applications.
We are looking for experienced Java consultants to:
• Build new application components, REST/SOAP APIs and integration flows.
• Design and develop ETL pipelines (e.g., using Apache Camel, Kafka, Spark, or commercial ETL tools).
• Integrate the new components and functions with the existing platform.
• Provide back-fill support for BAU team members who transition to the project.
The role sits within Global Markets IT and collaborates closely with on-shore teams in Tokyo and the offshore team in India.
Key Responsibilities
# Responsibility
1. Develop high-quality, production-ready Java code (Java 8+) for new interfaces and micro-services.
2. Design, implement and document REST/SOAP APIs, following API-First principles and security best-practices.
3. Create and maintain ETL/ELT pipelines using tools such as Apache Camel, Kafka Streams, Spark, or commercial ETL suites.
4. Integrate new applications with the existing legacy system and database while ensuring no impact on existing flows.
5. Participate in technical design workshops; contribute architecture proposals (service-oriented, event-driven).
6. Perform unit testing (JUnit, Mockito) and support integration-test activities; ensure code passes SonarQube/Fortify quality gates.
7. Conduct peer-code reviews and mentor junior developers on Java, design patterns, and DevSecOps practices.
8. Collaborate with offshore and on-shore teams via daily stand-ups, sprint reviews and ad-hoc technical discussions (English).
9. Provide BAU back-fill support when required, ensuring continuity of existing trade-processing applications.
10. Escalate impediments promptly and propose mitigation actions.
11. Continuously improve development processes (CI/CD pipelines, automated testing, static analysis).
Required Skills & Experience
Java Development: ≥ 5 years of professional experience in Java 8+ (Core, Streams, Concurrency).
Frameworks: Strong knowledge of Spring Boot, Spring Framework, Hibernate/JPA.
API Design: Experience designing and implementing RESTful and SOAP services.
ETL / Data Integration: Hands-on experience with ETL tools or frameworks (Apache Camel, Kafka, Spark, Talend, Informatica, etc.).
Databases: Proficient in SQL (Oracle, PostgreSQL, MySQL) and writing stored procedures.
Testing & Quality: Unit-test expertise (JUnit, Mockito) and familiarity with SonarQube, Fortify, Nexus IQ.
DevSecOps: Exposure to CI/CD tools (Bitbucket/Git, Jenkins, Artifactory).
Agile Practices: Worked in Scrum/XP environments; comfortable with TDD and story-point estimation.
Communication: Fluent English (written & spoken) for global collaboration.
Preferred Qualifications
Micro-services & Event-Driven Architecture: Design/implementation experience with Kubernetes, Docker, Kafka, or similar.
API-Management Platforms: Experience with Apigee, Kong, or IBM API Connect.
Domain Knowledge: Understanding of trade-flow, settlement or other capital-markets processes.
Location & Benefits
Client India Solution office or possibly hybrid mode
Senior Software Engineer
Responsibilities:
• Lead by the principle of "customer first" to analyse, debug, develop, and maintain customer-centric software.
• Collaborate closely with multidisciplinary teams to analyse, debug, and fix issues, delivering high-quality code, zero regressions, and scalable, innovative technical solutions.
• Optimize components for maximum performance and scalability.
• Participate in R&D, proofs of concept, prototyping, code review, root-cause analysis, etc.
• At least 2-3 years of experience taking full ownership of the software development lifecycle, including planning, design, architecture, development, testing, and deployment, plus 2+ years of experience supporting production or customer issues and escalations.
• Review and analyze support tickets that are complex in nature and require more technical knowledge to analyze. Investigate issues to identify root causes and document findings clearly.
• Influence development practices so that they follow best practices, policies, and procedures.
• Ensure software products meet all non-functional requirements including operational and security needs.
• Excellent verbal and written communication skills, problem solving skills.
• Address complex technical challenges within software systems, ensuring robustness, compliance, and customer satisfaction.
• Contribute to the knowledge base.
• Support the lead, mentor the team of software engineers, and own the technical health of the service the team is working on.
Requirements:
• Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
• Minimum of 5 years of professional experience in Java development with expertise in core Java, JDK, data structures, and multithreading.
• Strong analytical skills with experience in root cause analysis and fixes.
• Strong experience with Spring and Spring Boot frameworks.
• Strong understanding of software design principles, architecture, and best practices
• Familiarity with server technologies, including Tomcat and WebLogic.
• Proficiency in working with relational databases such as Oracle and PostgreSQL
• Experience with messaging queues, particularly JMS MQ or Artemis MQ.
• Possess exceptional debugging and troubleshooting skills to resolve complex issues across the entire, cross-functional technology stack
• Awareness of or exposure to debugging frontend applications (ReactJS / .NET) is good to have.
• Exposure to Kubernetes, containerization and cloud (AWS) technologies in building scalable, resilient, and distributed environments.
• Excellent problem-solving skills and the ability to work in fast-paced, customer-sensitive production environments.
• Strong communication and collaboration skills. Quick learner.
• Experience and knowledge of working in the HRP application, specifically the Claims functionality. Experience in other areas, including Enrollment, Billing, and Financials, is a plus. Candidates must have previously worked in HRP.
• Familiarity with ticketing systems (Jira, Salesforce) and production support workflows.
Job Title : Senior Node.js Developer (SDE 2)
Experience : 5+ Years
Location : Bengaluru (Whitefield)
Work Mode : Hybrid (3 Days WFO)
Openings : 2
Notice Period : Immediate–20 Days Preferred
Role Overview :
We are looking for an experienced Senior Node.js Developer (SDE 2) with strong expertise in building scalable backend applications, microservices, and distributed systems in Agile environments.
Mandatory Keywords :
Node.js, JavaScript, TypeScript, GraphQL, Microservices, Cloud, REST APIs, Docker, Kubernetes, Redis, System Design.
Key Responsibilities :
• Design, develop, and deploy scalable backend systems using Node.js & TypeScript.
• Build REST APIs and GraphQL services.
• Develop secure, reusable, and high-performance applications.
• Work on microservices, cloud platforms, and production support.
• Follow best practices including TDD, testing, and DevOps processes.
• Mentor team members and contribute to technical excellence.
Mandatory Skills :
• 5+ years of software development experience.
• Strong hands-on experience in Node.js, JavaScript, TypeScript.
• Experience with Express.js/Koa/Sails.js.
• GraphQL, REST APIs, Microservices
• AWS/Azure/GCP Cloud
• Docker, Kubernetes, Redis
• Strong understanding of distributed systems, Git, and DevOps.
Good to Have :
Kafka, RabbitMQ, Apollo Federation, New Relic, Datadog, Splunk, React.js/Next.js, Jest/Mocha/Cucumber
Interview Process :
- L1 – Technical
- L2 – Coding
- L3 – System Design (HLD + LLD)
Engineering Manager
Total Experience - 10+ Years
Work Mode - Remote
What You’ll Do
Core Responsibilities
- Lead end-to-end project delivery from planning to execution.
- Communicate effectively with clients, stakeholders, and internal teams.
- Prior experience in Infrastructure Engineering or Application Development environments.
- Drive technical discussions around system architecture, scalability, and delivery.
- Balance client expectations, project priorities, and team bandwidth effectively.
- Mentor and guide team members while ensuring healthy work-life balance.
- Continuously learn and stay updated with evolving technologies and industry practices.
Management Responsibilities
- Act as the primary point of contact for clients and delivery teams.
- Track milestones, risks, dependencies, and project schedules proactively.
- Create and maintain regular project status reports and delivery metrics.
- Establish efficient technical and operational processes for teams.
- Provide accurate effort estimations and prioritize tasks across modules and teams.
- Build strong feedback loops with internal and external stakeholders.
Organization Building
- Drive key initiatives with organization-wide impact across teams and functions.
- Improve reporting, operational efficiency, and delivery management practices.
What We’re Looking For
- 10+ years of experience in software delivery and engineering management.
- Experience managing and mentoring teams of 10+ members.
- Strong understanding of SDLC, system design, and software architecture.
- Hands-on experience delivering projects in both Fixed Scope and T&M models.
- Background in engineering delivery, ideally having grown from a developer role.
- Strong analytical and reporting skills with experience in delivery metrics.
- Excellent client-facing communication and stakeholder management abilities.
- Experience handling high-pressure situations, risks, and disaster management scenarios.
- Ability to understand business priorities and align team execution accordingly.
Senior Spark QA Engineer – Functional & Performance Testing
Location: Remote
Experience: 5+ Years
Job Description -
We are looking for a Senior Spark QA Engineer with strong expertise in functional and performance testing of Apache Spark applications and distributed data platforms.
Key Responsibilities -
- Perform manual and automated testing of Spark jobs, Spark SQL queries, and ETL pipelines.
- Execute functional, scalability, and performance testing for Spark workloads.
- Set up and manage Spark clusters on Standalone, YARN, Kubernetes, Mesos, Databricks, EMR, and Dataproc.
- Conduct benchmarking, validation, and performance analysis of Spark applications.
- Identify bottlenecks and troubleshoot distributed system issues.
- Lead QA initiatives and mentor team members.
Required Skills -
- 5+ years of QA/testing experience with strong hands-on expertise in Apache Spark.
- Experience in functional and performance testing of distributed systems.
- Strong understanding of Spark architecture and optimization techniques.
- Experience with Kubernetes and cloud-based Spark environments.
- Proficiency in Python or Java.
- Experience with automation frameworks and CI/CD pipelines.
About Us:
We are hiring for a pre-seed funded startup called Zeromoblt (https://zeromoblt.com/), a high-agency Hyderabad-based startup revolutionizing student transportation with lean, intelligent tech stacks.
Our mission: architect world-class systems from scratch—fast, scalable, and algorithmically sharp—using Kotlin, React, AWS (EC2, IoT, IAM), Google Maps, and multi-cloud setups. Stealth mode operations mean you're building 0→1 products with founders, not fixing tickets.
What You'll Do
- Lead end-to-end ownership of complex systems: design, build, deploy, monitor, and iterate at scale.
- Architect high-performance backends in Kotlin (or JVM langs) that handle real-time routing and IoT data.
- Craft scalable React UIs that power ops dashboards and parent-facing apps.
- Drive cloud decisions across AWS, Azure/GCP—optimising costs for our bootstrap runway.
- Apply DSA/system design to solve hard problems like dynamic route optimization and predictive scaling.
- Shape the engineering roadmap: propose, prioritise, and ship features with founders.
- Mentor juniors while executing solo on high-impact bets—no layers, just results.
We're Looking For
- 3-6 years of hands-on engineering where you've owned and shipped production systems (prove it with code/stories).
- Elite CS fundamentals: advanced DSA, system design (distributed systems a must), design patterns.
- Mastery of Kotlin/Java + modern React; real AWS experience (EC2, IAM, CLI—you know our stack).
- Proven "leap-taker": startup grit, side projects, or open-source that screams hunger.
- Figure-it-out velocity: you thrive in chaos, learn our domain overnight, and deliver 10x faster than peers.
This Role Is Not For You If…
- You need structured roadmaps, PM hand-holding, or big-tech process.
- Comfort > impact: stable salary over equity upside and chaos.
- You've never worn all hats (dev, ops, product) in a resource-constrained environment.
Why Join Us
- Massive ownership: lead tech for 10k+ students, direct founder access, shape ZeroMoblt's scale.
- Flat, high-trust team: flexible Hyderabad/remote, no bureaucracy.
- Hungry culture: we hire hustlers. As we scale from 700 to 10k students, your wins are visible daily.
- Hungry to Leap? Apply now!
About MyOperator
MyOperator is a Business AI Operator platform that enables businesses, teams, and AI agents to work together seamlessly for customer operations such as Sales, Support, Escalations, Feedback, and Refund processes. With 12,000+ businesses using our platform, we operate at meaningful scale and power mission-critical communication workflows including voice bots, WhatsApp automation, and intelligent call routing.
We are building for reliability, speed, and impact. MyOperator values ownership, critical thinking, and execution. This is a high-expectation, high-learning environment where engineers are empowered to solve complex problems and build systems that directly affect customer outcomes.
Role Overview
We are looking for a skilled and proactive Site Reliability Engineer (SRE) to take end-to-end ownership of production reliability, observability, and performance engineering across MyOperator’s AI-powered communication infrastructure.
This role is not operational-only — it requires strong system design thinking, deep troubleshooting ability, and a production ownership mindset. You will define reliability standards, build observability frameworks, lead incident response, and drive SLO-based engineering practices across distributed AWS and Kubernetes environments.
Key Responsibilities
- Own production reliability, uptime, latency, and error budgets across critical services.
- Design and manage production-grade monitoring using Grafana, VictoriaMetrics (Prometheus), PromQL, and AWS CloudWatch.
- Define and enforce SLIs, SLOs, and SLA thresholds for AI communication systems (voice bots, WhatsApp APIs, call routing).
- Build real-time operational dashboards for incident response, capacity planning, and leadership visibility.
- Implement end-to-end distributed tracing using OpenTelemetry (OTEL Collector).
- Design and maintain centralized logging with strong correlation between logs, metrics, and traces.
- Create SLO-based alerting systems with minimal noise and fast incident detection.
- Lead incident response lifecycle: alert triage, mitigation, RCA documentation, and preventive improvements.
- Drive MTTR reduction through structured monitoring, automation, and reliability engineering practices.
- Monitor and troubleshoot AWS EKS (Kubernetes) production workloads.
- Instrument and monitor LLM API integrations, AI inference pipelines, and messaging systems.
- Analyze logs using OpenSearch / ELK for anomaly detection and root cause identification.
- Automate operational workflows using Python or Bash to eliminate manual toil.
- Drive performance optimization, scalability improvements, and capacity planning.
- Collaborate with engineering teams to instrument new services from day one.
Required Skills & Qualifications
- 3–6 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles.
- Hands-on experience with:
  - VictoriaMetrics / Prometheus (time-series monitoring)
  - Grafana dashboards and visualization
  - PromQL for writing complex queries and alerts
- Experience implementing distributed tracing using OpenTelemetry (Mandatory).
- Strong experience with centralized logging systems (ELK / OpenSearch / Loki).
- Experience with alerting frameworks such as Alertmanager or Grafana Alerts.
- Strong understanding of SLIs, SLOs, SLA design, and reliability engineering principles.
- Hands-on experience managing AWS production workloads (EC2, RDS, ELB, CloudWatch, IAM).
- Experience with Kubernetes (AWS EKS preferred).
- Familiarity with CI/CD pipelines and automation tools.
- Good understanding of Linux systems, networking, and cloud infrastructure.
- Experience handling production incidents and participating in on-call rotations.
- Ability to automate operational tasks using Python or Bash.
Good to Have
- Experience with OpenSearch / ELK log pipelines and anomaly detection.
- Kubernetes monitoring (pod health, node metrics, autoscaling behavior).
- CI/CD observability integration (Jenkins, GitHub Actions).
- Experience in monitoring LLM APIs and AI inference pipelines.
- Familiarity with MLOps or AI observability tools (Arize, WhyLabs, etc.).
- Service mesh exposure (Istio).
- Infrastructure as Code (Terraform, CloudFormation).
- Experience with chaos engineering or load testing tools.
- Multi-cluster or multi-region architecture exposure.
Key Expectations
- Ownership of production systems and high availability.
- Strong troubleshooting and debugging skills.
- Focus on automation and reliability improvements.
- Proactive approach to incident prevention.
- Ability to reduce alert noise and improve signal quality.
- Data-driven approach to reliability engineering.
This Role Is Not For
- Candidates with purely development experience and no production ownership.
- Candidates without real incident response or on-call experience.
- Freshers or candidates with less than 3 years of experience.
Immediate Hiring: GCP DevOps Engineer | Mumbai & Bengaluru (On-site)
OpsTree Global is urgently hiring a GCP DevOps Engineer with 4–9 years of experience for immediate requirements in Mumbai and Bengaluru.
Key Skills
- Google Cloud Platform (GCP)
- Terraform / Infrastructure as Code (IaC)
- Kubernetes & Helm Charts
- CI/CD – Jenkins, GitLab CI, GitHub Actions
- Linux Administration
- Scripting – Python / Go / Java
Role Responsibilities
- Build and manage scalable cloud infrastructure on GCP
- Automate deployments and infrastructure provisioning
- Ensure system reliability, monitoring, and performance optimization
- Collaborate with development and operations teams for seamless delivery
📍 Locations: Mumbai & Bengaluru (On-site)
⚡ Immediate Joiners Preferred
💼 Experience: 4–9 Years
Job Title: DevOps Engineer
Experience Required: 4-6 Years
Desired Skills: Kubernetes, Docker, AWS, Azure, Python, Java, Terraform
Location: Pune
Employment Type: Full-time
We are looking for a DevOps Engineer who can think of automation at every step of software development. This includes hands-on experience and knowledge of the following -
- 4+ years of professional DevOps experience
- Hands-on expert development skills in Python/Java are a must
- Experience with Infrastructure as code tools such as Terraform
- Hands-on experience with Kubernetes (EKS, AKS, GKE) or other container orchestration tools
- Containerization technologies and tools such as Docker
- Understanding of Cloud computing providers such as AWS, GCP, etc. and their services
- Configuring and spinning up applications and microservices on K8s clusters
- Good academics
- Good communication skills
Immediately available candidates would be preferred.
About Searce
Searce (pronounced 'search') is a global, AI-native, and engineering-led modern technology consultancy. Founded in 2004 with a vision to "solve for better," we partner with organizations to "futurify" their businesses by leveraging the full power of Cloud, AI, and Data Engineering.
With a presence across 10+ countries—including the US, India, Singapore, and Australia—Searce has evolved over two decades into a trusted technology partner for over 3,000 clients. We are not just a service provider; we are a group of "solvers-at-heart" who thrive on complex technical challenges.
Why Join the "Solvers" Brigade?
- Award-Winning Excellence: In 2026, Searce was recognized as the Google Cloud Workplace AI Transformation Partner of the Year (APAC). We are a Premier Google Cloud Partner and a top-tier Managed Services Provider (MSP).
- AI-First Mindset: We specialize in Applied AI (Generative & Conventional), Cloud Modernization, and Location Intelligence, helping industries from FinServ and Healthcare to Retail and Manufacturing reinvent themselves.
- The "Futurify" DNA: We don't just maintain; we improve. We use our proprietary EVLOS business innovation framework to ensure our clients aren't just moving to the cloud, but are staying ahead of the curve.
Our Culture: The HAPPIER Values
We look for individuals who live and breathe our HAPPIER values:
- Humble: We learn from everyone.
- Adaptable: We embrace change as the only constant.
- Positive: We focus on solutions, not just problems.
- Passionate: We are obsessed with engineering excellence.
- Innovative: We challenge the status quo.
- Excellence: We deliver impactful, futuristic outcomes.
- Responsible: We take ownership of our work and its impact.
Your Mission: The Role
solving for better.
You are a reliability-owning, hands-on solver. Not just a "break-fix engineer."
As a DRI (directly responsible individual) for our clients' most critical systems, you’ll be the go-to expert within the squad that ensures their environments are secure, reliable, and optimized 24/7. You will deliver measurable impact – improved uptime, faster response times, and real cost savings. Not just closed tickets. Not just alerts. Real outcomes you engineer yourself.
You will lead the charge on technical execution, from complex troubleshooting and root cause analysis to engineering proactive, automated solutions. This role is about building the future of reliable cloud operations and shipping it into today's production environments.
Your Responsibilities
what you will wake up to solve.
This isn’t a “manage tickets” role. You are the architect, the executor, and the DRI for our Cloud Managed Services GTM, deploying solutions that turn operational noise into hardened outcomes. Here’s how you’ll make your mark:
- Own Service Reliability: You will be the go-to technical expert for 24/7 cloud operations and incident management. You'll ensure strict adherence to SLOs by getting your hands dirty, leading high-stakes troubleshooting to deliver a superior client experience.
- Engineer the Blueprint: You'll translate client needs into scalable, automated, and secure cloud architectures. You will write and maintain the operational playbooks and Infrastructure as Code (IaC) that your squad uses every day.
- Automate with Intelligence: You'll lead the charge from the keyboard to futurify our operations. You'll embed AI-driven automation, predictive monitoring, and AIOps into core processes to eliminate toil and preempt incidents.
- Drive FinOps & Impact: You'll own the technical execution of the FinOps framework. You will continuously analyze, configure, and optimize cloud spend for clients through hands-on engineering.
- Be the Expert in the Room: You'll share your knowledge through internal demos, documentation, and technical deep dives, representing the deep expertise that turns operational complexity into business resilience.
- Mentor & Elevate: You will be a technical mentor for your peers. Through code reviews and collaborative problem-solving, you'll help build a high-performing squad that lives the “Always Hardened” mindset.
Experience & Relevance
We are looking for future technology leaders, not just coders. We value raw intelligence, analytical rigor, and an obsessive passion for technology over any prior experience.
- Cloud Operations Pedigree: 4+ years of experience in Azure cloud infrastructure, with a significant portion in cloud managed services. Hands-on experience in Kubernetes is mandatory.
- Commercial Acumen: Proven track record of building and scaling a net-new managed services business.
- Client-Facing Tech Acumen: 2+ years of experience in a client-facing technical role, acting as the trusted advisor for cloud operations, security, and reliability.
Functional Skills:
- Service Delivery Mindset: A deep understanding of MSP business models, SLAs, and the importance of client satisfaction in an operational context.
- Client Engagement: Ability to ask appropriate questions to get to the heart of an operational issue and win trust with stakeholders.
- Cross-Functional Catalyst: Thrive in multi-disciplinary teams, bringing together operations, security, and development teams.
- Repository builder: Creates reusable frameworks, IaC modules, and operational playbooks for scale.
ABOUT EGNYTE
Egnyte is the secure multi-cloud platform for content security and governance that enables organizations to better protect and collaborate on their most valuable content. Established in 2008, Egnyte has democratized cloud content security for more than 23,000 organizations, helping customers improve data security, maintain compliance, prevent and detect ransomware threats, and boost employee productivity on any app, any cloud, anywhere. For more information, visit www.egnyte.com.
Egnyte is looking for a Performance Engineer to join our performance engineering team. As a Performance Engineer, you will drive proactive monitoring and improve automation for regular operational tasks.
WHAT YOU’LL DO:
- Develop tools to measure & monitor performance bottlenecks within the application
- Triage reported performance issues and translate them into reproducible test scenarios
- Collaborate with production engineering and application engineering teams to design and execute production scenarios that assess code and third-party performance
- Develop and run PSR tests and measure stats for them
- Work with various sub-teams to ensure the SLAs of their core APIs are tracked and maintained release over release
- Application and architecture code profiling
- Infrastructure and application performance tuning
- Troubleshooting performance issues
- Ability to distill volumes of data, analyze performance results, and diagnose performance problems
- Capacity estimating, modeling, or planning
YOUR QUALIFICATIONS:
- Bachelor’s degree in computer science or related field. Advanced degree preferred
- 5+ years of work experience in performance engineering
- Expert knowledge of and strong experience with load-testing tools such as LoadRunner and JMeter, plus an understanding of APM solutions such as Grafana, AppDynamics, and Dynatrace
- Experience with microservice architecture, Docker, Kubernetes, Jenkins, Azure, GCP and application monitoring tools
- Strong expertise on monitoring and analyzing application logs, database reports, system metrics like CPU Utilization, Memory usage, Network usage, Garbage Collection and DB Parameters
- Strong expertise on identifying potential performance issues and providing recommendations to improve performance
- Proficiency in JVM technology and JVM troubleshooting skills
- Proficiency in debugging applications in production at large scale
- JVM Profiling, GC Analysis and Tuning experience
- Experience with Performance, Load, Stress, and Scalability Testing
Role Overview:
Virtana is looking for a Senior DevOps Engineer to join our R&D Infrastructure team. In this role, you won't just follow conventions — you'll help redefine them. You will own the architecture, build, and day-to-day operations of the GCP-based cloud platform that powers Virtana's SaaS products and the AI-driven observability experience our Global 2000 customers depend on. This is a hands-on senior individual contributor role with meaningful technical leadership scope, working alongside engineers and architects on a unified observability platform.
Work Location: Pune
Job Type: Hybrid
Role Responsibilities:
- GCP Cloud Operations: Develop, deploy, operate, and support production cloud infrastructure primarily on GCP — leveraging GKE, BigTable, BigQuery, Dataflow, Cloud Storage, IAM, and core networking services.
- Reliability & SLAs: Ensure production systems are running at all times with multiple levels of redundancy to meet committed SLAs; lead incident response, root cause analysis, and post-incident reviews.
- Build & Release Automation: Design, implement, and continuously improve scalable CI/CD pipelines and test frameworks leveraged by QA and development teams across the company.
- Infrastructure as Code: Manage large-scale, repeatable deployments using Terraform, Ansible, Puppet, or SaltStack; champion Git-based workflows and version control standards for distributed engineering teams.
- Security & Availability: Own the ongoing maintenance, security, patching, and availability of services in line with tight operations, security, and procedural models.
- Monitoring & Alerting: Plan and deliver high-value monitoring and alerting features to support operations, support, and customer-facing reliability — eating our own dog food with the Virtana Platform wherever possible.
- Capacity & Cost: Forecast capacity, plan upgrades, patches, and migrations, and drive cloud cost efficiency across hybrid and multi-cloud environments.
- Cross-Functional Partnership: Work with development, operations, and support personnel to identify, isolate, and diagnose issues; handle support escalations and drive permanent fixes.
Required Qualifications:
- Bachelor's degree in Computer Science / Engineering or equivalent relevant experience.
- 5–7 years of professional hands-on DevOps / SRE experience supporting production cloud environments.
- Strong, demonstrable production experience on GCP — including GKE, BigTable, BigQuery, Dataflow, IAM, and core GCP networking services.
- Deep, hands-on expertise with container orchestration (Kubernetes) and Docker in production.
- Advanced proficiency with at least one infrastructure-as-code / configuration management tool: Terraform, Ansible, Puppet, or SaltStack.
- Solid understanding of networking, firewalls, load balancers, DNS, and database operations.
- Strong working knowledge of Git-based workflows and version control standards for distributed engineering teams.
- Comfort operating hybrid environments that include both Linux and Windows ecosystems.
- Excellent verbal and written communication skills, with the ability to explain highly technical topics to both technical and non-technical audiences.
- Self-motivated, detail-oriented, and able to work both independently and within a globally distributed team.
Good to Have:
- Strong scripting skills and a demonstrated ability to automate operational toil — Python preferred; Bash, Go, or Groovy a plus.
- Hands-on experience designing and operating CI/CD pipelines with Jenkins (Spinnaker, GitHub Actions, or GitLab CI also welcome).
- Exposure to AWS or other public clouds in addition to GCP.
- Experience operating SaaS platforms built on microservices architectures.
Key responsibilities
• Design, build, and maintain robust CI/CD pipelines using Azure DevOps Services (Azure Pipelines) and Git-based workflows.
• Implement and manage infrastructure as code (IaC) using ARM templates, Bicep, and/or Terraform for repeatable environment provisioning.
• Containerize applications (Docker) and manage container orchestration platforms such as AKS (Azure Kubernetes Service).
• Automate build, test, release, and rollback processes; integrate automated testing and quality gates into pipelines.
• Monitor and improve platform reliability and observability using logging and monitoring tools (e.g., Azure Monitor, Application Insights, Prometheus, Grafana).
• Drive platform security and compliance through pipeline controls, secrets management (Key Vault / Vault), and secure configuration practices.
• Implement cost-optimization and governance for Azure resources (tags, policies, budgets).
• Troubleshoot build/release failures, production incidents, and performance bottlenecks; perform root-cause analysis and implement permanent fixes.
• Mentor developers in Git workflows, pipeline authoring, best practices for IaC, and cloud-native design.
• Maintain clear documentation: runbooks, deployment playbooks, architecture diagrams, and pipeline templates.
Required skills & experience
• 4+ years hands-on experience working with Azure and cloud-native application delivery.
• Deep experience with Azure DevOps (Repos, Pipelines, Artifacts, Boards).
• Strong IaC skills with Terraform, ARM templates, or Bicep.
• Solid experience with CI/CD design and YAML pipeline authoring.
• Practical knowledge of containerization (Docker) and Kubernetes — preferably AKS.
• Scripting skills: PowerShell, Bash, and/or Python for automation.
• Experience with Git workflows (branching strategies, PRs, code reviews).
• Familiarity with configuration management and secrets management (Azure Key Vault, HashiCorp Vault).
• Understanding of networking, identity (Azure AD), and security fundamentals in Azure.
• Strong troubleshooting, debugging, and incident response skills.
• Good collaboration and communication skills; ability to work across teams.
Certification
AZ-400 (Microsoft Certified: DevOps Engineer Expert), AZ-104, AZ-305, or Terraform Associate.
The Role
As a Senior Site Reliability Engineer at Blitzy's Pune headquarters, you will be the backbone of our platform's reliability, scalability, and operational excellence. You'll work at the intersection of software engineering and infrastructure, ensuring our AI-powered development platform remains highly available and performant as we scale rapidly. This is a high-impact, hands-on role for an engineer who thrives in a fast-moving environment and takes deep ownership of the systems they build.
What Success Looks Like
- In 30 days: You have a deep understanding of Blitzy's infrastructure architecture, have identified key reliability risks, and are actively contributing to on-call rotations.
- In 90 days: You have shipped meaningful improvements to observability, incident response workflows, and deployment pipelines that measurably reduce MTTR and increase system uptime.
- In 6 months: You have driven at least one major reliability initiative from inception to production, established SLO/SLA frameworks for critical services, and are a trusted technical voice shaping our infrastructure roadmap.
Areas of Ownership
- Design, build, and operate scalable, fault-tolerant infrastructure across cloud environments (AWS, GCP, or Azure).
- Define and enforce SLOs, SLAs, and error budgets; lead blameless postmortems and drive systemic improvements.
- Build and maintain robust CI/CD pipelines, release automation, and deployment infrastructure.
- Own observability: design and maintain logging, metrics, tracing, and alerting stacks (e.g., Prometheus, Grafana, Datadog, OpenTelemetry).
- Partner closely with software engineering teams to embed reliability practices into the development lifecycle.
- Drive capacity planning, performance benchmarking, and cost optimization across our infrastructure.
- Champion security best practices within the infrastructure and deployment layers.
Required Experience
- 5+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles.
- Strong proficiency in at least one major cloud platform (AWS preferred); experience with Kubernetes and container orchestration at scale.
- Hands-on experience with infrastructure-as-code tools (Terraform, Pulumi, or equivalent).
- Proven track record designing and maintaining high-availability, distributed systems.
- Deep expertise in observability tooling, incident management, and on-call practices.
- Strong scripting and automation skills (Python, Go, Bash, or similar).
- Excellent communication skills with the ability to collaborate across engineering teams and present technical findings to leadership.
What Makes You Stand Out
- Experience supporting AI/ML workloads or GPU-accelerated infrastructure.
- Prior experience in a high-growth startup environment where you wore multiple hats.
- Familiarity with eBPF, service mesh technologies (Istio, Linkerd), or advanced networking.
- Contributions to open-source SRE/DevOps tooling or communities.
- Experience building global, multi-region infrastructure with strict latency and availability requirements.
What Makes This Role Different
You won't be maintaining legacy systems or fighting fires in a sprawling monolith. At Blitzy, you're building reliability into a greenfield AI platform that is redefining how the world creates software. You'll have direct influence over architectural decisions, work side-by-side with world-class engineers, and see the tangible impact of your work as we scale to serve Fortune 500 customers. As a founding member of the Pune SRE team, you'll help shape the culture and technical standards of a team that will grow with the company.
The Role
As a DevOps Engineer at Blitzy's Pune headquarters, you'll build and operate the infrastructure that powers our AI agents and the applications they produce. You'll work at the intersection of cloud infrastructure, developer tooling, and AI-native systems — designing the pipelines, clusters, and automation that allow Blitzy to ship production-ready software at machine speed. This is a hands-on, high-ownership role for an engineer who moves fast, automates everything, and cares deeply about developer experience and system reliability.
What Success Looks Like
- Kubernetes clusters are running reliably at scale, with clear deployment standards, Helm-managed releases, and minimal manual intervention required from engineering teams.
- CI/CD pipelines are fast, consistent, and trusted — developers ship confidently knowing the automation handles the rest.
- Observability is comprehensive: alerts are actionable, dashboards are meaningful, and incidents are resolved faster because the right data is always available.
- Infrastructure provisioning is fully automated — no snowflake environments, no manual setup, everything reproducible through code.
- AI agent orchestration infrastructure is stable and scalable, directly enabling Blitzy's core product to deliver for enterprise customers.
- Engineering teams notice the difference — developer productivity is measurably higher and infrastructure is no longer a bottleneck to shipping.
Areas of Ownership
- Build and manage Kubernetes clusters supporting AI agent workloads and application deployment at scale.
- Design, implement, and maintain CI/CD pipelines for application and AI service delivery — ensuring speed, reliability, and repeatability.
- Automate infrastructure provisioning and dynamic scaling using Python scripts and Terraform IaC.
- Deploy and manage applications using Helm charts; own packaging standards and release automation.
- Build and maintain comprehensive observability stacks — alerting, distributed tracing, metrics, and logging (e.g., Prometheus, Grafana, Datadog, OpenTelemetry).
- Monitor and maintain production services and APIs; own incident response and drive blameless postmortems.
- Build dedicated infrastructure for AI agent orchestration and management, enabling Blitzy's core autonomous development capabilities.
- Collaborate with engineering teams on deployment strategies and continuously improve developer experience through tooling and automation.
Required Experience
- 5–8 years of DevOps, infrastructure, or platform engineering experience.
- Python proficiency for scripting, automation, and infrastructure tooling.
- Deep Kubernetes expertise — cluster management, workload deployment, scaling, and troubleshooting.
- Hands-on Helm experience for application packaging and release management.
- Proven ability to design and implement CI/CD pipelines across complex, multi-service environments.
- Practical experience with at least one major cloud platform (AWS, GCP, or Azure).
- Terraform proficiency for infrastructure-as-code provisioning and state management.
- Strong Linux administration and containerization fundamentals (Docker, OCI).
What Makes You Stand Out
- CKA (Certified Kubernetes Administrator) certification.
- Familiarity with MLOps tooling such as MLflow, Kubeflow, or similar platforms for AI/ML workload management.
- Experience with microservices architecture and distributed systems design.
- Knowledge of API gateways and service mesh technologies (Istio, Linkerd, or equivalent).
- Prior experience in a high-growth AI or software startup where you moved fast and owned broadly.
- Track record of meaningfully improving developer productivity through platform and tooling investments.
What Makes This Role Different
Most DevOps roles have you maintaining existing systems. At Blitzy, you're building the infrastructure layer for a platform that autonomously writes enterprise software — a genuinely new category of product. You'll work on AI agent orchestration, Kubernetes at scale, and developer tooling that is directly responsible for how fast Blitzy delivers value to Fortune 500 customers. As an early member of the Pune engineering team, you'll have outsized influence over our infrastructure culture and technical direction. High performers are eligible for company equity — giving you real ownership in what you build.
About the Role
We are seeking a proactive and detail-oriented Site Reliability Engineer (SRE) with 3+ years of experience to ensure high availability, reliability, and performance of production systems.
This role focuses on automation, observability, incident management, and cross-team coordination to drive operational excellence.
Key Responsibilities
· Maintain reliable, scalable, and secure production environments.
· Implement and manage monitoring, alerting, and logging solutions.
· Contribute to defining and tracking SLIs/SLOs and support error budget practices.
· Automate operational tasks to improve efficiency and reduce manual effort.
· Perform troubleshooting and Root Cause Analysis (RCA) for production incidents.
· Optimize system performance, availability, and capacity.
· Maintain runbooks, SOPs, and incident documentation in Confluence.
· Adhere to change management, deployment governance, and disaster recovery standards.
· Support incident response for critical production services.
Collaboration & Tools
· Coordinate with external vendors and internal cross-functional teams.
· Work closely with Engineering, Product Owners, and Operations teams.
· Manage incidents and changes using ServiceNow & JIRA.
· Collaborate through Slack and structured communication channels.
Technical Skills
Systems & Cloud
· Strong knowledge of Windows and Linux/Unix systems.
· Solid understanding of networking fundamentals (DNS, TCP/IP, Load Balancing, Firewalls).
· Experience with at least one cloud platform (AWS, Azure, or GCP).
Automation & CI/CD
· Proficiency in one scripting/programming language (Python, Go, Bash, PowerShell, or Java).
· Understanding of CI/CD pipelines and automation practices.
Containers & Observability
· Hands-on experience with Docker and Kubernetes.
· Experience with monitoring tools such as Grafana or Power BI.
· Ability to analyze logs, metrics, and traces for troubleshooting.
ITSM & Documentation
· Experience with ServiceNow & JIRA (incident/change/problem workflows).
· Working knowledge of Confluence for technical documentation and knowledge management.
Additional Experience (Preferred)
· Background in DevOps, Cloud Engineering, or Platform Engineering.
· Understanding of security best practices and compliance standards.
· Familiarity with AI-assisted engineering tools (Claude Code, Jellyfish, GitHub Copilot).
· Exposure to large-scale or production-grade systems.
Soft Skills
· Strong analytical and troubleshooting mindset.
· Excellent written and verbal communication skills.
· Effective stakeholder and vendor coordination.
· Ownership-driven and composed during high-severity incidents.
Applying to jobs at Ampera is completely free. We never ask candidates for any payment.
About Simbian
Simbian is at the forefront of cybersecurity innovation, leveraging purpose-built AI Agents to deliver 10x security outcomes for global enterprises and MSSPs. Our platform autonomously investigates and responds to alerts, freeing security teams from repetitive tasks. Simbian combines privacy-first technology, proven integration with 70+ enterprise tools, and rapid deployment for measurable value.
Role Overview
We are seeking a collaborative, innovative DevOps Engineer passionate about enabling secure, scalable operations for cutting-edge cybersecurity products. Join our team during a period of high growth and help architect the future of agentic AI security platforms.
Key Responsibilities
• Kubernetes Management:
o Manage and maintain production-grade Kubernetes clusters across multiple cloud providers (AWS is essential, Azure is valuable, GCP is a plus).
o Deploy, upgrade, troubleshoot, and scale stateful and stateless workloads (NGINX, Postgres, MongoDB, OpenCTI, OpenSearch, Kafka, Hadoop, Fluentd) in Kubernetes.
• Cloud Operations:
o Operate and optimize cloud environments, with strong expertise in AWS (AWS Certified Solutions Architect Professional or equivalent Azure cert preferred).
o Design, deploy, and manage infrastructure on AWS and Azure (GCP optional).
• SQL Database Management:
o Administer SQL databases, ideally Postgres, on Kubernetes clusters or cloud VMs.
o Perform routine maintenance, backups, upgrades, monitoring, and optimization.
• Infrastructure as Code:
o Build, install, upgrade, and maintain Helm charts with expertise.
o Use and understand Ansible for cloud automation (AWS/Azure), and Terraform for infrastructure provisioning.
• Monitoring, Logging, Observability:
o Implement and manage logging and metrics stacks using OpenSearch/Elasticsearch, Prometheus, Grafana, Thanos or similar open source tools.
• Programming & Scripting:
o Develop automation scripts in Bash (proficient with control structures).
o Produce scripts or microservices in Node.js (preferred) or Python/Django (bonus).
• CI/CD:
o Build and maintain CI/CD pipelines preferably using GitHub Actions (Jenkins or equivalent is acceptable).
• Containerization:
o Create, manage, and troubleshoot Docker/Podman containers, images, volumes, and use Docker Compose for local development.
• Customer-Facing On-Prem Deployments (Bonus):
o Install, configure, and support Kubernetes on customer premises.
o Demonstrate ownership, initiative, and strong customer communication skills.
o Solid knowledge of Linux administration, networking, and cloud environments.
What You’ll Bring:
• 4+ years’ experience in DevOps, SRE, or Production Engineering.
• Mastery of Kubernetes, AWS, infrastructure automation, and database management.
• Strong collaborative, curious, and growth-driven mindset.
• Ability to challenge ideas, drive innovation, and embrace rapid change.
• Excellent communication for technical customer interactions.
Why Join Simbian?
• Work with pioneering agentic AI security—impact global security teams.
• Shape infrastructure for privacy-first technology in a high-growth startup.
• Enjoy a dynamic remote-first work culture with opportunities for ownership and advancement.
We are currently hiring for a Junior DevOps Developer.
Please see the job description for the position below.
Job Description: Junior DevOps Developer (6 Months – 1.5 Years Experience)
Job Title: Junior DevOps Developer
Experience: 6 months to 1.5 years
Employment Type: Full-time
About the Role:
We are looking for a motivated Junior DevOps Developer to support our development and operations teams. You will assist in managing cloud infrastructure, improving deployment processes, and maintaining system reliability.
Key Responsibilities:
- Assist in managing and maintaining cloud infrastructure (AWS/GCP/Azure)
- Support CI/CD pipeline setup and maintenance
- Help automate deployment processes and routine tasks
- Monitor system performance and troubleshoot issues
- Assist in containerization using Docker and Kubernetes
- Perform root cause analysis for production issues
- Collaborate with developers to improve system performance and scalability
- Maintain documentation for infrastructure and processes
- Work with cloud platforms and infrastructure, including Hetzner
Required Skills:
- Basic understanding of DevOps concepts and workflows
- Knowledge of cloud platforms like AWS, GCP, or Azure
- Familiarity with Docker and Kubernetes
- Basic understanding of Infrastructure as Code tools (Terraform is a plus)
- Knowledge of Git and version control systems
- Basic scripting knowledge (Bash/Python preferred)
Good to Have:
- Exposure to CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD)
- Understanding of monitoring tools (Grafana, Prometheus)
You can contact me on this WhatsApp number: Nine three one six one two zero one three two
Platform Engineer – Cloud & On-Prem Infrastructure
Location - Pune or Bangalore (WFO- 5 days)
Must-Have Skills:
- 8+ years deploying, upgrading, and maintaining infrastructure across on-premises and public cloud, with Kubernetes and Docker
- Proficiency in Infrastructure as Code using Terraform or Pulumi
- Hands-on coding in Golang or Python, plus Bash scripting
Good-to-Have Skills:
- Familiarity with Kubernetes management solutions (OpenShift, Rancher, GKE, EKS, AKS, VMware TKG)
- Experience with VM management platforms (e.g., Red Hat OpenShift Virtualization, VMware)
- Kubernetes certifications (CKA, CKAD)
- Exposure to service mesh technologies (Istio, Linkerd)
Who You Are
- A platform engineer who builds and maintains the infrastructure backbone for both on-prem and cloud environments
- Passionate about automating daily operations and eliminating manual toil
- Comfortable authoring and evolving IaC (Infrastructure as Code) templates to enforce consistency.
What You’ll Do & Learn
- Roll out & maintain on-premises and cloud infrastructure for development and testing environments
- Implement & support CI/CD pipelines to drive our software delivery processes
- Develop automation tools that streamline routine operations and improve reliability
- Build & enhance Infrastructure-as-Code templates (Terraform, Pulumi) for rapid, repeatable provisioning
- Document system designs, configurations, and processes to enable an asynchronous, distributed team culture
Location: Chennai (Hybrid)
Commitment: Minimum 2 Years (Excluding 3 months of Probation)
Experience Level: Fresher / Entry Level
Job Overview:
We are looking for a skilled and versatile System Administrator with strong expertise in Windows and Linux environments, along with working knowledge of cloud infrastructure, cybersecurity, automation, and AI/ML systems.
The ideal candidate should be capable of handling enterprise IT infrastructure, supporting multi-cloud environments, and contributing to AI/ML deployment and integration activities. Strong communication skills and the ability to collaborate with technical and client-facing teams are essential.
Key Responsibilities:
- Manage and maintain Windows and Linux server environments ensuring stability, performance, and security.
- Support deployment, configuration, and administration of IT infrastructure components across on-prem and cloud environments.
- Monitor system health, troubleshoot issues, and ensure high availability of services.
- Work with cloud platforms such as AWS, Microsoft Azure, and Google Cloud.
- Assist in implementation of security solutions including IAM, firewalls, endpoint protection, and SIEM tools.
- Develop and maintain automation scripts using Python, PowerShell, or JavaScript.
- Support deployment and integration of AI/ML models into production environments.
- Collaborate with engineering and development teams to optimize infrastructure and application performance.
- Participate in technical discussions, documentation, and client support activities when required.
Required Skills & Qualifications:
- Strong knowledge of Windows and Linux system administration.
- Good understanding of networking, servers, and cloud fundamentals.
- Experience or exposure to AWS, Azure, or GCP.
- Proficiency in scripting languages such as Python, PowerShell, or JavaScript.
- Basic understanding of cybersecurity principles and system hardening.
- Familiarity with AI/ML concepts and deployment workflows is an advantage.
- Strong analytical and troubleshooting skills.
- Excellent verbal and written communication skills.
Preferred Qualifications:
- Experience with virtualization and containerization (VMware, Docker, Kubernetes).
- Knowledge of CI/CD pipelines and DevOps practices.
- Exposure to MLOps concepts and model deployment workflows.
- Understanding of monitoring tools and logging systems.
- Experience working in hybrid or enterprise IT environments.
What We Offer:
- Exposure to enterprise-level infrastructure and cloud environments.
- Opportunity to work on real-world AI/ML integration projects.
- Structured career growth into Cloud, DevOps, Security, or AI/ML engineering roles.
- Collaborative work environment with hands-on learning opportunities.
- Competitive compensation and long-term growth path.
Who Should Apply:
- Freshers or candidates with up to 2 years of experience.
- Candidates passionate about system administration, cloud computing, and AI/ML.
- Individuals eager to work in infrastructure-heavy, production environments.
- Strong communicators who can work in team-oriented and client-facing roles.
About Us
We believe the future of software development is AI-native — where engineers operate at a higher level of abstraction and quality remains non-negotiable.
Incubyte is a software craft consultancy where the “how” of building software matters as much as the “what”.
We partner with companies of all sizes, from helping enterprises build, scale, and modernize to helping early-stage founders bring their ideas to life.
Our engineers operate in an AI-native development model, using AI as a collaborator across the SDLC to accelerate development while upholding the discipline of software craftsmanship. Guided by Software Craftsmanship and Extreme Programming practices, we build reliable, maintainable, and scalable systems with speed, without compromising quality. If this way of building software resonates with you, we’d like to talk.
Our Guiding Principles
These principles define how we work at Incubyte. They are non-negotiable.
Relentless Pursuit of Quality with Pragmatism
We build high-quality systems without losing sight of delivery.
Extreme Ownership
We take responsibility end-to-end for decisions, execution, and outcomes.
Proactive Collaboration
We collaborate closely, challenge each other, and solve problems together.
Active Pursuit of Mastery
We continuously improve our craft and raise our bar.
Invite, Give, and Act on Feedback
We seek, give, and act on feedback to get better every day.
Ensuring Client Success
We act as trusted partners and focus on real outcomes, not just output.
Job Description
This is a remote position.
Experience Level
This role is ideal for engineers with 3-5 years of experience and a strong background in building secure, scalable platforms.
We are looking for hands-on DevOps and Backend Engineers with real-world experience in application/feature development, system design, testing practices such as TDD, full-stack development, handling production incidents, distributed systems, and modern infrastructure challenges.
What You’ll Do as a Software Craftsperson
- Design and document real-world DevOps and backend scenarios based on production incidents such as outages, scaling challenges, and secure deployments
- Translate real engineering experiences into benchmark tasks that contribute to training next-generation AI systems
- Contribute to building secure, scalable, Kubernetes-native architectures across modern infrastructure environments
- Work across critical engineering domains including CI/CD pipelines, observability, identity & access management, infrastructure-as-code, and backend services
- Collaborate with internal teams to design and simulate realistic engineering workflows and system behaviors
- Apply practical engineering judgment to model distributed systems challenges and improve system resilience and reliability
Requirements
What You’ll Bring
3-5 years of experience in DevOps and Backend Engineering with a strong foundation in building secure, scalable systems.
Strong hands-on expertise in DevOps and backend technologies (Node.js/Java/Go/Python) including:
- Kubernetes, Terraform, and CI/CD pipelines
- Tools such as k9s, k3s (GitLab CI preferred)
- Backend technologies such as Go, Python, or Java
- Experience with Docker, gRPC, and Kubernetes-native services
Demonstrated experience working with secure, offline or air-gapped deployments (highly preferred)
Familiarity with distributed systems and backend architecture, with exposure to ML or distributed pipelines being a plus.
Hands-on experience across multiple core functional areas, with exposure to at least five of the following:
- Identity & Access Management
- Observability (Prometheus + Grafana)
- CI/CD Pipelines
- Keycloak
- GitLab CI
- Terraform OSS
- Kubernetes ecosystem tools
Strong problem-solving ability with real-world experience in handling production systems, incidents, and infrastructure challenges
Ability to work across multiple layers of the stack, from infrastructure to backend services, while ensuring scalability, reliability, and security
Benefits
Life at Incubyte
We are a remote-first company with structured flexibility. Teams commit to shared rhythms during core hours, ensuring smooth collaboration while maintaining autonomy. Twice a year, we come together in person for a co-working sprint and once a year for a retreat - with all travel expenses covered.
Our environment is built for crafters: experimenting with real-world systems, solving complex infrastructure challenges, and contributing to cutting-edge AI initiatives. We are all lifelong learners, and our work is our passion.
Perks
- Dedicated learning & development budget
- Sponsorship for conference talks
- Comprehensive medical & term insurance
- Employee-friendly leave policies
- Home Office fund
- Medical Insurance

A Bengaluru-based IT services and consulting firm.
We are looking for a highly skilled Linux-focused DevOps Engineer with strong troubleshooting capabilities. The ideal candidate must have hands-on experience in Linux systems, OpenStack, Kubernetes, and Ansible automation. This role requires deep debugging skills and structured root cause analysis.
Key Responsibilities
• Troubleshoot complex infrastructure and application issues through deep log analysis.
• Manage and optimize Linux-based production environments.
• Deploy and manage workloads in Kubernetes clusters.
• Provision and manage infrastructure using OpenStack.
• Automate configuration management using Ansible.
• Diagnose networking issues including DNS, routing, and firewall configurations.
• Develop automation scripts using Bash or Python.
• Participate in production incident handling and RCA documentation.
Required Skills & Experience
• Strong Linux internals knowledge (systemd, processes, memory, I/O).
• Experience analyzing system logs (/var/log, journalctl, dmesg).
• Hands-on Kubernetes production troubleshooting.
• OpenStack VM provisioning and networking knowledge.
• Ansible playbook and role development experience.
• Strong scripting skills (Bash or Python).
Good to Have
• Terraform exposure.
• Monitoring tools such as Prometheus, Grafana, ELK.
• Experience in multi-tenant environments.
• On-call production support experience.
Role & Responsibilities
Own the Client’s Outcome:
- Embed with enterprise customers – on-site and remotely – to understand their supply chain operations, data estate, and what success actually looks like for their business.
- Scope and design technical solutions for messy, real-world logistics problems – with a clear line to measurable impact: cost per delivery, SLA performance, empty kilometres.
- Own the full deployment lifecycle: architecture through go-live through steady-state. You’re accountable for the outcome, not just the code.
Build and Ship:
- Design, build, and maintain backend services in Node.js or Python that power routing, planning, and execution at enterprise scale.
- Build and own the integrations connecting Locus to client ERPs, TMS, WMS, and OMS platforms – these integrations are often the riskiest part of a deployment.
- Write production code that runs under real load. If it isn’t in production, it hasn’t shipped.
Be the Technical Interface with the Client:
- Run architecture reviews, lead integration workshops, and represent Locus in executive steering meetings. You need to be credible at every level of the client organisation.
- Bring field learnings back into the product and platform teams. Some of Locus’s best features started as a client workaround.
- Push back when a client request would compromise platform integrity – and propose a better alternative.
Show Up On-Site:
- Travel to client sites – domestic and international, up to ~30% of the time – for kick-offs, integration sprints, go-lives, and post-live reviews.
- Build the kind of relationship where the client’s ops lead calls you directly when something goes wrong at 2am, not a support ticket.
- Be comfortable wherever the work is: a warehouse floor, a logistics control tower, a C-suite boardroom.
Make the Next Deployment Easier:
- Document architecture decisions, integration patterns, and deployment playbooks – every engagement should make the next one faster.
- Work closely with Product, Customer Success, and Platform Engineering. Share what you’re seeing in the field; don’t wait to be asked.
- Mentor junior FDEs and raise the technical bar across the team.
Ideal Candidate
- Strong Forward Deployed / Field Engineer
- Mandatory (Experience 1): Must have 5+ years of backend engineering experience with hands-on coding in Node.js or Python, building production-grade systems
- Mandatory (Experience 2): Must have minimum 2+ years in client-facing / deployment-heavy roles, where they worked directly with enterprise customers
- Mandatory (Experience 3): Must have experience shipping and owning production systems end-to-end: From design → build → deployment → post-production support
- Mandatory (Tech Skills 1 - Backend & Systems): Strong in: Node.js or Python (must-have), Building scalable backend services
- Mandatory (Tech Skills 2 - Integrations): Must have experience with: Enterprise integrations (APIs, third-party systems), Systems like ERP / TMS / WMS / OMS
- Mandatory (Tech Skills 3 - Data & Messaging): Hands-on with: Relational + NoSQL databases, Event streaming / queues (Kafka / RabbitMQ or similar)
- Mandatory (Tech Skills 4 - Cloud & Deployment): Experience with: Cloud platforms (AWS / GCP / Azure), Docker + Kubernetes (or containerised deployments)
- Mandatory (Company): Top Product companies / Startups / SaaS / platform companies
Role: Senior Software Developer (Backend) - .Net with Microservices & Cloud
YOE: 6.5+ years
Skills: C#, ASP.NET, Microservices Architecture, ASP.NET Core, Web API development, Azure Kubernetes Service (AKS), API Gateway / Azure API Management, Entra (Authentication), Azure Service Bus, Azure Functions, Azure Blob Storage, Caching, NoSQL Databases
About the role:
The Senior Software Developer designs, builds, tests, and – most importantly – ships high-value software
that solves real problems. Strives for security, performance, simplicity, usability, and maintainability.
Mentors and guides less experienced software engineers.
Responsibilities:
1. Team Contribution
● Works within established agile methods, promoting an atmosphere of continuous improvement.
● Continuously learns new technologies and patterns and practices.
● Documents knowledge for the benefit of the team.
● Reports to the team on obstacles and roadblocks.
● Participates in, and occasionally leads, sprint planning, standups, retrospectives, and other team meetings.
● Promotes patterns and best practices on the team.
● Mentors and guides the less experienced software engineers.
2. Planning and Design
● Works with the product team and stakeholders to refine and document requirements.
● Estimates effort for planning purposes.
● Designs and documents enterprise-level software architecture, consulting with Enterprise Architecture when appropriate.
3. Development
● Writes code to develop software that meets requirements and specifications.
● Follows the established software development life cycle (SDLC).
● Writes code with readability and future maintenance in mind.
● Follows established source control standards and best practices.
● Adheres to established secure coding practices.
● Reviews code for other developers.
● Leads team-based development efforts.
4. Quality Assurance
● Validates QA findings and fixes defects.
● Develops integration and testing points in the software that allow for QA testing.
● Assists QA in running performance and load tests.
5. Release
● Assists with release planning and releases.
6. Support
● Assists the support team as needed, including root cause analysis.
● Writes maintenance and metric statistics scripts and entry points for measuring and monitoring.
Requirements:
Solid Understanding of The Following:
● Microservices Architecture:
● Microservices design principles (bounded contexts, loose coupling)
● API-first design and contract management
● Event-driven design principles
● Asynchronous messaging patterns
● Eventual consistency concepts
● Idempotency and message replay handling
● ASP.NET Core Web API development
● Web Apps
● Azure Kubernetes Service (AKS)
● Azure Blob Storage usage and lifecycle management
● API Gateway / Azure API Management concepts
● Entra (Authentication)
● Azure Service Bus
● Azure Functions
● Caching
● NoSQL Databases
Processes & Standards: Git, GitFlow, OO Programming, Kanban, Secure Coding, & Agile Methodologies
Bonus Skills:
● Excellent written and verbal communication
● Excellent documentation
● Continuous learning
● Collaboration across team and functional boundaries
● Troubleshooting and creative problem solving
● Design simple architecture that supports complex applications and APIs
● Architect extensible databases
● Author complex component-based client applications and restful APIs
● Perform advanced CRUD operations against multiple data sources
● Manipulate enterprise level data structures
● Mentor less experienced team members
● Take ownership of team processes and legacy applications
● Perform business analysis tasks, such as requirements gathering and wireframing
Your Mission: The Role
solving for better.
You are a reliability-owning, hands-on solver. Not just a "break-fix engineer."
As a DRI (directly responsible individual) for our clients' most critical systems, you’ll be the go-to expert within the squad that ensures their environments are secure, reliable, and optimized 24/7. You will deliver measurable impact – improved uptime, faster response times, and real cost savings. Not just closed tickets. Not just alerts. Real outcomes you engineer yourself.
You will lead the charge on technical execution, from complex troubleshooting and root cause analysis to engineering proactive, automated solutions. This role is about building the future of reliable cloud operations and shipping it into today's production environments.
Your Responsibilities
what you will wake up to solve.
This isn’t a “manage tickets” role. You are the architect, the executioner and the DRI for our Cloud Managed Services GTM, deploying solutions that turn operational noise into hardened outcomes. Here’s how you’ll make your mark:
- Own Service Reliability: You will be the go-to technical expert for 24/7 cloud operations and incident management. You'll ensure strict adherence to SLOs by getting your hands dirty, leading high-stakes troubleshooting to deliver a superior client experience.
- Engineer the Blueprint: You'll translate client needs into scalable, automated, and secure cloud architectures. You will write and maintain the operational playbooks and Infrastructure as Code (IaC) that your squad uses every day.
- Automate with Intelligence: You'll lead the charge from the keyboard to futurify our operations. You'll embed AI-driven automation, predictive monitoring, and AIOps into core processes to eliminate toil and preempt incidents.
- Drive FinOps & Impact: You'll own the technical execution of the FinOps framework. You will continuously analyze, configure, and optimize cloud spend for clients through hands-on engineering.
- Be the Expert in the Room: You'll share your knowledge through internal demos, documentation, and technical deep dives, representing the deep expertise that turns operational complexity into business resilience.
- Mentor & Elevate: You will be a technical mentor for your peers. Through code reviews and collaborative problem-solving, you'll help build a high-performing squad that lives the “Always Hardened” mindset.
Experience & Relevance
We are looking for future technology leaders, not just coders. We value raw intelligence, analytical rigor, and an obsessive passion for technology over any prior experience.
- Cloud Operations Pedigree: 3+ years of experience in AWS cloud infrastructure, with a significant portion in cloud managed services.
- Commercial Acumen: Proven track record of building and scaling a net-new managed services business.
- Client-Facing Tech Acumen: 2+ years of experience in a client-facing technical role, acting as the trusted advisor for cloud operations, security, and reliability.
Functional Skills:
- Service Delivery Mindset: A deep understanding of MSP business models, SLAs, and the importance of client satisfaction in an operational context.
- Client Engagement: Ability to ask appropriate questions to get to the heart of an operational issue and win trust with stakeholders.
- Cross-Functional Catalyst: Thrive in multi-disciplinary teams, bringing together operations, security, and development teams.
- Repository builder: Creates reusable frameworks, IaC modules, and operational playbooks for scale.
Join the ‘real solvers’
ready to futurify?
If you are excited by the possibilities of what an AI-native engineering-led, modern tech consultancy can do to futurify businesses, apply here and experience the ‘Art of the possible’. Don’t Just Send a Resume. Send a Statement.
Your Mission: The Role
solving for better.
You are a reliability-owning, hands-on solver. Not just a "break-fix engineer."
As a DRI (directly responsible individual) for our clients' most critical systems, you’ll be the go-to expert within the squad that ensures their environments are secure, reliable, and optimized 24/7. You will deliver measurable impact – improved uptime, faster response times, and real cost savings. Not just closed tickets. Not just alerts. Real outcomes you engineer yourself.
You will lead the charge on technical execution, from complex troubleshooting and root cause analysis to engineering proactive, automated solutions. This role is about building the future of reliable cloud operations and shipping it into today's production environments.
Your Responsibilities
what you will wake up to solve.
This isn’t a “manage tickets” role. You are the architect, the executioner and the DRI for our Cloud Managed Services GTM, deploying solutions that turn operational noise into hardened outcomes. Here’s how you’ll make your mark:
- Own Service Reliability: You will be the go-to technical expert for 24/7 cloud operations and incident management. You'll ensure strict adherence to SLOs by getting your hands dirty, leading high-stakes troubleshooting to deliver a superior client experience.
- Engineer the Blueprint: You'll translate client needs into scalable, automated, and secure cloud architectures. You will write and maintain the operational playbooks and Infrastructure as Code (IaC) that your squad uses every day.
- Automate with Intelligence: You'll lead the charge from the keyboard to futurify our operations. You'll embed AI-driven automation, predictive monitoring, and AIOps into core processes to eliminate toil and preempt incidents.
- Drive FinOps & Impact: You'll own the technical execution of the FinOps framework. You will continuously analyze, configure, and optimize cloud spend for clients through hands-on engineering.
- Be the Expert in the Room: You'll share your knowledge through internal demos, documentation, and technical deep dives, representing the deep expertise that turns operational complexity into business resilience.
- Mentor & Elevate: You will be a technical mentor for your peers. Through code reviews and collaborative problem-solving, you'll help build a high-performing squad that lives the “Always Hardened” mindset.
Experience & Relevance
We are looking for future technology leaders, not just coders. We value raw intelligence, analytical rigor, and an obsessive passion for technology over any prior experience.
- Cloud Operations Pedigree: 3+ years of experience in GCP cloud infrastructure, with a significant portion in cloud managed services.
- Commercial Acumen: Proven track record of building and scaling a net-new managed services business.
- Client-Facing Tech Acumen: 2+ years of experience in a client-facing technical role, acting as the trusted advisor for cloud operations, security, and reliability.
Functional Skills:
- Service Delivery Mindset: A deep understanding of MSP business models, SLAs, and the importance of client satisfaction in an operational context.
- Client Engagement: Ability to ask appropriate questions to get to the heart of an operational issue and win trust with stakeholders.
- Cross-Functional Catalyst: Thrive in multi-disciplinary teams, bringing together operations, security, and development teams.
- Repository builder: Creates reusable frameworks, IaC modules, and operational playbooks for scale.
Join the ‘real solvers’
ready to futurify?
If you are excited by the possibilities of what an AI-native engineering-led, modern tech consultancy can do to futurify businesses, apply here and experience the ‘Art of the possible’. Don’t Just Send a Resume. Send a Statement.
Location: Bangalore preferred / Hybrid as applicable
Experience: 3+ years
Education: B.E/B.Tech in Computer Science, Engineering or a related technical discipline
Salary: Above market standards, flexible for the right candidate
Career growth: Long-term opportunity with potential to lead DevOps architecture and cloud platform operations
About FrontM
FrontM builds software platforms for frontline workforces operating in remote and low-connectivity environments, with a strong focus on the maritime industry. The platform supports communication, collaboration, healthcare, learning, welfare and operational workflows across mobile, web, kiosk and connected device environments.
The platform runs across cloud infrastructure, constrained networks and specialised customer environments, requiring reliable DevOps practices, strong observability, secure architecture and careful operational discipline.
Role Summary
As a Senior DevOps Engineer, you will take ownership of FrontM’s AWS cloud infrastructure, CI/CD pipelines, platform reliability and technical operations. You will work closely with the VP of Delivery, CTO and CEO to maintain secure, scalable and high-availability infrastructure for FrontM’s production systems.
This role requires strong hands-on DevOps experience, broad AWS knowledge, Kubernetes experience and the ability to troubleshoot complex networking and production issues across multi-domain SaaS environments.
Key Responsibilities
Cloud Infrastructure & DevOps Architecture (≈45%)
· Own, maintain and improve AWS cloud infrastructure for FrontM platforms
· Create and maintain Terraform scripts for infrastructure deployment and management
· Manage Kubernetes workloads deployed within AWS EKS
· Support multi-zone AWS infrastructure design for availability, resilience and scale
· Maintain AWS services including Route 53, EC2, API Gateway, VPC, VPN, AWS Cognito, ElastiCache, DynamoDB and Lambda
· Contribute to DevOps architecture planning in line with FrontM’s platform roadmap
CI/CD, Operations & Platform Reliability (≈35%)
· Build, maintain and improve CI/CD pipelines for backend and platform services
· Oversee technical operations with hands-on administration, monitoring and release support
· Ensure continuous server uptime, stability, performance and maintainability
· Debug, respond to and restore system outages in production and staging environments
· Improve observability across infrastructure and applications, including migration from Elastic stack to logz.io
· Support backend stability, scale and performance across Node.js, Java and related services
Security, Networking & Production Support (≈20%)
· Maintain AWS security configurations, access controls and monitoring practices
· Support complex networking requirements across multi-domain SaaS implementations
· Troubleshoot network, infrastructure and access issues with internal teams and customer-side users
· Work with backend teams to support API integrations and infrastructure abstractions for complex requirements
· Document operational procedures, incident findings and technical support steps clearly
Required Technical Skills
Cloud Infrastructure & AWS
· Strong hands-on experience with AWS infrastructure and cloud operations
· Experience with Route 53, EC2, API Gateway, VPC, VPN, AWS Cognito, ElastiCache, DynamoDB and Lambda
· Experience with AWS security setup, monitoring and multi-zone infrastructure
· Ability to manage infrastructure using Terraform
Kubernetes, CI/CD & Observability
· Strong experience with Kubernetes, preferably AWS EKS
· Extensive CI/CD and DevOps experience
· Experience with infrastructure observability and application monitoring tools
· Ability to diagnose production bottlenecks, server failures and performance issues
Backend, Networking & SaaS Operations
· Experience supporting Node.js, Java and backend system procedures for stability and scale
· Good understanding of APIs, integrations and backend service dependencies
· Experience with complex networking and multi-domain SaaS implementations
· Ability to troubleshoot technical issues with non-technical end users
Nice to Have
· Experience with MongoDB clusters in MongoDB Atlas
Personal Attributes
· Strong ownership mindset for uptime, reliability and production stability
· Practical problem-solving approach with the ability to act quickly during incidents
· Clear written and spoken communication in English
· Ability to work independently and coordinate with senior management when required
· Comfortable working in fast-moving engineering teams
· Attention to detail in security, monitoring, documentation and operational processes
Why join FrontM?
Long-Term Career Growth
Opportunity to work on cloud infrastructure used by global maritime and remote workforce customers, with scope to grow into DevOps architecture and platform leadership roles.
Engineering Challenges That Matter
Work on infrastructure that supports applications used in remote, low-bandwidth and operationally demanding environments.
Broad Technical Ownership
Take responsibility across cloud infrastructure, Kubernetes, CI/CD, observability, networking, security and production reliability.
Apply now
Join a team focused on building reliable software infrastructure for real-world use cases and contribute to systems used across the global maritime workforce.
Position: Microsoft .NET Full Stack Developer
Experience: 4–6 Years
Open Positions: 10
Location: PAN India (Final Round – Face-to-Face Interview)
Budget: Up to 15 LPA
Notice Period: Immediate joiners preferred
Key Responsibilities:
· Work on highly distributed and scalable system architecture
· Design, develop, test, and maintain high-quality software solutions
· Ensure performance, security, and maintainability of applications
· Collaborate with cross-functional teams and stakeholders
· Perform system testing and resolve technical issues
Required Skills:
· Strong experience in ASP.NET, C#, .NET Core, MVC
· Hands-on experience with SQL Server / PostgreSQL
· Experience in Angular / React (Frontend technologies)
· Knowledge of microservices architecture & RESTful APIs
· Familiarity with CQRS pattern
· Exposure to AWS / Docker / Kubernetes
· Experience with CI/CD pipelines (Azure DevOps, Jenkins)
· Knowledge of Node.js is an added advantage
· Understanding of Agile methodology
· Good exposure to cybersecurity and compliance
Technology Stack:
· Microsoft .NET technologies (primary)
· Cloud platforms: AWS (SaaS/PaaS/IaaS)
· Databases: MSSQL, MongoDB, PostgreSQL
· Caching: Redis, Memcached
· Messaging queues: RabbitMQ, Kafka, SQS
Core Responsibilities:
- Design & Development: Architect and implement scalable backend services and APIs using Python or Golang, ensuring high performance, resilience, and extensibility.
- System Ownership: Take end-to-end ownership of critical modules, from design and development to deployment and support.
- Technical Leadership: Conduct design and code reviews, enforce best practices, and mentor junior engineers to raise the team’s technical bar.
- Collaboration: Work closely with product managers, architects, and other engineers to translate business requirements into technical solutions.
- Performance & Reliability: Troubleshoot complex issues in production systems, identify root causes, and design sustainable long-term solutions.
- Innovation: Evaluate new technologies, contribute to proof-of-concepts, and recommend tools that can improve developer productivity.
- Process Improvement: Drive initiatives to improve coding standards, CI/CD pipelines, and automated testing practices.
- Knowledge Sharing: Document designs, create technical guides, and share insights with the broader engineering team.
Experience and Expertise:
- 4–7 years of backend development experience with Python or Golang.
- Strong expertise in designing, developing, and scaling microservices and distributed systems.
- Solid understanding of concurrency, multi-threading, and performance optimization.
- Proficiency with databases (SQL/NoSQL), caching systems (Redis, Memcached), and messaging systems (Kafka, RabbitMQ, etc.).
- Hands-on experience with Linux development, Docker, and Kubernetes.
- Familiarity with cloud platforms (AWS/GCP/Azure) and related services.
- Strong debugging, profiling, and optimization skills for production-grade systems.
- Experience with AI-powered development tools is a strong plus; familiarity with concepts like 'agentic coding' for workflow automation or 'context engineering' for leveraging LLMs in system design is highly desirable.
Skills:
- Strong problem-solving ability, with experience handling complex technical challenges.
- Ability to lead technical initiatives and mentor junior engineers.
- Excellent communication skills to collaborate with cross-functional teams and articulate trade-offs.
- Self-motivated, proactive, and able to operate independently while aligning with team goals.
- Passionate about engineering culture, quality, and developer productivity.
Core Responsibilities:
- Design, develop, and maintain backend services and APIs using Python or Golang.
- Write high-quality, testable, and maintainable code with a focus on performance and scalability.
- Implement automated tests and contribute to CI/CD pipelines.
- Collaborate with product, QA, and DevOps teams for end-to-end feature delivery.
- Troubleshoot production issues and provide timely resolutions.
- Participate in design and architecture discussions to improve system efficiency.
- Contribute to improving development processes, coding standards, and best practices.
Experience and Expertise:
- 2–4 years of experience in backend development with Python or Golang.
- Solid understanding of RESTful APIs, microservices, and distributed systems.
- Strong knowledge of data structures, algorithms, and OOP principles.
- Hands-on experience with relational and/or NoSQL databases.
- Familiarity with Linux development, Docker, and basic cloud concepts (AWS/GCP/Azure).
- Proficiency with Git and version control workflows.
- Familiarity with AI-powered development tools or exposure to projects involving large language models (LLMs) is a plus.
Skills:
- Strong analytical and debugging skills with the ability to solve complex problems.
- Good communication and collaboration skills across teams.
- Ability to work independently with minimal supervision while being a strong team player.
- Growth mindset – eagerness to learn new technologies and improve continuously.
Core Responsibilities:
- Design, develop, and maintain backend services using Python or Golang.
- Write clean, efficient, and well-documented code following best practices.
- Build and consume RESTful APIs and microservices.
- Collaborate with QA, DevOps, and product teams for smooth feature delivery.
- Participate in peer code reviews and technical discussions.
- Debug and fix issues, ensuring system stability and performance.
- Continuously learn and apply new technologies and tools in backend development.
Experience and Expertise:
- 0–2 years of software development experience (internships or projects acceptable).
- Proficiency in at least one backend programming language (Python or Golang).
- Strong understanding of object-oriented programming and software fundamentals.
- Knowledge of data structures, algorithms, and database concepts.
- Familiarity with Linux-based development environments.
- Exposure to Git and version control workflows.
Skills:
- Strong analytical and problem-solving ability.
- Willingness to learn, adapt, and take ownership.
- Effective communication and teamwork skills.
- Curiosity for emerging technologies, including AI-driven development, backend technologies, distributed systems, and modern engineering practices.
Role Overview:
As a Backend Developer at LearnTube.ai, you will ship the backbone that powers 2.3 million learners in 64 countries—owning APIs that crunch 1 billion learning events & the AI that supports it with <200 ms latency.
Skip the wait and get noticed faster by completing our AI-powered screening. Click this link to start your quick interview. It only takes a few minutes and could be your shortcut to landing the job! https://bit.ly/LT_Python
What You'll Do:
At LearnTube, we’re pushing the boundaries of Generative AI to revolutionize how the world learns. As a Backend Engineer, your roles and responsibilities will include:
- Ship Micro-services – Build FastAPI services that handle ≈ 800 req/s today and will triple within a year (sub-200 ms p95).
- Power Real-Time Learning – Drive the quiz-scoring & AI-tutor engines that crunch millions of events daily.
- Design for Scale & Safety – Model data (Postgres, Mongo, Redis, SQS) and craft modular, secure back-end components from scratch.
- Deploy Globally – Roll out Dockerised services behind NGINX on AWS (EC2, S3, SQS) and GCP (GKE) via Kubernetes.
- Automate Releases – GitLab CI/CD + blue-green / canary = multiple safe prod deploys each week.
- Own Reliability – Instrument with Prometheus / Grafana, chase 99.9 % uptime, trim infra spend.
- Expose Gen-AI at Scale – Publish LLM inference & vector-search endpoints in partnership with the AI team.
- Ship Fast, Learn Fast – Work with founders, PMs, and designers in weekly ship rooms; take a feature from Figma to prod in < 2 weeks.
What makes you a great fit?
Must-Haves:
- 3+ yrs Python back-end experience (FastAPI)
- Strong with Docker & container orchestration
- Hands-on with GitLab CI/CD, AWS (EC2, S3, SQS) or GCP (GKE / Compute) in production
- SQL/NoSQL (Postgres, MongoDB)
- You’ve built systems from scratch and have solid system-design fundamentals
Nice-to-Haves
- k8s at scale, Terraform
- Experience with AI/ML inference services (LLMs, vector DBs)
- Go / Rust for high-perf services
- Observability: Prometheus, Grafana, OpenTelemetry
About Us:
At LearnTube, we’re on a mission to make learning accessible, affordable, and engaging for millions of learners globally. Using Generative AI, we transform scattered internet content into dynamic, goal-driven courses with:
- AI-powered tutors that teach live, solve doubts in real time, and provide instant feedback.
- Seamless delivery through WhatsApp, mobile apps, and the web, with over 1.4 million learners across 64 countries.
Meet the Founders:
LearnTube was founded by Shronit Ladhani and Gargi Ruparelia, who bring deep expertise in product development and ed-tech innovation. Shronit, a TEDx speaker, is an advocate for disrupting traditional learning, while Gargi’s focus on scalable AI solutions drives our mission to build an AI-first company that empowers learners to achieve career outcomes. We’re proud to be recognised by Google as a Top 20 AI Startup and are part of their 2024 Startups Accelerator: AI First Program, giving us access to cutting-edge technology, credits, and mentorship from industry leaders.
Why Work With Us?
At LearnTube, we believe in creating a work environment that’s as transformative as the products we build. Here’s why this role is an incredible opportunity:
- Cutting-Edge Technology: You’ll work on state-of-the-art generative AI applications, leveraging the latest advancements in LLMs, multimodal AI, and real-time systems.
- Autonomy and Ownership: Experience unparalleled flexibility and independence in a role where you’ll own high-impact projects from ideation to deployment.
- Rapid Growth: Accelerate your career by working on impactful projects that pack three years of learning and growth into one.
- Founder and Advisor Access: Collaborate directly with founders and industry experts, including the CTO of Inflection AI, to build transformative solutions.
- Team Culture: Join a close-knit team of high-performing engineers and innovators, where every voice matters, and Monday morning meetings are something to look forward to.
- Mission-Driven Impact: Be part of a company that’s redefining education for millions of learners and making AI accessible to everyone.
Job Title : Backend Engineer (AI-First | FinTech/Crypto)
Experience : 3 to 6 Years
Location : Gurugram (Sector 49)
Working Hours : 10:00 AM – 6:00 PM
Work Mode : On-site | 6 Days Working
Employment Type : Full-time
Role Overview :
This is not a typical ticket-based engineering role. You will take end-to-end ownership of systems—designing architecture, building scalable solutions, and solving real-world performance challenges.
We operate in an AI-first engineering environment, leveraging advanced tools and automation workflows to build high-performance distributed systems.
Mandatory Skills :
Java/Spring Boot or Node.js, System Design (HLD/LLD), Distributed Systems, Event-Driven Architecture (Kafka/RabbitMQ), Low-Latency APIs, PostgreSQL/MongoDB, CI/CD, Docker/Kubernetes, AI-assisted development (Copilot/Cursor/Claude)
Key Responsibilities :
- Design and build scalable backend systems (Java/Spring Boot, Node.js, or similar).
- Architect and implement event-driven systems (Kafka, RabbitMQ, pub/sub).
- Develop secure and reliable financial systems with strong data integrity.
- Solve scalability and performance challenges in fintech/crypto environments.
- Own features end-to-end: design → development → deployment → monitoring.
- Work with real-time data pipelines (WebSockets, streaming, event sourcing).
- Define service contracts and optimize system architecture.
AI-First Engineering (Must-Have Mindset) :
You will :
- Use tools like GitHub Copilot, Cursor, and Claude in daily development
- Follow spec-driven development using structured instructions
- Review, validate, and ship AI-generated code with strong engineering judgment
Core Requirements :
- 3+ years of backend development experience.
- Strong expertise in Java (Spring Boot) or Node.js.
- Solid understanding of System Design (HLD/LLD, Distributed Systems).
- Experience with event-driven architectures (Kafka, RabbitMQ, async pipelines).
- Hands-on experience building low-latency, high-throughput systems.
- Strong database knowledge (PostgreSQL, MongoDB, etc.).
- Understanding of security, performance optimization, and reliability.
- Experience with CI/CD, Git, Docker, Kubernetes.
- Exposure to React / React Native is a plus.
Good to Have (Differentiators) :
- Experience in FinTech / Crypto / Web3 / Blockchain.
- Built systems for trading, payments, or real-time financial data.
- Experience with AI agents, automation pipelines, or agent-based systems.
- Exposure to parallel AI workflows (coding / testing / refactoring).
- Contributions to open source or technical blogs.
- Experience handling production-scale systems.
Role Overview
We are looking for a hands-on Senior Telephony Engineer who actively writes production-grade code and has deep experience with Asterisk-based systems, Java backend development, and high-scale dialler platforms.
Key Responsibilities
This is NOT an architecture-only role; we need someone who can:
- Write code
- Debug real-time call issues
- Build and optimize telephony flows end-to-end
Key Responsibilities (Hands-on Coding Focus)
- Develop and maintain Asterisk dialplans, AGI scripts, and call flows
- Build Java-based backend services for telephony control and orchestration
- Implement and optimize predictive / preview / progressive diallers
- Integrate the telephony stack with Kafka and RabbitMQ
- Write scalable code for call routing, retry logic, and queue handling
- Work directly on SIP signalling, RTP flows, and debugging call issues
- Handle real-time call events, CDR processing, and logging pipelines
- Optimize systems for high concurrency (thousands of parallel calls)
- Debug production issues such as call drops, latency, one-way audio, and SIP failures
Qualifications & Skills
- Bachelor's degree in Computer Engineering; Master's is a plus.
Telephony (Core Requirement)
- Strong hands-on experience with Asterisk
- Deep knowledge of SIP / RTP / VoIP, dialplans, and AGI / AMI
- Experience building or maintaining dialers (very important)
Backend Development
- Strong coding skills in Java (Spring Boot preferred)
- Experience building microservices / APIs
- Comfortable writing high-performance, low-latency code
Messaging & Event Systems
- Hands-on experience with Apache Kafka and RabbitMQ
- Ability to implement event-driven systems
Scaling & Performance
- Experience handling high call volumes (1000+ concurrent calls)
- Understanding of multi-threading, queue management, and load handling
Good to Have
- Experience with predictive dialers
- Exposure to WebRTC / real-time communication
- Experience with Docker / Kubernetes
- Understanding of TRAI / Indian telecom ecosystem
- Experience with FreeSWITCH (bonus)
What We Are NOT Looking For
- Pure solution architects who don't code
- People with only theoretical telecom knowledge
- Candidates without real dialer / Asterisk production experience
What We Are Looking For
Someone who has:
- Written real dialplans and backend code
- Debugged live call issues
- Worked on production telephony systems
- A problem solver who can go deep into logs, packets, and code
Impact of the Role
You will directly contribute to building a high-scale telephony + AI voice platform, working on real-time systems that handle thousands of concurrent calls.
Job Title : DevOps Engineer
Experience : 3+ Years
Location : Indiranagar, Bengaluru (Work From Office – 5 Days)
Employment Type : Full-Time
Work Timings : 11:00 AM to 7:00 PM IST
Notice Period : Immediate Joiners Preferred
Role Overview :
We are seeking a skilled DevOps Engineer with 3+ years of experience in building and managing scalable cloud-native infrastructure.
The ideal candidate will have strong expertise in Kubernetes and Helm, along with hands-on experience in deploying and maintaining production-grade systems on cloud platforms.
This role offers an opportunity to work in a high-growth startup environment, contributing to both existing systems and new infrastructure development.
Key Responsibilities :
- Design, deploy, and manage scalable infrastructure using Kubernetes.
- Build and maintain CI/CD pipelines for efficient and automated deployments.
- Manage and optimize cloud environments (preferably GCP).
- Implement Infrastructure as Code using Helm/Terraform.
- Monitor system performance and ensure high availability and reliability.
- Handle bug fixes, system improvements, and performance optimization.
- Collaborate with engineering teams to design scalable microservices architecture.
- Implement logging, monitoring, and alerting solutions.
- Ensure security best practices including IAM, secrets management, and network policies.
Mandatory Skills :
- Strong hands-on experience with Kubernetes.
- Expertise in Helm Charts.
- Experience with Google Cloud Platform (GCP).
- Hands-on experience with ArgoCD or similar CI/CD tools.
- Knowledge of CI/CD tools like Jenkins, GitHub Actions, GitLab CI.
- Experience in database hosting and scaling.
Nice to Have :
- Exposure to other cloud platforms (AWS/Azure).
- Experience with modern DevOps and automation tools.
- Ability to quickly learn and adapt to new technologies.
Team & Work Scope :
- No dedicated DevOps team currently – high ownership role.
- Work on both existing systems (maintenance & improvements) and new system builds (greenfield projects).
- Opportunity to shape DevOps practices and infrastructure from scratch.
Preferred Candidate Profile :
- 3+ years of relevant DevOps experience.
- Strong problem-solving and debugging skills.
- Experience working in fast-paced startup environments.
- Understanding of scalability, security, and performance optimization.
- Good communication and collaboration skills.
Hiring Process :
- Profile Screening
- GT Assessment
- Technical Interview – Round 1
- Technical Interview – Round 2
- Final Round (if required, with the US team)
Key Responsibilities:
• Work on highly distributed and scalable system architecture
• Design, develop, test, and maintain high-quality software solutions
• Ensure performance, security, and maintainability of applications
• Collaborate with cross-functional teams and stakeholders
• Perform system testing and resolve technical issues
Required Skills:
• Strong experience in ASP.NET, C#, .NET Core, MVC
• Hands-on experience with SQL Server / PostgreSQL
• Experience in Angular / React (Frontend technologies)
• Knowledge of microservices architecture & RESTful APIs
• Familiarity with CQRS pattern
• Exposure to AWS / Docker / Kubernetes
• Experience with CI/CD pipelines (Azure DevOps, Jenkins)
• Knowledge of Node.js is an added advantage
• Understanding of Agile methodology
• Good exposure to cybersecurity and compliance
Technology Stack:
• Microsoft .NET technologies (primary)
• Cloud platforms: AWS (SaaS/PaaS/IaaS)
• Databases: MSSQL, MongoDB, PostgreSQL
• Caching: Redis, Memcached
• Messaging queues: RabbitMQ, Kafka, SQS
Budget: 35 LPA to 45 LPA
Work schedule is Mon to Fri, 3:30am to 12:30pm IST
Key Responsibilities:
- Design, develop, and deploy computer vision and machine learning models for analyzing visual and document-based data.
- Build pipelines that convert unstructured visual inputs into structured and usable information.
- Develop and evaluate models for tasks such as object detection, segmentation, document parsing, and image understanding.
- Apply OCR and related techniques to extract meaningful information from complex documents and imagery.
- Work with large datasets and build efficient training and evaluation pipelines.
- Handle real-world visual datasets that may contain noise, inconsistencies, incomplete information, or varying formats.
- Experiment with different approaches to solve challenging computer vision problems and evaluate tradeoffs between accuracy, performance, and complexity.
- Collaborate with product and engineering teams to integrate machine learning models into scalable production systems.
- Continuously improve model performance, accuracy, and robustness in real-world environments.
- Stay up to date with the latest developments in AI and computer vision and apply relevant techniques where appropriate.
- Actively leverage modern AI tools and frameworks to accelerate experimentation, development, and engineering workflows.
Requirements:
- 5+ years of hands-on experience building and deploying machine learning models, particularly in Computer Vision or document understanding.
- Strong proficiency in Python for machine learning and data processing.
- Hands-on experience with modern ML frameworks such as PyTorch and libraries in the Hugging Face ecosystem.
- Experience with computer vision tooling such as OpenCV.
- Experience with common ML and data science libraries such as scikit-learn, NumPy, and Pandas.
- Experience developing models for tasks such as segmentation, object detection, or document analysis.
- Experience working with large image datasets and building training pipelines.
- Solid understanding of model evaluation, data preprocessing, and performance optimization.
- Strong problem-solving skills and ability to work in a fast-paced product environment.
- Ability to collaborate effectively with cross-functional engineering and product teams.
- The candidate should be based in India
- Willing to work remotely full-time
- Work schedule is Mon to Fri, 3:30am to 12:30pm IST
Preferred Qualifications:
- Experience with TensorFlow or other deep learning frameworks.
- Experience working with OCR pipelines or document analysis systems.
- Experience deploying machine learning models in production environments.
- Experience with containerized deployments such as Docker or Kubernetes.
- Experience working with complex technical documents, diagrams, or structured visual data.
- Familiarity with spatial or geometry-related data problems.
- Experience with libraries such as Detectron2, MMDetection, or similar.
- Familiarity with frameworks used to integrate modern AI models into applications (e.g., LangChain or similar tooling).
- Contributions to open-source ML or computer vision projects are a plus.
Additional Information:
- The problems we work on involve complex visual and document-based data, so we value engineers who enjoy tackling challenging technical problems and experimenting with different approaches to reach practical solutions.
- Candidates are required to include links to relevant projects, GitHub repositories, research work, or examples of machine learning systems they have built.
Benefits:
- Flexible remote work with career development opportunities
- Engagement with a supportive and collaborative global team
- Competitive, market-based salary
Dear Candidates,
We have an urgent requirement for a Technical Lead – Full Stack role based in Bangalore. Please find the details below:
Work Location (WFO):
Nagar, Bengaluru, Karnataka
Interview Process:
L1 Interview – Face-to-Face at Office
Experience Required:
4-6 Years (minimum 1+ years in a technical leadership role)
Role Overview:
The candidate will lead the technical vision and architecture of a compliance platform by designing scalable, secure, and high-performance systems. The role involves driving full-stack development across .NET and open-source technologies, enabling unified AI Agent capabilities, Single Sign-On (SSO), and a One-UI experience.
Key Responsibilities:
- Define and own end-to-end architecture including micro-frontends, .NET services, FastAPI APIs, and microservices
- Lead full-stack development using .NET and modern open-source technologies
- Modernize legacy systems (ASP.NET, .NET Core, MS SQL Server) to cloud-native architecture
- Design and implement AI Agents, SSO, and unified UI experiences
- Manage sprint planning, backlogs, and collaborate with Product Owners
- Implement CI/CD pipelines using Jenkins, GitHub Actions
- Drive containerization and orchestration using Docker & Kubernetes
- Ensure secure deployments and cloud infrastructure management
- Establish engineering best practices, code reviews, and architecture governance
- Mentor teams on Clean Architecture, SOLID principles, and DevOps practices
Required Skills:
- ReactJS, FastAPI, Python, REST/GraphQL
- ASP.NET, MVC, .NET Core, Entity Framework, MS SQL Server
- Strong experience in Microservices Architecture
- DevOps: CI/CD, Jenkins, GitOps, Docker, Kubernetes
- Cloud Platforms: AWS / Azure / GCP
- AI/ML & LLM tools: OpenAI, Llama, LangChain, etc.
- Security: RBAC, API security, secrets management
Qualifications:
- BE / BTech in Computer Science
Lead Cloud Reliability Engineer
Job Responsibilities
● Lead and manage the Cloud Reliability teams to provide strong Managed Services support to end-customers.
● Isolate, troubleshoot and resolve issues reported by CMS clients in their cloud environment
● Drive the communication with the customer providing details about the issue, current steps, next plan of action, ETA
● Gather client's requirements related to use of specific cloud services and provide assistance in setting them up and resolving issues
● Create SOPs and knowledge articles for use by the L1 teams to resolve common issues
● Identify recurring issues, perform root cause analysis and propose/implement preventive actions
● Follow change management procedure to identify, record and implement changes
● Plan and deploy OS, security patches in Windows/Linux environment and upgrade k8s clusters
● Identify the recurring manual activities and contribute to automation
● Provide technical guidance and educate team members on development and operations. Monitor metrics and develop ways to improve.
● System troubleshooting and problem-solving across platform and application domains. Ability to use a wide variety of open-source technologies and cloud services.
● Build, maintain, and monitor configuration standards.
● Ensure critical system security using best-in-class cloud security solutions.
Qualifications
● 4-7 years of experience in Cloud Infrastructure and Operations domains, with IT operational experience preferably in a global enterprise environment.
● Specialize in one or two cloud deployment platforms: AWS, GCP
● Hands on experience with AWS/GCP services (EKS, ECS, EC2, VPC, RDS, Lambda, GKE, Compute Engine)
● Understanding of one or more programming languages (Python, JavaScript, Ruby, Java, .Net)
● Logging and Monitoring tools (ELK, Stackdriver, CloudWatch)
● Knowledge of configuration management tools such as Ansible, Terraform, Puppet, Chef
● Experience working with deployment and orchestration technologies (such as Docker, Kubernetes, Mesos)
● Good analytical, communication, problem solving, and learning skills.
● Knowledge of programming against cloud platforms such as Google Cloud Platform, and of lean development methodologies.
● Strong service attitude and a commitment to quality.
● Willingness to work in shifts.
- Bachelor’s degree in computer science, Web Development, or a related field (or equivalent practical experience).
- Minimum of 4 to 8 years of professional experience in Java development.
- Strong proficiency in Java and object-oriented programming.
- Minimum of 4 years of experience in building microservices with Spring Boot.
- Solid understanding of RESTful APIs and experience with API design and integration.
- Strong problem-solving skills and the ability to think critically.
Job Title: Senior Java Architect (12+ Years Experience)
Location: Remote (2 PM - 11 PM IST)
Experience: 12+ Years
Salary: ₹15L - ₹21L/yr
Employment Type: Contract (1 Year Extendable)
Job Description:
We are looking for a highly experienced Java Architect to join our team on a long-term contract basis. The ideal candidate should have deep expertise in designing scalable enterprise applications using Java and microservices architecture. The candidate should be capable of driving architecture decisions, mentoring development teams, and delivering high-performance solutions for enterprise-grade systems.
Key Responsibilities:
- Design and architect scalable enterprise applications using Java and microservices
- Lead system design and architecture decisions for complex applications
- Develop and implement microservices architecture patterns
- Drive technical architecture across multiple development teams
- Mentor and guide senior developers and engineering teams
- Handle high-traffic, scalable enterprise application architecture
- Collaborate with stakeholders to define technical requirements and roadmaps
- Ensure system performance, scalability, and reliability
- Review code and architecture designs for best practices
- Work with Spring Boot, Spring Cloud, and modern Java frameworks
Required Skills & Qualifications:
- 12+ years of hands-on experience in Java development and architecture
- Strong expertise in microservices architecture and design patterns
- Deep knowledge of system design principles and enterprise architecture
- Hands-on experience with Spring Boot and Spring Cloud
- Experience designing scalable, high-performance enterprise applications
- Proficiency in RESTful APIs, messaging systems, and API gateways
- Strong understanding of cloud platforms (AWS, Azure, or GCP)
- Experience with containerization (Docker) and orchestration (Kubernetes)
- Knowledge of database design (SQL and NoSQL)
- Expertise in JVM tuning and performance optimization
Technical Skills:
- Java 11+ (Java 17/21 preferred)
- Spring Boot, Spring Cloud, Spring Security
- Microservices architecture and design patterns
- RESTful APIs, GraphQL, gRPC
- Apache Kafka, RabbitMQ, or similar messaging systems
- Docker, Kubernetes, CI/CD pipelines
- PostgreSQL, MySQL, MongoDB, or similar databases
- Redis, Elasticsearch, or caching solutions
- Maven, Gradle, Git
Additional Requirements:
- Ability to work 2 PM - 11 PM IST (US/Europe Shift Timing)
- Immediate availability or short notice period (15 days max)
- Strong problem-solving and analytical skills
- Excellent communication skills for stakeholder collaboration
- Experience mentoring technical teams
- Contract commitment for a minimum of 1 year (extendable)
Good to Have (Preferred Skills):
- Experience with reactive programming (WebFlux, Project Reactor)
- Cloud certification (AWS Solutions Architect, etc.)
- Experience with observability tools (Prometheus, Grafana)
- Knowledge of domain-driven design (DDD)
- Experience with multi-cloud or hybrid cloud architectures
- Freelance/contract experience (preferred)
What We Offer:
- 1.2 ~ 1.8 LPM fixed contract salary
- Long-term contract (1-year extendable)
- Remote work with a flexible 2-11 PM IST schedule
- Work with cutting-edge enterprise technologies
- Opportunity to architect large-scale systems
- Collaborate with experienced engineering teams
- Immediate start for the right candidates
Work mode: WFO, 5 days
Location: Hyderabad (Onsite)
Experience: 7+ years
- K8s Hands-on experience
- Linux Troubleshooting Skills
- Experience on OnPrem Servers and Management
- Helm
- Docker
- Ingress and Ingress Controllers
- Networking Basics
- Proficient Communication
Must-Have Skills:
- Hands-on experience with air-gapped Kubernetes clusters, ideally in regulated industries (finance, healthcare, etc.).
- Strong expertise in CI/CD pipelines, programmable infrastructure, and automation.
- Proficiency in Linux troubleshooting, observability (Prometheus, Grafana, ELK), and multi-region disaster recovery.
- Security & compliance knowledge for regulated industries.
- Preferred: Experience with GKE, RKE, Rook-Ceph, and certifications like CKA, CKAD.
Who You Are
- A Kubernetes expert who thrives on scalability, automation, and security.
- Passionate about optimizing infrastructure, CI/CD, and high-availability systems.
- Comfortable troubleshooting Linux, improving observability, and ensuring disaster recovery readiness.
- A problem solver who simplifies complexity and drives cloud-native adoption.
What You’ll Do
- Architect & automate Kubernetes solutions for air-gapped and multi-region clusters.
- Optimize CI/CD pipelines & cloud-native deployments.
- Work with open-source projects, selecting the right tools for the job.
- Educate & guide teams on modern cloud-native infrastructure best practices.
- Solve real-world scaling, security, and infrastructure automation challenges.
Why Join Us?
- Work on high-impact Kubernetes projects in regulated industries.
- Solve real-world automation & infrastructure challenges with cutting-edge tools.
- Grow in a team that values learning, open-source contributions, and innovation.
About the role
We are looking for talented Senior Backend Engineers (5+ years of experience) to join our team and take ownership of different parts of our stack. You will be working alongside a team of Engineers locally and directly with the U.S. Engineering team on all aspects of product/application development. You will leverage your experiences and abilities to inform decisions across product development and technology. You will help us build the foundation of our 2nd Headquarters in Pune: its culture, its processes, and its practices. There are a ton of interesting problems to solve, so come hungry. If your colleagues describe you as curious, driven, kind, and creative you are a culture fit.
What Success Looks Like
- You write, review and ship code in production. Your employer or client's success depends on the software you build
- You use Generative AI tools on a daily basis to enhance the quality and efficacy of your software and non-software deliverables
- You are a self-starter and enjoy working with minimal supervision
- You evaluate and make technical architecture decisions with a long-term view, optimizing for speed, quality, and safety
- You take pride in the product you create and the code that you write
- Your team can rely on you to get them out of a sticky situation in production
- You can work well on a team of sales executives, designers and engineers in an in-person environment
- You are passionate about the enterprise software development lifecycle and feel strongly about improving it
- You are a first principles engineer who exercises curiosity about the technologies you work with
- You can learn quickly about technologies, software and code that you are not familiar with, often from rudimentary documentation
- You take ownership of the code that you write, and you help the team operate with everything that you build, throughout its lifecycle
- You communicate openly and solicit feedback on important decisions, keeping the team aligned on your rationale
- You exercise an optimistic mindset and are willing to go the extra mile to make things work
Areas of Ownership
Our hiring process is designed for you to demonstrate a generalist set of capabilities, with a specialization in Backend Technologies.
Required Technical Experience (MUST HAVE):
- Expertise in Python
- Deep hands-on experience with Terraform
- Proficiency in Kubernetes
- Experience with cloud platforms (GCP strongly preferred, AWS/Azure acceptable)
Additional experience with some of the following:
- Backend Frameworks and Technologies (Node.js, NuxtJS, Express.js)
- Programming languages (JavaScript, TypeScript, Java, C++, Go)
- RPCs (REST, gRPC or GraphQL)
- Databases (SQL, NoSQL, Postgres, MongoDB, or Firebase)
- CI/CD (Jenkins, CircleCI, GitLab or similar)
- Source code versioning tools such as Git or Perforce
- Microservices architecture
Ways to stand out
- Familiarity with AI Platforms
- Extensive experience with building enterprise-scale applications with >99% SLAs
- Deep expertise across the full required stack: Python, Terraform, Kubernetes, and GCP
You'll Get...
- Competitive Salary
- Medical Insurance Benefits
- Employer Provident Fund contributions with Gratuity after 5 years of service
- Company-sponsored US onsite trips for high performers, based on business requirements
- Potential international transfer support for top performers, based on business requirements
- Technology (hardware, software, trainings, etc.) equipment and/or allowance
- The opportunity to re-shape an entire industry
- Beautiful office environment
- Meal allowance and/or food provision on site
Culture
Who we are: Our Co-Founder and CTO is a Serial Gen AI Inventor who grew up in Pune, India, is a BITS Pilani graduate, and worked at NVIDIA's Pune office for 6 years. There, he was promoted 5 times in 6 years and was transferred to the NVIDIA Headquarters in Santa Clara, California. After making significant contributions to NVIDIA, he proceeded to attend Harvard for his dual Masters in Engineering and MBA from HBS. Our other Co-Founder/CEO is a successful Serial Entrepreneur who has built multiple companies. As a team, we work very hard, have a curious mind-set, and believe in a low-ego high output approach.
Virtual Hiring Drive: Site Reliability Engineer (SRE)
Date: 25th April 2026, Saturday (Single-Day Drive)
Mode: 100% Virtual - All interview rounds on the same day
Experience: 3 to 7 Years
Note : We are looking for quick joiners who can join us within 30 days.
About the Role
We are looking for a Site Reliability Engineer who understands the realities of running production systems at scale. If building reliable, scalable, and observable systems excites you, you'll enjoy working with us.
At One2N, we solve One-to-N problems where proof of concept is already built and the real challenge lies in scalability, maintainability, performance, and reliability.
You will work closely with startups and mid-sized clients, helping them architect production-grade infrastructure and observability systems.
Key Responsibilities
- Design and build platform engineering solutions with a self-serve model
- Architect and optimize observability systems (metrics, logs, traces)
- Implement monitoring, logging, alerting & dashboards
- Build and optimize CI/CD pipelines
- Automate repetitive operational and infrastructure tasks (IaC-first approach)
- Improve Developer Experience (DX)
- Guide teams on SRE best practices & on-call processes
- Participate in code reviews and mentor engineers
- Contribute to cloud-native and platform engineering initiatives
Must-Have Skills
- 3-7 years of experience in DevOps / SRE / Platform Engineering
- Strong hands-on experience with Kubernetes on AWS
- Expertise in observability tools like Datadog / Honeycomb / ELK / Grafana / Prometheus
- Experience with Docker & Microservices architecture
- Infrastructure as Code using Terraform / Pulumi
- Strong Linux troubleshooting skills
- Programming knowledge in Golang / Python / Java
- Automation & scripting expertise
● 6+ years of hands-on development experience and in-depth knowledge of Java, Spring, Spring Boot, and Quarkus; front-end technologies such as Angular or React JS are nice to have
● Excellent Engineering skills in designing and implementing scalable solutions
● Good knowledge of CI/CD pipelines, with a strong focus on TDD
● Strong communication skills and ownership
● Exposure to Cloud, Kubernetes, Docker, Microservices is highly desired.
● Experience working in public cloud environments such as AWS, Azure, or GCP, covering solution development, deployment, and adoption of cloud-based technology components such as IaaS / PaaS offerings
● Proficiency in PL/SQL and database development.
● Strong in J2EE and OOP design patterns.
Dear Candidates,
Experience: 3+ years
Notice period: Immediate to 7 days
Location: Bangalore, Chennai
Work week: 5 days
Job Description
Function: Software Engineering → Full-Stack Development
Fintech/BFSI domain experience.
- React.js
- Node.js
- AWS
Requirements:
- Mandatory skills: strong experience in React JS, Node JS, and AWS, with 3+ years of relevant experience on current projects.
- Expertise with at least one object-oriented JavaScript framework (React, Angular, Ember, Dojo, Node, etc.).
- Good to have hands-on experience in Python development.
- Proficiency with Object Oriented Programming, multi-threading, data serialization, and REST API to connect applications to back-end services.
- Proficiency in Docker, Kubernetes (k8s), Jenkins, and GitHub Actions is essential for this role.
- Proven cloud development experience on AWS.
- Understanding of IT life cycle methodology and processes.
- Experience understanding and leading enterprise platforms/solutions.
- Experience working with microservices/service-oriented architecture frameworks.
- Good understanding of middleware technologies.
- Possess expertise in at least one unit testing framework.
- Education: B.E./B.Tech/MCA/M.Sc. required; an undergraduate degree alone is not sufficient.
Must-Have Skills:
- Hands-on experience with air-gapped Kubernetes clusters, ideally in regulated industries (finance, healthcare, etc.).
- Strong expertise in CI/CD pipelines, programmable infrastructure, and automation.
- Proficiency in Linux troubleshooting, observability (Prometheus, Grafana, ELK), and multi-region disaster recovery.
- Security & compliance knowledge for regulated industries.
- Preferred: Experience with GKE, RKE, and Rook-Ceph, and certifications such as CKA or CKAD.
Who You Are
- A Kubernetes expert who thrives on scalability, automation, and security.
- Passionate about optimizing infrastructure, CI/CD, and high-availability systems.
- Comfortable troubleshooting Linux, improving observability, and ensuring disaster recovery readiness.
- A problem solver who simplifies complexity and drives cloud-native adoption.
What You’ll Do
- Architect & automate Kubernetes solutions for air-gapped and multi-region clusters.
- Optimize CI/CD pipelines & cloud-native deployments.
- Work with open-source projects, selecting the right tools for the job.
- Educate & guide teams on modern cloud-native infrastructure best practices.
- Solve real-world scaling, security, and infrastructure automation challenges.
Why Join Us?
- Work on high-impact Kubernetes projects in regulated industries.
- Solve real-world automation & infrastructure challenges with cutting-edge tools.
- Grow in a team that values learning, open-source contributions, and innovation.
We are looking for an experienced Backend Engineer with strong expertise in Kubernetes internals, control-plane development, and Golang-based microservices. If you enjoy building scalable infrastructure components and working with modern cloud-native technologies, this role is for you.
Key Responsibilities
- Design and implement Kubernetes controllers/operators and define new Custom Resource Definitions (CRDs).
- Develop REST and gRPC APIs for platform components and services.
- Contribute to Kubernetes management plane development and enhancements.
- Build and maintain microservices using Golang (1–2 years of hands-on experience required).
- Work with virtualization and container platforms such as KubeVirt, OpenShift, and similar technologies.
- Collaborate with engineering teams to design scalable infrastructure solutions.
- Use AI-assisted development tools such as GitHub Copilot or Augment (nice to have).
Required Skills
- Strong experience with Kubernetes internals, especially controller/operator implementation.
- Experience defining and managing CRDs.
- Hands-on development experience with Golang microservices.
- Experience building and consuming REST and gRPC APIs.
- Exposure to virtualization technologies (KubeVirt, OpenShift, etc.).
- Prior work on Kubernetes management plane components or similar distributed systems.
Good to Have
- Experience working with AI-driven development tools.
- Exposure to cloud-native environments, containers, and orchestration technologies.