Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
- Improvement of monitoring systems
- Creation and support of development automation processes (CI / CD)
- Participation in infrastructure development
- Detection of problems in architecture and proposing of solutions for solving them
- Creation of tasks for system improvements for system scalability, performance and monitoring
- Analysis of product requirements in the aspect of devops
- Managing a team of DevOps, control of task deliveries
- Incident analysis and fixing
Technology stack
Linux, Bash, Salt/Ansible, LXC, libvirt, IPsec, VXLAN, Open vSwitch, OpenVPN, OSPF, BIRD, Cisco NX-OS, Multicast, PIM, LVM, software RAID, LUKS, PostgreSQL, nginx, haproxy, Prometheus, Grafana, Zabbix, GitLab, Capistrano
Skills and Experience
- Understanding of the distributed systems principles
- Understanding of principles for building a resistant network infrastructure
- Experience of Ubuntu Linux administration (Debian-like will be a plus)
- Strong knowledge of Bash
- Experience of working with LXC-containers
- Understanding and experience with infrastructure as a code approach
- Experience of development idempotent Ansible roles
- Experience with relational databases (PostgeSQL), ability to create simple SQL queries
- Experience with git
- Experience with monitoring and metric collect systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
Preferred experience
- Experience of working with highload zero-downtown environments
- Experience of coding on Python
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge and experience of working with network equipment Cisco
- Experience of working with Cisco NX-OS
- Knowledge of principles of multicast protocols IGMP, PIM
- Experience of setting multicast on Cisco equipment
- Experience of working with Solarflare Onload
- Experience administering Atlassian products

Similar jobs
Location: Bangalore
Experience: 2–5 years
Type: Full-time | On-site
Open Roles: 1
Start: Immediate
Why this role exists
Most engineering teams choose between speed and stability.
We need both.
Today:
- Deployments carry risk
- Cloud costs are higher than they should be
- Compliance is reactive, not built-in
This role exists to build a platform where:
- We can deploy fast without breaking production
- We can scale without runaway cost
- We can pass enterprise InfoSec reviews without firefighting
What you’ll do
You will not just manage infrastructure.
You will build the platform that engineering runs on.
1. Drive cloud cost efficiency
- Reduce Azure compute spend by 40%
- Implement:
- Reserved Instances / savings plans
- Right-sizing of workloads
- Scheduling for non-critical workloads
- Continuously monitor and optimize cost vs performance
2. Build zero-downtime deployment systems
- Ship a deployment pipeline that supports:
- 5+ production deployments per week
- Zero customer-visible downtime
- Implement:
- Blue-green / canary deployments
- Automated health checks
- Safe rollout strategies
3. Enable fast and safe releases
- Reduce time-to-launch significantly
- Ensure:
- High reliability in every release
- Ability to rollback instantly if something breaks
- Create systems where:
- Scaling up is seamless when things go right
- Failures are contained when they don’t
4. Build disaster recovery and compliance readiness
- Create DR/BCP systems that pass enterprise audits from:
- HDFC Life, SBI Life
- Ensure:
- Backup and recovery processes are defined and tested
- Failover strategies are documented and executable
- Build compliance as part of the system, not an afterthought
5. Embed security into the pipeline
- Integrate:
- SAST (Static Application Security Testing)
- DAST (Dynamic Application Security Testing)
- SCA (Software Composition Analysis)
- Secret scanning
- Container scanning
- IaC scanning
- Ensure vulnerabilities are caught before deployment
6. Enforce policy-as-code
- Implement:
- OPA / Gatekeeper
- Azure Policy
- Prevent non-compliant infrastructure from being deployed
- Ensure consistency across environments
7. Build a scalable platform layer
- Create systems that:
- Support increasing deployment frequency
- Maintain reliability under scale
- Work closely with backend and SRE teams to:
- Improve system stability
- Reduce operational overhead
What success looks like
- Cloud costs reduce by ≥ 40%
- Deployments are:
- Frequent
- Safe
- Invisible to customers
- Rollbacks are instant and reliable
- DR/BCP passes enterprise audits in the first attempt
- Security is embedded in the pipeline, not patched later
- Engineering teams ship faster with confidence
Who you are
- You have 2-5 years of experience in DevOps / Platform Engineering
- You have worked with:
- Cloud platforms (Azure preferred)
- CI/CD systems
- Infrastructure as Code
- You think in:
- Systems
- Trade-offs (speed vs reliability vs cost)
- You are comfortable owning:
- Production infrastructure
- Deployment systems
What will make you stand out
- Experience with:
- High-frequency deployment systems
- Cost optimization at scale
- Security-first pipelines
- Strong understanding of:
- Kubernetes / container orchestration
- Monitoring and observability
- Distributed system reliability
- Experience passing enterprise security/compliance audits
Why join
- You will define how engineering ships and scales
- Your work directly impacts:
- Reliability
- Cost
- Deployment velocity
- You will build a platform that moves from:
- Fragile → predictable and scalable
What this role is not
- Not manual infra management
- Not reactive firefighting
- Not limited to CI/CD maintenance
What this role is
- A builder of deployment systems
- A driver of cost efficiency
- A guardian of reliability and compliance
One question to self-evaluate
Can you build a platform where we deploy faster, spend less, and never break production?
∙Need 8+ years of experience in Devops CICD
∙Managing large-scale AWS deployments using Infrastructure as Code (IaC) and k8s developer tools
∙Managing build/test/deployment of very large-scale systems, bridging between developers and live stacks
∙Actively troubleshoot issues that arise during development and production
∙Owning, learning, and deploying SW in support of customer-facing applications
∙Help establish DevOps best practices
∙Actively work to reduce system costs
∙Work with open-source technologies, helping to ensure robustness and secureness of said technologies
∙Actively work with CI/CD, GIT and other component parts of the build and deployment system
∙Leading skills with AWS cloud stack
∙Proven implementation experience with Infrastructure as Code (Terraform, Terragrunt, Flux, Helm charts)
at scale
∙Proven experience with Kubernetes at scale
∙Proven experience with cloud management tools beyond AWS console (k9s, lens)
∙Strong communicator who people want to work with – must be thought of as the ultimate collaborator
∙Solid team player
∙Strong experience with Linux-based infrastructures and AWS
∙Strong experience with databases such as MySQL, Redshift, Elasticsearch, Mongo, and others
∙Strong knowledge of JavaScript, GIT
∙Agile practitioner
Key Responsibilities:
- Design, implement, and maintain scalable, secure, and cost-effective infrastructure on AWS and Azure
- Set up and manage CI/CD pipelines for smooth code integration and delivery using tools like GitHub Actions, Bitbucket Runners, AWS Code build/deploy, Azure DevOps, etc.
- Containerize applications using Docker and manage orchestration with Kubernetes, ECS, Fargate, AWS EKS, Azure AKS.
- Manage and monitor production deployments to ensure high availability and performance
- Implement and manage CDN solutions using AWS CloudFront and Azure Front Door for optimal content delivery and latency reduction
- Define and apply caching strategies at application, CDN, and reverse proxy layers for performance and scalability
- Set up and manage reverse proxies and Cloudflare WAF to ensure application security and performance
- Implement infrastructure as code (IaC) using Terraform, CloudFormation, or ARM templates
- Administer and optimize databases (RDS, PostgreSQL, MySQL, etc.) including backups, scaling, and monitoring
- Configure and maintain VPCs, subnets, routing, VPNs, and security groups for secure and isolated network setups
- Implement monitoring, logging, and alerting using tools like CloudWatch, Grafana, ELK, or Azure Monitor
- Collaborate with development and QA teams to align infrastructure with application needs
- Troubleshoot infrastructure and deployment issues efficiently and proactively
- Ensure cloud cost optimization and usage tracking
Required Skills & Experience:
- 3-4 years of hands-on experience in a DevOps
- Strong expertise with both AWS and Azure cloud platforms
- Proficient in Git, branching strategies, and pull request workflows
- Deep understanding of CI/CD concepts and experience with pipeline tools
- Proficiency in Docker, container orchestration (Kubernetes, ECS/EKS/AKS)
- Good knowledge of relational databases and experience in managing DB backups, performance, and migrations
- Experience with networking concepts including VPC, subnets, firewalls, VPNs, etc.
- Experience with Infrastructure as Code tools (Terraform preferred)
- Strong working knowledge of CDN technologies: AWS CloudFront and Azure Front Door
- Understanding of caching strategies: edge caching, browser caching, API caching, and reverse proxy-level caching
- Experience with Cloudflare WAF, reverse proxy setups, SSL termination, and rate-limiting
- Familiarity with Linux system administration, scripting (Bash, Python), and automation tools
- Working knowledge of monitoring and logging tools
- Strong troubleshooting and problem-solving skills
Good to Have (Bonus Points):
- Experience with serverless architecture (e.g., AWS Lambda, Azure Functions)
- Exposure to cost monitoring tools like CloudHealth, Azure Cost Management
- Experience with compliance/security best practices (SOC2, ISO, etc.)
- Familiarity with Service Mesh (Istio, Linkerd) and API gateways
- Knowledge of Secrets Management tools (e.g., HashiCorp Vault, AWS Secrets Manager)
About GradRight
Our vision is to be the world’s leading Ed-Fin Tech company dedicated to making higher education accessible and affordable to all. Our mission is to drive transparency and accountability in the global higher education sector and create significant impact using the power of technology, data science and collaboration.
GradRight is the world’s first SaaS ecosystem that brings together students, universities and financial institutions in an integrated manner. It enables students to find and fund high return college education, universities to engage and select the best-fit students and banks to lend in an effective and efficient manner.
In the last three years, we have enabled students to get the best deals on a $ 2.8+ Billion of loan requests and facilitated disbursements of more than $ 350+ Million in loans. GradRight won the HSBC Fintech Innovation Challenge supported by the Ministry of Electronics & IT, Government of India & was among the top 7 global finalists in The PIEoneer awards, UK.
GradRight’s team possesses extensive domestic and international experience in the launch and scale-up of premier higher education institutions. It is led by alumni of IIT Delhi, BITS Pilani, IIT Roorkee, ISB Hyderabad and University of Pennsylvania. GradRight is a Delaware, USA registered company with a wholly owned subsidiary in India.
About the Role
We are looking for a passionate DevOps Engineer with hands-on experience in AWS cloud infrastructure, containerization, and orchestration. The ideal candidate will be responsible for building, automating, and maintaining scalable cloud solutions, ensuring smooth CI/CD pipelines, and supporting development and operations teams.
Core Responsibilities
Design, implement, and manage scalable, secure, and highly available infrastructure on AWS.
Build and maintain CI/CD pipelines using tools like Jenkins, GitLab CI/CD, or GitHub Actions.
Containerize applications using Docker and manage deployments with Kubernetes (EKS, self-managed, or other distributions).
Monitor system performance, availability, and security using tools like CloudWatch, Prometheus, Grafana, ELK/EFK stack.
Collaborate with development teams to optimize application performance and deployment processes.
Required Skills & Experience
3–4 years of professional experience as a DevOps Engineer or similar role.
Strong expertise in AWS services (EC2, S3, RDS, Lambda, VPC, IAM, CloudWatch, EKS, etc.).
Hands-on experience with Docker and Kubernetes (EKS or self-hosted clusters).
Proficiency in CI/CD pipeline design and automation.
Experience with Infrastructure as Code (Terraform / AWS CloudFormation).
Solid understanding of Linux/Unix systems and shell scripting.
Knowledge of monitoring, logging, and alerting tools.
Familiarity with networking concepts (DNS, Load Balancing, Security Groups, Firewalls).
Basic programming/scripting experience in Python, Bash, or Go.
Nice to Have
Exposure to microservices architecture and service mesh (Istio/Linkerd).
Knowledge of serverless (AWS Lambda, API Gateway).
Primary Skills:
Linux – Ubuntu Administration, Git, Gerrit, Jenkins Administration, Cloud services (Preferred AWS) Apache, Ansible, Python, Postgresql, Rabbit MQ, CloudWatch AWS, CFT in AWS
Additional Skills Required:
- Should have experience working with Jenkins, Git, Gerrit
- Should have Good understanding of AWS Security and execution.
- Should have Good python skills
- Should have experience of working with GIT, Gerrit, Jira, Confluence,
- Exposure to messaging systems Rabbit MQ
- Exposure to Html, Groovy, Javascript, shell scripting
- Exposure to Kibana, Provisioning, capacity planning and performance analysis at various levels
- Exposure to Android skills.
- Should have experience in working with cloud-native architecture.
- Experience with log stash and elastic search
- Expert in Full Stack design technique as well as experience working across large environments with multiple operating systems/infrastructure for large-scale programs
- May be recognized as a leader in Agile and cultivating teams working in Agile frameworks
- Strong understanding of techniques such as Continuous Integration, Continuous Delivery, Test Driven Development, Cloud Development, resiliency, security
- Stays abreast of cutting edge technologies/trends and uses experience to influence application of those technologies/trends to support the business
- Experience on Modelling and Provisioning cloud infrastructure using AWS CloudFormation
Key Responsibilities:
- Perform a Technical Lead role for DevOPs development and support teams.
- Need to communicate & coordinate with both offshore and onsite teams
- Should translate business requirements into project plans and workable item/activities
- Have a thorough understanding of software development lifecycle and the ability to implement software following the structured approach.
- Need to perform in-depth technical reviews of project deliverables and ensure it should be defect free (minimize post release defects).
- Understand the current applications and technical architecture and improvise them as needed.
- Stay abreast of new technologies, methods to optimize development process and latest SDKs, testing tools etc
· Strong knowledge on Windows and Linux
· Experience working in Version Control Systems like git
· Hands-on experience in tools Docker, SonarQube, Ansible, Kubernetes, ELK.
· Basic understanding of SQL commands
· Experience working on Azure Cloud DevOps
This company is a network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their Cloud Offering, and Capitalize on Cloud. We have our own IOT/AI platform and we provide professional services on that platform to build custom clouds for their IOT devices. We also build mobile apps, run 24x7 devops/site reliability engineering for our clients.
We are looking for very hands-on SRE (Site Reliability Engineering) engineers with 3 to 6 years of experience. The person will be part of team that is responsible for designing & implementing automation from scratch for medium to large scale cloud infrastructure and providing 24x7 services to our North American / European customers. This also includes ensuring ~100% uptime for almost 50+ internal sites. The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ Years of hands-on experience troubleshooting/setting up of the Linux environment, who can write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ Years of hands-on experience setting up CICD from SCRATCH in Jenkins & Gitlab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Min 3 years of experience as SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2 or Build & Deploy..
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We dont believe in locking in people with large notice periods. You will stay here because you love the company. We have only a 15 days notice period.
Requirements
Cloud Software Engineer
Notice Period: 45 days / Immediate Joining
Banyan Data Services (BDS) is a US-based Infrastructure services Company, headquartered in San Jose, California, USA. It provides full-stack managed services to support business applications and data infrastructure. We do provide the data solutions and services on bare metal, On-prem, and all Cloud platforms. Our engagement service is built on the DevOps standard practice and SRE model.
We offer you an opportunity to join our rocket ship startup, run by a world-class executive team. We are looking for candidates that aspire to be a part of the cutting-edge solutions and services we offer, that address next-gen data evolution challenges. Candidates who are willing to use their experience in areas directly related to Infrastructure Services, Software as Service, and Cloud Services and create a niche in the market.
Roles and Responsibilities
· A wide variety of engineering projects including data visualization, web services, data engineering, web-portals, SDKs, and integrations in numerous languages, frameworks, and clouds platforms
· Apply continuous delivery practices to deliver high-quality software and value as early as possible.
· Work in collaborative teams to build new experiences
· Participate in the entire cycle of software consulting and delivery from ideation to deployment
· Integrating multiple software products across cloud and hybrid environments
· Developing processes and procedures for software applications migration to the cloud, as well as managed services in the cloud
· Migrating existing on-premises software applications to cloud leveraging a structured method and best practices
Desired Candidate Profile : *** freshers can also apply ***
· 2+years of experience with 1 or more development languages such as Java, Python, or Spark.
· 1 year + of experience with private/public/hybrid cloud model design, implementation, orchestration, and support.
· Certification or any training's completion of any one of the cloud environments like AWS, GCP, Azure, Oracle Cloud, and Digital Ocean.
· Strong problem-solvers who are comfortable in unfamiliar situations, and can view challenges through multiple perspectives
· Driven to develop technical skills for oneself and team-mates
· Hands-on experience with cloud computing and/or traditional enterprise datacentre technologies, i.e., network, compute, storage, and virtualization.
· Possess at least one cloud-related certification from AWS, Azure, or equivalent
· Ability to write high-quality, well-tested code and comfort with Object-Oriented or functional programming patterns
· Past experience quickly learning new languages and frameworks
· Ability to work with a high degree of autonomy and self-direction
http://www.banyandata.com" target="_blank">www.banyandata.com
As part of the engineering team, you would be expected to have
deep technology expertise with a passion for building highly scalable products.
This is a unique opportunity where you can impact the lives of people across 150+
countries!
Responsibilities
• Develop Collaborate in large-scale systems design discussions.
• Deploying and maintaining in-house/customer systems ensuring high availability,
performance and optimal cost.
• Automate build pipelines. Ensuring right architecture for CI/CD
• Work with engineering leaders to ensure cloud security
• Develop standard operating procedures for various facets of Infrastructure
services (CI/CD, Git Branching, SAST, Quality gates, Auto Scaling)
• Perform & automate regular backups of servers & databases. Ensure rollback and
restore capabilities are Realtime and with zero-downtime.
• Lead the entire DevOps charter for ONE Championship. Mentor other DevOps
engineers. Ensure industry standards are followed.
Requirements
• Overall 5+ years of experience in as DevOps Engineer/Site Reliability Engineer
• B.E/B.Tech in CS or equivalent streams from institute of repute
• Experience in Azure is a must. AWS experience is a plus
• Experience in Kubernetes, Docker, and containers
• Proficiency in developing and deploying fully automated environments using
Puppet/Ansible and Terraform
• Experience with monitoring tools like Nagios/Icinga, Prometheus, AlertManager,
Newrelic
• Good knowledge of source code control (git)
• Expertise in Continuous Integration and Continuous Deployment setup using Azure
Pipeline or Jenkins
• Strong experience in programming languages. Python is preferred
• Experience in scripting and unit testing
• Basic knowledge of SQL & NoSQL databases
• Strong Linux fundamentals
• Experience in SonarQube, Locust & Browserstack is a plus










