Engineering Leader, Cloud Infrastructure.
Bengaluru, Karnataka, India
Do you thrive on solving complex technical problems? Do you want to be at the cutting edge of technology? If so, we’re interested in speaking with you!
Your Impact:
We’re looking for a seasoned engineering leader in the Cloud team that is responsible for building, operating, and maintaining a customer-facing DBaaS service in multiple public clouds (AWS, GCP, and Azure). The service supports unified multiverse management of YugabyteDB, including fault-domain aware provisioning, rolling upgrades, security, networking, monitoring, and day-2 operations (backups, scaling, billing, etc.). If you’re a strong leader who exemplifies collaboration, who is driven and thrives in a fast-paced startup environment, and who has a strong desire to build an internet-scale, extensible cloud-based service with strong emphasis on simplicity and user experience, this job is for you.
You Will:
Lead, inspire, and influence to make sure your team is successful
Partner with the recruiting team to attract and retain high-quality and diverse talent
Establish great rapport with other development teams, Product Managers, Sales and Customer Success to maintain high levels of visibility, efficiency, and collaboration
Ensure teams have appropriate technical direction, leadership and balance between short-term impact and long-term architectural vision.
Occasionally contribute to development tasks such as coding and feature verifications to assist teams with release commitments, to gain an understanding of the deeply technical product as well as to keep your technical acumen sharp.
You'll need:
BS/MS degree in CS or a related field with 5+ years of engineering management experience leading productive, high-functioning teams
Strong fundamentals in distributed systems design and development
Ability to hire while ensuring a high hiring bar, keep engineers motivated, coach/mentor, and handle performance management
Experience running production services in Public Clouds such as AWS, GCP, and Azure
Experience with running large stateful data systems in the Cloud
Prior knowledge of Cloud architecture and implementation features (multi-tenancy, containerization, orchestration, elastic scalability)
A great track record of shipping features and hitting deadlines consistently; should be able to move fast, build in increments and iterate; have a sense of urgency, an aggressive mindset towards achieving results, and excellent prioritization skills; able to anticipate future technical needs for the product and craft plans to realize them.
Ability to influence the team, peers, and upper management using effective communication and collaborative techniques; focused on building and maintaining a culture of collaboration within the team.
What you’ll be doing at Novo:
● Systems thinking
● Creating best practices, templates, and automation for build, test, integration and deployment pipelines on multiple projects
● Designing and developing tools for easily creating and managing dev/test infrastructure and services in AWS cloud
● Providing expertise and guidance on CI/CD, GitHub, and other development tools via containerization
● Monitoring and supporting systems in Dev, UAT and production environments
● Building mock services and production-like data sources for use in development and testing
● Managing GitHub integrations, feature flag systems, code coverage tools, and other development & monitoring tools
● Participating in support rotations to help troubleshoot infrastructure issues
Stacks you eat every day (for DevOps Engineer)
● Creating and working with containers, as well as using container orchestration tools (Kubernetes / Docker)
● AWS: S3, EKS, EC2, RDS, Route53, VPC etc.
● Fair understanding of Linux
● Good knowledge of CI/CD: Jenkins / CircleCI / GitHub Actions
● Basic level of monitoring
● Support for deployment along with various web servers and Linux environments, both backend and frontend.
Bachelor's degree in Computer Science or a related field, or equivalent work experience
Strong understanding of cloud infrastructure and services, such as AWS, Azure, or Google Cloud Platform
Experience with infrastructure as code tools such as Terraform or CloudFormation
Proficiency in scripting languages such as Python, Bash, or PowerShell
Familiarity with DevOps methodologies and tools such as Git, Jenkins, or Ansible
Strong problem-solving and analytical skills
Excellent communication and collaboration skills
Ability to work independently and as part of a team
Willingness to learn new technologies and tools as required
Responsibilities:
- Design, implement, and maintain cloud infrastructure solutions on Microsoft Azure, with a focus on scalability, security, and cost optimization.
- Collaborate with development teams to streamline the deployment process, ensuring smooth and efficient delivery of software applications.
- Develop and maintain CI/CD pipelines using tools like Azure DevOps, Jenkins, or GitLab CI to automate build, test, and deployment processes.
- Utilize infrastructure-as-code (IaC) principles to create and manage infrastructure deployments using Terraform, ARM templates, or similar tools.
- Manage and monitor containerized applications using Azure Kubernetes Service (AKS) or other container orchestration platforms.
- Implement and maintain monitoring, logging, and alerting solutions for cloud-based infrastructure and applications.
- Troubleshoot and resolve infrastructure and deployment issues, working closely with development and operations teams.
- Ensure high availability, performance, and security of cloud infrastructure and applications.
- Stay up-to-date with the latest industry trends and best practices in cloud infrastructure, DevOps, and automation.
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience).
- Minimum of four years of proven experience working as a DevOps Engineer or similar role, with a focus on cloud infrastructure and deployment automation.
- Strong expertise in Microsoft Azure services, including but not limited to Azure Virtual Machines, Azure App Service, Azure Storage, Azure Networking, Azure Security, and Azure Monitor.
- Proficiency in infrastructure-as-code (IaC) tools such as Terraform or ARM templates.
- Hands-on experience with containerization and orchestration platforms, preferably Azure Kubernetes Service (AKS) or Docker Swarm.
- Solid understanding of CI/CD principles and experience with relevant tools such as Azure DevOps, Jenkins, or GitLab CI.
- Experience with scripting languages like PowerShell, Bash, or Python for automation tasks.
- Strong problem-solving and troubleshooting skills with a proactive and analytical mindset.
- Excellent communication and collaboration skills, with the ability to work effectively in a team environment.
- Azure certifications (e.g., Azure Administrator, Azure DevOps Engineer, Azure Solutions Architect) are a plus.
Job Responsibilities:
- Work on and deploy updates and fixes.
- Provide Level 2 technical support.
- Support implementation of fully automated CI/CD pipelines as per dev requirements.
- Follow the escalation process through issue completion, including providing documentation after resolution.
- Follow regular Operations procedures and complete all assigned tasks during the shift.
- Assist in root cause analysis of production issues and help write a report which includes details about the failure, the relevant log entries, and the likely root cause.
- Set up CI/CD frameworks (Jenkins / Azure DevOps Server), containerization using Docker, etc.
- Implement continuous testing, code quality, and security using DevOps tooling.
- Build a knowledge base by creating and updating documentation for support.
Skills Required:
DevOps, Linux, AWS, Ansible, Jenkins, Git, Terraform, CI, CD, CloudFormation, TypeScript
- Seeking an individual with around 5+ years of experience.
- Must-have skills: Jenkins, Groovy, Ansible, Shell Scripting, Python, Linux Admin
- Deep knowledge of Terraform and AWS to automate and provision EC2, EBS, and SQL Server; cost optimization; CI/CD pipelines using Jenkins. Serverless automation is a plus.
- Excellent writing and communication skills in English. Enjoys writing crisp and understandable documentation.
- Comfortable programming in one or more scripting languages
- Enjoys tinkering with tooling. Finds easier ways to handle systems by doing some research. Strong awareness of build vs. buy.
• Hands-on experience in Azure.
• Build and maintain CI/CD tools and pipelines.
• Designing and managing highly scalable, reliable, and fault-tolerant infrastructure & networking that forms the backbone of distributed systems at RARA Now.
• Continuously improve code quality, product execution, and customer delight.
• Communicate, collaborate and work effectively across distributed teams in a global environment.
• Operate to strengthen teams across their product with their knowledge base
• Contribute to improving team relatedness, and help build a culture of camaraderie.
• Continuously refactor applications to ensure high-quality design
• Pair with team members on functional and non-functional requirements and spread design philosophy and goals across the team
• Excellent Bash and scripting fundamentals, and hands-on experience scripting in programming languages such as Python, Ruby, Golang, etc.
• Good understanding of distributed system fundamentals and ability to troubleshoot issues in a larger distributed infrastructure
• Working knowledge of the TCP/IP stack, internet routing, and load balancing
• Basic understanding of cluster orchestrators and schedulers (Kubernetes)
• Deep knowledge of Linux as a production environment and of container technologies (e.g., Docker), Infrastructure as Code such as Terraform, and K8s administration at large scale.
• Have worked on production distributed systems and have an understanding of microservices architecture, RESTful services, and CI/CD.
- Development and maintenance of the Continuous Integration system on Jenkins.
- Build management for the planned major/minor releases
- Release process management and maintenance
- Enhancement and development of build/release system features.
Required Qualifications:
- 2 - 3 years relevant work experience in Jenkins / Scripting / C / Linux
- Expertise in scripting languages such as shell, Python, etc.
- Work experience in handling Make/CMake build systems
- Expertise in Git source revision control
- Experience with Yocto build systems and recipes
We are looking for an experienced DevOps engineer who will help our team establish DevOps practices. You will work closely with the technical lead to identify and establish these practices in the company. You will also help us build scalable, efficient cloud infrastructure. You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices.
Responsibilities
- Deployment, automation, management, and maintenance of production systems.
- Ensuring availability, performance, security, and scalability of production systems.
- Evaluation of new technology alternatives and vendor products.
- System troubleshooting and problem resolution across various application domains and platforms.
- Providing recommendations for architecture and process improvements.
- Definition and deployment of systems for metrics, logging, and monitoring on the AWS platform.
- Manage the establishment and configuration of SaaS infrastructure in an agile way by storing infrastructure as code and employing automated configuration management tools with a goal to be able to re-provision environments at any point in time.
- Be accountable for proper backup and disaster recovery procedures.
- Drive operational cost reductions through service optimizations and demand-based auto-scaling.
- Have on-call responsibilities.
- Perform root cause analysis for production errors.
- Use open source technologies and tools to accomplish specific use cases encountered within the project.
- Use coding languages or scripting methodologies to solve a problem with a custom workflow.
Requirements
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Prior experience as a software developer in a couple of high-level programming languages.
- Extensive experience in any JavaScript-based framework, since we will be deploying services to Node.js on AWS Lambda (Serverless).
- Strong Linux system administration background.
- Ability to present and communicate the architecture in a visual form.
- Strong knowledge of AWS (e.g. IAM, EC2, VPC, ELB, ALB, Autoscaling, Lambda, NAT gateway, DynamoDB)
- Experience maintaining and deploying highly-available, fault-tolerant systems at scale (~1 Lakh users a day)
- A drive towards automating repetitive tasks (e.g. scripting via Bash, Python, Ruby, etc)
- Expertise with Git
- Experience implementing CI/CD (e.g. Jenkins, TravisCI)
- Strong experience with databases such as MySQL, NoSQL, Elasticsearch, Redis and/or Mongo.
- Stellar troubleshooting skills with the ability to spot issues before they become problems.
- Current with industry trends, IT ops and industry best practices, and able to identify the ones we should implement.
- Time and project management skills, with the capability to prioritize and multitask as needed.