
- 3+ years experience leading a team of DevOps engineers
- 8+ years experience managing DevOps for large engineering teams developing cloud-native software
- Strong in networking concepts
- In-depth knowledge of AWS and cloud architectures/services.
- Experience within the container and container orchestration space (Docker, Kubernetes)
- Passion for CI/CD pipeline using tools such as Jenkins etc.
- Familiarity with config management tools like Ansible Terraform etc
- Proven record of measuring and improving DevOps metrics
- Familiarity with observability tools and experience setting them up
- Passion for building tools and productizing services that empower development teams.
- Excellent knowledge of Linux command-line tools and ability to write bash scripts.
- Strong in Unix / Linux administration and management,
KEY ROLES/RESPONSIBILITIES:
- Own and manage the entire cloud infrastructure
- Create the entire CI/CD pipeline to build and release
- Explore new technologies and tools and recommend those that best fit the team and organization
- Own and manage the site reliability
- Strong decision-making skills and metric-driven approach
- Mentor and coach other team members

Similar jobs
Objectives of this role
•Building and implementing new development tools and infrastructure
•Understanding the needs of stakeholders and conveying them to developers
•Working on ways to automate and improve development and release processes
•Testing and examining code written by others and analysing results
•Ensuring that systems are safe and secure against cybersecurity threats
•Identifying technical problems and developing software updates and fixes
•Working with software developers and software engineers to ensure that development follows established processes and works as intended
•Planning projects and being involved in project management decisions
Responsibilities:
• Set up CI/CD pipelines for automated deployment and delivery
•Setup and management of new and Existing cloud-based Kubernetes cluster services
•Write Ad/Hoc Bash/Python scripts to automate certain operational tasks.
•Designing, maintenance and management of tools for automation of different operational processes.
•Provision of critical system security by leveraging best practices and prolific cloud security solutions.
•System troubleshooting and problem resolution across various application domains and platforms
•Support/maintain development, UAT and production infrastructure.
•Providing recommendations for architecture and process improvements.
•Respond to L2 calls and emails.
•Help administer monitoring systems, alerting, log management, and other IT infrastructure systems.
•Perform root cause analysis of production errors and resolve technical issues
•Design procedures for system troubleshooting and maintenance
Technical Skill Requirements:
•Experience in a DevOps role in AWS/OCI cloud environment.
•Must have experience with CI/CD Pipelines and hands-on experience with DevOps tools such as, Jenkins, Git, Docker, Kubernetes, Ansible, etc.
•Strong knowledge in Terraform for multi-stack cloud infrastructure provisioning.
•Strong knowledge in OCI/AWS-based Kubernetes service management.
•Must have experience with Python/Bash as a scripting language.
•Good knowledge in software debugging, web applications and services (Apache, Nginx, HAProxy)
•Must have knowledge in monitoring setup with Prometheus, Alertmanager, Grafana, Thanos, Loki, Fluentbit, etc.
Good To Have Skills
•PostgreSQL, MySQL, MongoDB, Redis, Keycloak.
•Migrating application from one cloud to another; OCI certifications
•Test Driven Development
Soft Skill Requirements:
•Able to learn new skills and technology quickly.
•Energetic with amazing customer service skills and a team-oriented approach.
•Strong verbal and written communication skills
Skills We Require:- Dev Ops, AWS Admin, terraform, Infrastructure as a Code
SUMMARY:-
- Implement integrations requested by customers
- Deploy updates and fixes
- Provide Level 2 technical support
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Have good hands on experience on Dev Ops, AWS Admin, terraform, Infrastructure as a Code
Have knowledge on EC2, Lambda, S3, ELB, VPC, IAM, Cloud Watch, Centos, Server Hardening
Ability to understand business requirements and translate them into technical requirements
A knack for benchmarking and optimizationWe are seeking a skilled DevOps Engineer with 3+ years of experience to join our team on a permanent work-from-home basis.
Responsibilities:
- Develop and maintain infrastructure using Ansible.
- Write Ansible playbooks.
- Implement CI/CD pipelines.
- Manage GitLab repositories.
- Monitor and troubleshoot infrastructure issues.
- Ensure security and compliance.
- Document best practices.
Qualifications:
- Proven DevOps experience.
- Expertise with Ansible and CI/CD pipelines.
- Proficient with GitLab.
- Strong scripting skills.
- Excellent problem-solving and communication skills.
Regards,
Aishwarya M
Associate HR
The Key Responsibilities Include But Not Limited to:
Help identify and drive Speed, Performance, Scalability, and Reliability related optimization based on experience and learnings from the production incidents.
Work in an agile DevSecOps environment in creating, maintaining, monitoring, and automation of the overall solution-deployment.
Understand and explain the effect of product architecture decisions on systems.
Identify issues and/or opportunities for improvements that are common across multiple services/teams.
This role will require weekend deployments
Skills and Qualifications:
1. 3+ years of experience in a DevOps end-to-end development process with heavy focus on service monitoring and site reliability engineering work.
2. Advanced knowledge of programming/scripting languages (Bash, PERL, Python, Node.js).
3. Experience in Agile/SCRUM enterprise-scale software development including working with GiT, JIRA, Confluence, etc.
4. Advance experience with core microservice technology (RESTFul development).
5. Working knowledge of using Advance AI/ML tools are pluses.
6. Working knowledge in the one or more of the Cloud Services: Amazon AWS, Microsoft Azure
7. Bachelors or Master’s degree in Computer Science or equivalent related field experience
Key Behaviours / Attitudes:
Professional curiosity and a desire to a develop deep understanding of services and technologies.
Experience building & running systems to drive high availability, performance and operational improvements
Excellent written & oral communication skills; to ask pertinent questions, and to assess/aggregate/report the responses.
Ability to quickly grasp and analyze complex and rapidly changing systemsSoft skills
1. Self-motivated and self-managing.
2. Excellent communication / follow-up / time management skills.
3. Ability to fulfill role/duties independently within defined policies and procedures.
4. Ability to balance multi-task and multiple priorities while maintaining a high level of customer satisfaction is key.
5. Be able to work in an interrupt-driven environment.Work with Dori Ai world class technology to develop, implement, and support Dori's global infrastructure.
As a member of the IT organization, assist with the analyze of existing complex programs and formulate logic for new complex internal systems. Prepare flowcharting, perform coding, and test/debug programs. Develop conversion and system implementation plans. Recommend changes to development, maintenance, and system standards.
Leading contributor individually and as a team member, providing direction and mentoring to others. Work is non-routine and very complex, involving the application of advanced technical/business skills in a specialized area. BS or equivalent experience in programming on enterprise or department servers or systems.
Hiring for a funded fintech startup based out of Bangalore!!!
Our Ideal Candidate
We are looking for a Senior DevOps engineer to join the engineering team and help us automate the build, release, packaging and infrastructure provisioning and support processes. The candidate is expected to own the full life-cycle of provisioning, configuration management, monitoring, maintenance and support for cloud as well as on-premise deployments.
Requirements
- 5-plus years of DevOps experience managing the Big Data application stack including HDFS, YARN, Spark, Hive and Hbase
- Deeper understanding of all the configurations required for installing and maintaining the infrastructure in the long run
- Experience setting up high availability, configuring resource allocation, setting up capacity schedulers, handling data recovery tasks
- Experience with middle-layer technologies including web servers (httpd, ningx), application servers (Jboss, Tomcat) and database systems (postgres, mysql)
- Experience setting up enterprise security solutions including setting up active directories, firewalls, SSL certificates, Kerberos KDC servers, etc.
- Experience maintaining and hardening the infrastructure by regularly applying required security packages and patches
- Experience supporting on-premise solutions as well as on AWS cloud
- Experience working with and supporting Spark-based applications on YARN
- Experience with one or more automation tools such as Ansible, Teraform, etc
- Experience working with CI/CD tools like Jenkins and various test report and coverage plugins
- Experience defining and automating the build, versioning and release processes for complex enterprise products
- Experience supporting clients remotely and on-site
- Experience working with and supporting Java- and Python-based tech stacks would be a plus
Desired Non-technical Requirements
- Very strong communication skills both written and verbal
- Strong desire to work with start-ups
- Must be a team player
Job Perks
- Attractive variable compensation package
- Flexible working hours – everything is results-oriented
- Opportunity to work with an award-winning organization in the hottest space in tech – artificial intelligence and advanced machine learning
Engineering Leader, Cloud Infrastructure.
Bengaluru, Karnataka, India
Do you thrive on solving complex technical problems? Do you want to be at the cutting edge of technology? If so,we’re interested in speaking with you!
Your Impact:
We’re looking for a seasoned engineering leader in the Cloud team that is responsible for building, operating, and maintaining a customer-facing DBaaS service in multiple public clouds (AWS, GCP, and Azure). The service supports unified multiverse management of YugabyteDB, including fault-domain aware provisioning, rolling upgrades, security,
networking, monitoring, and day-2 operations (backups, scaling, billing etc). If you’re a strong leader who exemplifies collaboration, who is driven and thrive in a fast-paced startup environment, and who has a strong desire to build an internet-scale, extensible cloud based service with strong emphasis on simplicity and user experience, this job is for
you.
You Will:
Lead, inspire, and influence to make sure your team is successful
Partner with the recruiting team to attract and retain high-quality and diverse talent
Establish great rapport with other development teams, Product Managers, Sales and Customer Success tomaintain high levels of visibility, efficiency, and collaboration
Ensure teams have appropriate technical direction, leadership and balance between short-term impact andlong term architectural vision.
Occasionally contributing to development tasks such as coding and feature verifications to assist teamswith release commitments, to gain an understanding of the deeply technical product as well as to keepyour technical acumen sharp.
You'll need:
BS/MS degree in CS-or- a related field with 5+ years of engineering management experience leading productive, high-functioning teams
Strong fundamentals in distributed systems design and development
Ability to hire while ensuring a high hiring bar, keep engineers motivated, coach/mentor, and handle performance management
Experience running production services in Public Clouds such as AWS, GCP, and Azure
Experience with running large stateful data systems in the Cloud
Prior knowledge of Cloud architecture and implementation features (multi-tenancy, containerization,orchestration, elastic scalability)
A great track record of shipping features and hitting deadlines consistently; should be able to move fast,build in increments and iterate; have a sense of urgency, aggressive mindset towards achieving results and excellent prioritization skills; able to anticipate future technical needs for the product and craft plans to realize them
Ability to influence the team, peers, and upper management using effective communication and collaborative techniques; focused on building and maintaining a culture of collaboration within the team.
- Automate deployments of infrastructure components and repetitive tasks.
- Drive changes strictly via the infrastructure-as-code methodology.
- Promote the use of source control for all changes including application and system-level changes.
- Design & Implement self-recovering systems after failure events.
- Participate in system sizing and capacity planning of various components.
- Create and maintain technical documents such as installation/upgrade MOPs.
- Coordinate & collaborate with internal teams to facilitate installation & upgrades of systems.
- Support 24x7 availability for corporate sites & tools.
- Participate in rotating on-call schedules.
- Actively involved in researching, evaluating & selecting new tools & technologies.
- Cloud computing – AWS, OCI, OpenStack
- Automation/Configuration management tools such as Terraform & Chef
- Atlassian tools administration (JIRA, Confluence, Bamboo, Bitbucket)
- Scripting languages - Ruby, Python, Bash
- Systems administration experience – Linux (Redhat), Mac, Windows
- SCM systems - Git
- Build tools - Maven, Gradle, Ant, Make
- Networking concepts - TCP/IP, Load balancing, Firewall
- High-Availability, Redundancy & Failover concepts
- SQL scripting & queries - DML, DDL, stored procedures
- Decisive and ability to work under pressure
- Prioritizing workload and multi-tasking ability
- Excellent written and verbal communication skills
- Database systems – Postgres, Oracle, or other RDBMS
- Mac automation tools - JAMF or other
- Atlassian Datacenter products
- Project management skills
Qualifications
- 3+ years of hands-on experience in the field or related area
- Requires MS or BS in Computer Science or equivalent field
Should be open to embracing new technologies, keeping up with emerging tech.
Strong troubleshooting and problem-solving skills.
Willing to be part of a high-performance team, build mature products.
Should be able to take ownership and work under minimal supervision.
Strong Linux System Administration background (with minimum 2 years experience), responsible for handling/defining the organization infrastructure(Hybrid).
Working knowledge of MySQL databases, Nginx, and Haproxy Load Balancer.
Experience in CI/CD pipelines, Configuration Management (Ansible/Saltstack) & Cloud Technologies (AWS/Azure/GCP)
Hands-on experience in GitHub, Jenkins, Prometheus, Grafana, Nagios, and Open Sources tools.
Strong Shell & Python scripting would be a plus.
• At least 4 years of hands-on experience with cloud infrastructure on GCP
• Hands-on-Experience on Kubernetes is a mandate
• Exposure to configuration management and orchestration tools at scale (e.g. Terraform, Ansible, Packer)
• Knowledge and hand-on-experience in DevOps tools (e.g. Jenkins, Groovy, and Gradle)
• Knowledge and hand-on-experience on the various platforms (e.g. Gitlab, CircleCl and Spinnakar)
• Familiarity with monitoring and alerting tools (e.g. CloudWatch, ELK stack, Prometheus)
• Proven ability to work independently or as an integral member of a team
Preferable Skills:
• Familiarity with standard IT security practices such as encryption,
credentials and key management.
• Proven experience on various coding languages (Java, Python-) to
• support DevOps operation and cloud transformation
• Familiarity and knowledge of the web standards (e.g. REST APIs, web security mechanisms)
• Hands on experience with GCP
• Experience in performance tuning, services outage management and troubleshooting.
Attributes:
• Good verbal and written communication skills
• Exceptional leadership, time management, and organizational skill Ability to operate independently and make decisions with little direct supervision

Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
- Improvement of monitoring systems
- Creation and support of development automation processes (CI / CD)
- Participation in infrastructure development
- Detection of problems in architecture and proposing of solutions for solving them
- Creation of tasks for system improvements for system scalability, performance and monitoring
- Analysis of product requirements in the aspect of devops
- Managing a team of DevOps, control of task deliveries
- Incident analysis and fixing
Technology stack
Linux, Bash, Salt/Ansible, LXC, libvirt, IPsec, VXLAN, Open vSwitch, OpenVPN, OSPF, BIRD, Cisco NX-OS, Multicast, PIM, LVM, software RAID, LUKS, PostgreSQL, nginx, haproxy, Prometheus, Grafana, Zabbix, GitLab, Capistrano
Skills and Experience
- Understanding of the distributed systems principles
- Understanding of principles for building a resistant network infrastructure
- Experience of Ubuntu Linux administration (Debian-like will be a plus)
- Strong knowledge of Bash
- Experience of working with LXC-containers
- Understanding and experience with infrastructure as a code approach
- Experience of development idempotent Ansible roles
- Experience with relational databases (PostgeSQL), ability to create simple SQL queries
- Experience with git
- Experience with monitoring and metric collect systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
Preferred experience
- Experience of working with highload zero-downtown environments
- Experience of coding on Python
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge and experience of working with network equipment Cisco
- Experience of working with Cisco NX-OS
- Knowledge of principles of multicast protocols IGMP, PIM
- Experience of setting multicast on Cisco equipment
- Experience of working with Solarflare Onload
- Experience administering Atlassian products

