Engineering Leader- Cloud Infrastructure
at Our client company is into Computer software. (YB1)
- Lead, inspire, and influence to make sure your team is successful
- Partner with the recruiting team to attract and retain high-quality and diverse talent
- Establish great rapport with other development teams, Product Managers, Sales, and Customer Success to maintain high levels of visibility, efficiency, and collaboration
- Ensure teams have appropriate technical direction, leadership, and balance between short-term impact and long-term architectural vision.
- Occasionally contributing to development tasks such as coding and feature verifications to assist teams with release commitments, to gain an understanding of the deeply technical product as well as to keep your technical acumen sharp
You'll need:
- BS/MS degree in CS-or- a related field with 5+ years of engineering management experience leading productive, high-functioning teams
- Strong fundamentals in distributed systems design and development
- Ability to hire while ensuring a high hiring bar, keep engineers motivated, coach/mentor, and handle performance management
- Experience running production services in Public Clouds such as AWS, GCP, and Azure
- Experience with running large stateful data systems in the Cloud
- Prior knowledge of Cloud architecture and implementation features (multi-tenancy, containerization, orchestration, elastic scalability)
- A great track record of shipping features and hitting deadlines consistently; should be able to move fast, build in increments and iterate; have a sense of urgency, aggressive mindset towards achieving results and excellent prioritization skills; able to anticipate future technical needs for the product and craft plans to realize them
- Ability to influence the team, peers, and upper management using effective communication and collaborative techniques; focused on building and maintaining a culture of collaboration within the team
Similar jobs
Please Apply - https://zrec.in/7EYKe?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adapt DevOps culture in their organization by focusing on long-term DevOps roadmap. We focus on identifying technical and cultural issues in the journey of successfully implementing the DevOps practices in the organization and work with respective teams to fix issues to increase overall productivity. We also do training sessions for the developers and make them realize the importance of DevOps. We provide these services - DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We do assessments of technology architecture, security, governance, compliance, and DevOps maturity model for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer / SRE
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediately
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems. The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability. This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP, Azure), along with scripting skills in Python or Go. The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (Datadog, New Relic, OpenTelemetry).
Strong communication, incident management skills, and a collaborative approach are essential. Experience in team leadership and multi-client engagement is a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as an SRE and DevOps with a proven track record of handling large-scale production environments
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong Hands-on experience with managing Large Scale Production Systems
- Strong Production Troubleshooting Skills and handling high-pressure situations.
- Strong Experience with Databases (PostgreSQL, MongoDB, ElasticSearch, Kafka)
- Worked on making production systems more Scalable, Highly Available and Fault-tolerant
- Hands-on experience with ELK or other logging and observability tools
- Hands-on experience with Prometheus, Grafana & Alertmanager and on-call processes like Pagerduty
- Problem-Solving Mindset
- Strong with skills - K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure etc
- Good with Python/Go Scripting Automation
- Strong with fundamentals like DNS, Networking, Linux
- Experience with APM tools like - Newrelic, Datadog, OpenTelemetry
- Good experience with Incident Response, Incident Management, Writing detailed RCAs
- Experience with Applications best practices in making apps more reliable and fault-tolerant
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
Good to have
- Team-leading Experience
- Multiple Client Handling
- Requirements gathering from clients
- Good Communication
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Will serve as a direct point of contact for multiple clients.
- Able to handle the unique technical needs and challenges of two or more clients concurrently.
- Involve both direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.
LogiNext is looking for a technically savvy and passionate Principal DevOps Engineer or Senior Database Administrator to cater to the development and operations efforts in product. You will choose and deploy tools and technologies to build and support a robust infrastructure.
You have hands-on experience in building secure, high-performing and scalable infrastructure. You have experience to automate and streamline the development operations and processes. You are a master in troubleshooting and resolving issues in dev, staging and production environments.
Responsibilities:
Design and implement scalable infrastructure for delivering and running web, mobile and big data applications on cloud Scale and optimise a variety of SQL and NoSQL databases (especially MongoDB), web servers, application frameworks, caches, and distributed messaging systems Automate the deployment and configuration of the virtualized infrastructure and the entire software stack Plan, implement and maintain robust backup and restoration policies ensuring low RTO and RPO Support several Linux servers running our SaaS platform stack on AWS, Azure, IBM Cloud, Ali Cloud Define and build processes to identify performance bottlenecks and scaling pitfalls Manage robust monitoring and alerting infrastructure Explore new tools to improve development operations to automate daily tasks Ensure High Availability and Auto-failover with minimum or no manual interventions
Requirements:
Bachelor’s degree in Computer Science, Information Technology or a related field 8 to 10 years of experience in designing and maintaining high volume and scalable micro-services architecture on cloud infrastructure Strong background in Linux/Unix Administration and Python/Shell Scripting Extensive experience working with cloud platforms like AWS (EC2, ELB, S3, Auto-scaling, VPC, Lambda), GCP, Azure Experience in deployment automation, Continuous Integration and Continuous Deployment (Jenkins, Maven, Puppet, Chef, GitLab) and monitoring tools like Zabbix, Cloud Watch Monitoring, Nagios Knowledge of Java Virtual Machines, Apache Tomcat, Nginx, Apache Kafka, Microservices architecture, Caching mechanisms Experience in query analysis, peformance tuning, database redesigning, Experience in enterprise application development, maintenance and operations Knowledge of best practices and IT operations in an always-up, always-available service Excellent written and oral communication skills, judgment and decision-making skills
Only apply on this link - https://loginext.hire.trakstar.com/jobs/fk025uh?source=" target="_blank">https://loginext.hire.trakstar.com/jobs/fk025uh?source=
LogiNext is looking for a technically savvy and passionate Associate Vice President - Product Engineering - DevOps or Senior Database Administrator to cater to the development and operations efforts in product. You will choose and deploy tools and technologies to build and support a robust infrastructure.
You have hands-on experience in building secure, high-performing and scalable infrastructure. You have experience to automate and streamline the development operations and processes. You are a master in troubleshooting and resolving issues in dev, staging and production environments.
Responsibilities:
- Design and implement scalable infrastructure for delivering and running web, mobile and big data applications on cloud
- Scale and optimise a variety of SQL and NoSQL databases (especially MongoDB), web servers, application frameworks, caches, and distributed messaging systems
- Automate the deployment and configuration of the virtualized infrastructure and the entire software stack
- Plan, implement and maintain robust backup and restoration policies ensuring low RTO and RPO
- Support several Linux servers running our SaaS platform stack on AWS, Azure, IBM Cloud, Ali Cloud
- Define and build processes to identify performance bottlenecks and scaling pitfalls
- Manage robust monitoring and alerting infrastructure
- Explore new tools to improve development operations to automate daily tasks
- Ensure High Availability and Auto-failover with minimum or no manual interventions
Requirements:
- Bachelor’s degree in Computer Science, Information Technology or a related field
- 11 to 14 years of experience in designing and maintaining high volume and scalable micro-services architecture on cloud infrastructure
- Strong background in Linux/Unix Administration and Python/Shell Scripting
- Extensive experience working with cloud platforms like AWS (EC2, ELB, S3, Auto-scaling, VPC, Lambda), GCP, Azure
- Experience in deployment automation, Continuous Integration and Continuous Deployment (Jenkins, Maven, Puppet, Chef, GitLab) and monitoring tools like Zabbix, Cloud Watch Monitoring, Nagios
- Knowledge of Java Virtual Machines, Apache Tomcat, Nginx, Apache Kafka, Microservices architecture, Caching mechanisms
- Experience in query analysis, peformance tuning, database redesigning,
- Experience in enterprise application development, maintenance and operations
- Knowledge of best practices and IT operations in an always-up, always-available service
- Excellent written and oral communication skills, judgment and decision-making skills.
- Excellent leadership skill.
- Seeking an Individual carrying around 5+ yrs of experience.
- Must have skills - Jenkins, Groovy, Ansible, Shell Scripting, Python, Linux Admin
- Terraform, AWS deep knowledge to automate and provision EC2, EBS, SQL Server, cost optimization, CI/CD pipeline using Jenkins, Server less automation is plus.
- Excellent writing and communication skills in English. Enjoy writing crisp and understandable documentation
- Comfortable programming in one or more scripting languages
- Enjoys tinkering with tooling. Find easier ways to handle systems by doing some research. Strong awareness around build vs buy.
- 2+ years work experience in a DevOps or similar role
- Knowledge of OO programming and concepts (Java, C++, C#, Python)
- A drive towards automating repetitive tasks (e.g., scripting via Bash, Python, etc)
- Fluency in one or more scripting languages such as Python or Ruby.
- Familiarity with Microservice-based architectures
- Practical experience with Docker containerization and clustering (Kubernetes/ECS)
- In-depth, hands-on experience with Linux, networking, server, and cloud architectures.
- Experience with CI/CD tools Azure DevOps, AWS cloud formation, Lamda functions, Jenkins, and Ansible
- Experience with AWS, Azure, or another cloud PaaS provider.
- Solid understanding of configuration, deployment, management, and maintenance of large cloud-hosted systems; including auto-scaling, monitoring, performance tuning, troubleshooting, and disaster recovery
- Proficiency with source control, continuous integration, and testing pipelines
- Effective communication skills
Job Responsibilities:
- Deploy and maintain critical applications on cloud-native microservices architecture.
- Implement automation, effective monitoring, and infrastructure-as-code.
- Deploy and maintain CI/CD pipelines across multiple environments.
- Streamline the software development lifecycle by identifying pain points and productivity barriers and determining ways to resolve them.
- Analyze how customers are using the platform and help drive continuous improvement.
- Support and work alongside a cross-functional engineering team on the latest technologies.
- Iterate on best practices to increase the quality & velocity of deployments.
- Sustain and improve the process of knowledge sharing throughout the engineering team
- Identification and prioritization of technical debt that risks instability or creates wasteful operational toil.
- Own daily operational goals with the team.
Araali Networks is seeking a highly driven DevOps engineer to help streamline the release and deployment process, while assuring high quality.
Responsibilities:
Use best practices to streamline the release process
Use IaC methods to manage cloud infrastructure
Use test automation for release qualification
Skills and Qualifications:
Hands-on experience in managing release pipelines
Good understanding of public cloud infrastructure
Working knowledge of python test frameworks like pyunit, and automation tools like Selenium, Cucumber etc
Strong analytical and debugging skills
Bachelor’s degree in Computer Science
About Araali:
Araali Networks is a SaaS based cybersecurity startup that has raised a total of $10M from well known investors like A Capital, Firebolt Ventures and SV Angels, and and through a strategic investment by a publicly traded security company.
The company is disrupting the cloud firewall market by auto-creating nano-perimeters around every cloud app. The Araali solution enables developers to focus on features and improves the security posture through simplification and automation.
The security controls are embedded at the time of DevOps. The precision of Araali controls also helps with security operations where alerts are precise and intelligently routed to the right app team, making it actionable in real-time.
- AWS Cloud, CICD, Serverless setups, Monitoring Setup
- Performance setup, scalability in hands experience, Linux expertise, DevOps Operations
- AWS Cloud, CICD, Serverless setups, Monitoring Setup, Performance setup, scalability in hands experience, Linux expertise, DevOps Operations.
Job Description
We are looking to add DevOps Engineer to the Infra team.
Roles & Responsibilities
What you do :
- Developing automation for the various deployments core to our business
- Documenting run books for various processes / improving knowledge bases
- Identifying technical issues, communicating and recommending solutions
- Miscellaneous support (user account, VPN, network, etc)
- Develop continuous integration / deployment strategies
- Production systems deployment/monitoring/optimization
-
Management of staging/development environments
What you know :
- Ability to work with a wide variety of open source technologies and tools
- Ability to code/script (Python, Ruby, Bash)
- Experience with systems and IT operations
- Comfortable with frequent incremental code testing and deployment
- Strong grasp of automation tools (Chef, Packer, Ansible, or others)
- Experience with cloud infrastructure and bare-metal systems
- Experience optimizing infrastructure for high availability and low latencies
- Experience with instrumenting systems for monitoring and reporting purposes
- Well versed in software configuration management systems (git, others)
- Experience with cloud providers (AWS or other) and tailoring apps for cloud deployment
-
Data management skills
Education :
- Degree in Computer Engineering or Computer Science
- 1-3 years of equivalent experience in DevOps roles.
- Work conducted is focused on business outcomes
- Can work in an environment with a high level of autonomy (at the individual and team level)
-
Comfortable working in an open, collaborative environment, reaching across functional.
Our Offering :
- True start-up experience - no bureaucracy and a ton of tough decisions that have a real impact on the business from day one.
-
The camaraderie of an amazingly talented team that is working tirelessly to build a great OS for India and surrounding markets.
Perks :
- Awesome benefits, social gatherings, etc.
- Work with intelligent, fun, and interesting people in a dynamic start-up environment.
Eligibility
B.Tech/M.Tech/B.E
Company Introduction
Established in May 2015, Indus OS is a homebred system apps company, building India’s only content and commerce platform for users to discover and consume digital content & services in the language of their choice. With a vision of digitally connecting 1 Billion Indians, Indus OS is constantly striving to adapt its existing portfolio (App Store, Minus One Screen, Keyboard, Messenger, etc) by introducing new features to enrich the user experience in their native language.
Currently, Indus OS has a user base of over 12+ Million on the back of 10+ smartphone brand partnerships with leading OEMs such as Samsung, Gionee, iTel, Micromax, Intex, Karbonn, and others. The Indus platform is available in English & 23 Indian regional languages and is intended to digitally connect the next 1 billion people in the emerging markets.
Requirements
- Design, write and build tools to improve the reliability, latency, availability and scalability of HealthifyMe application.
- Communicate, collaborate and work effectively across distributed teams in a global environment
- Optimize performance and solve issues across the entire stack: hardware, software, application, and network.
- Experienced in building infrastructure with terraform / cloudformation or equivalent.
- Experience with ansible or equivalent is beneficial
- Ability to use a wide variety of Open Source Tools
- Experience with AWS is a must.
- Minimum 5 years of running services in a large scale environment.
- Expert level understanding of Linux servers, specifically RHEL/CentOS.
- Practical, proven knowledge of shell scripting and at least one higher-level language (eg. Python, Ruby, GoLang).
- Experience with source code and binary repositories, build tools, and CI/CD (Git, Artifactory, Jenkins, etc)
- Demonstrable knowledge of TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures.
Look forward to
- Working with a world-class team.
- Fun & work at the same place with an amazing work culture and flexible timings.
- Get ready to transform yourself into a health junkie
Join HealthifyMe and make history!