

- Work towards improving four verticals (scalability, availability, security, and cost) for the company's workflows and products.
- Help provision, manage, and optimize cloud infrastructure on AWS (IAM, EC2, RDS, CloudFront, S3, ECS, Lambda, ELK, etc.).
- Work with the development teams to design scalable, robust systems using cloud architecture for both 0-to-1 and 1-to-100 products.
- Drive technical initiatives and architectural service improvements.
- Be able to predict problems and implement solutions that detect and prevent outages.
- Mentor/manage a team of engineers.
- Design solutions with failure scenarios in mind to ensure reliability.
- Document rigorously to keep track of all changes and upgrades to the infrastructure, and share knowledge with the rest of the team.
- Identify vulnerabilities during development and give developers actionable information to remediate them.
- Automate the build and testing processes to consistently integrate code
- Manage changes to documents, software, images, large web sites, and other collections of code, configuration, and metadata among disparate teams

Job Description
Position Title: Senior System Engineer
Position Type: Full Time
Department: RSG
Reports to: First Level Manager, Indian Development Centre
Company Background:
Cglia is a software development company building highly available, highly secure, cloud-based enterprise software products that help speed the research process, resulting in new drugs, new devices, and new treatments to improve the health and wellbeing of the world's population.
At Cglia, our work shows our dedication and passion for innovative, quality software products that are intuitive and easy to use and that exceed customer expectations in every aspect.
Cglia is the place that develops world-class professionals who want to be innovative and creative, learn continuously, and build a solid foundation for creating products that are special and delight the customer.
Job Description:
The Senior System Engineer will have expertise in managing both Linux and Windows environments, along with hands-on experience in containerization technologies such as Kubernetes and Docker. Proficiency in Ansible for automation and configuration management is essential. This role is critical in ensuring the seamless operation, deployment, and maintenance of our IT infrastructure.
The ideal candidate will oversee and participate in the installation, monitoring, maintenance, support, optimization, and documentation of all network hardware and software. This includes managing multiple projects, planning network technology roadmaps, and configuring and optimizing network services, both internal and those integrated with Internet-based services.
Job Responsibilities:
· Manage, maintain, and monitor Linux and Windows servers to ensure high availability and performance.
· Perform system upgrades, patches, and performance tuning for both operating systems and DBA servers.
· Deploy, manage, and troubleshoot containerized applications using Kubernetes and Docker.
· Design and implement Kubernetes clusters to ensure scalability, security, and reliability.
· Develop and maintain Ansible playbooks for automation of repetitive tasks, configuration management, and system provisioning.
· Implement security best practices for both Linux and Windows environments.
· Set up and manage backup and disaster recovery solutions for critical systems and data.
· Work closely with development teams to support CI/CD pipelines and troubleshoot application issues.
· Manage VMware in a high availability environment with Disaster Recovery
· Good experience in RAID & Firewall
· Maintaining and managing SQL database server support
· Experience with scripting languages Unix/Shell, Bash or PowerShell
· Assist Quality Assurance with testing program changes, new releases or user documentation and support new product release activities that include testing customer flows
· Must have the ability to work a flexible schedule and is required to participate in on-call rotation, which includes different shift timings, weekends, and holidays
· Work across multiple time zones with remote team members
· Perform other duties as deemed necessary to provide quality service to the clients
Experience and Skills Required:
· Minimum of 4 years of experience in Linux and Windows administration
· 3 years of experience in VMware in a high availability environment with Disaster Recovery
· Good experience in RAID & Firewall
· 2+ years of experience in SQL database server support
· Ability to quickly acquire an in-depth knowledge of multiple custom applications
· Experience in setting up IT policies based on best practices and monitoring them
· Experience in shell scripting and automating tasks
· Experience in hardware and software monitoring tools
· Experience in administration and best practices for Apache and Tomcat
· Experience in handling Cisco router and firewall configurations and management
· Working knowledge of SQL Server, Oracle and other RDBMS databases
· Must be proactive and possess strong interpersonal, communication and organization skills
· Must possess excellent written and verbal presentation skills
· Must be self-motivated
· Certification in Linux/Windows administration is preferable.
Academics:
· Bachelor's / Master's degree (or equivalent) in computer science or related field or equivalent experience.
Please Apply - https://zrec.in/L51Qf?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adopt a DevOps culture in their organization by focusing on a long-term DevOps roadmap. We focus on identifying technical and cultural issues in the journey of successfully implementing DevOps practices in the organization and work with the respective teams to fix issues to increase overall productivity. We also run training sessions for developers to convey the importance of DevOps. We provide these services: DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We do assessments of technology architecture, security, governance, compliance, and DevOps maturity model for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer (Infrastructure/SRE)
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediately
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are looking for a Senior DevOps Engineer (Infrastructure) to design, automate, and manage cloud-based and datacentre infrastructure for diverse projects. The ideal candidate will have deep expertise in a public cloud platform (AWS, GCP, or Azure), with a strong focus on cost optimization, security best practices, and infrastructure automation using tools like Terraform and CI/CD pipelines.
This role involves designing scalable architectures (containers, serverless, and VMs), managing databases, and ensuring system observability with tools like Prometheus and Grafana. Strong leadership, client communication, and team mentoring skills are essential. Experience with VPN technologies and configuration management tools (Ansible, Helm) is also critical. Multi-cloud experience and familiarity with APM tools are a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as a DevOps engineer with a proven track record of architecting and automating solutions on Cloud
- Experience in troubleshooting production incidents and handling high-pressure situations.
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Extensive experience with Kubernetes, Terraform, ArgoCD, and Helm.
- Strong with at least one public cloud AWS/GCP/Azure
- Strong with Cost Optimization and Security Best practices
- Strong with Infrastructure automation using Terraform and CI/CD automation
- Strong with Configuration Management using Ansible, Helm etc
- Good with designing architectures (Containers, Serverless, VMs etc)
- Hands-on Experience working on Multiple Projects
- Strong with Client communication and requirements gathering
- Database management experience
- Good experience with Prometheus, Grafana & Alertmanager
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
- Proficiency in cloud networking, including VPCs, DNS, VPNs (OpenVPN, OpenSwan, Pritunl, Site-to-Site VPNs), load balancers, and firewalls, ensuring secure and efficient connectivity.
- Strong understanding of cloud security best practices, identity and access management (IAM), and compliance requirements for modern infrastructure.
Good to have
- Multi-cloud experience with AWS, GCP & Azure
- Experience with APM & Observability tools like New Relic, Datadog, and OpenTelemetry
- Proficiency in scripting languages (Python, Go) for automation and tooling to improve infrastructure and application reliability.
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Serve as a direct point of contact for multiple clients.
- Handle the unique technical needs and challenges of two or more clients concurrently.
- Balance direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.
Who you are
The possibility of having massive societal impact. Our software touches the lives of hundreds of millions of people.
Solving hard governance and societal challenges
Work directly with central and state government leaders and other dignitaries
Mentorship from world class people and rich ecosystems
Position : Architect or Technical Lead - DevOps
Location : Bangalore
Role:
Strong architecture and system design skills for multi-tenant, multi-region, redundant, and highly available mission-critical systems.
Clear understanding of core cloud platform technologies across public and private clouds.
Lead DevOps practices for Continuous Integration and Continuous Deployment pipeline and IT operations practices, scaling, metrics, as well as running day-to-day operations from development to production for the platform.
Implement non-functional requirements needed for world-class reliability, scale, security and cost-efficiency throughout the product development lifecycle.
Drive a culture of automation, and self-service enablement for developers.
Work with information security leadership and technical team to automate security into the platform and services.
Meet infrastructure SLAs and compliance control points of cloud platforms.
Define and contribute to initiatives to continually improve Solution Delivery processes.
Improve organisation’s capability to build, deliver and scale software as a service on the cloud
Interface with Engineering, Product and Technical Architecture group to meet joint objectives.
You are deeply motivated & have knowledge of solving hard to crack societal challenges with the stamina to see it all the way through.
You must have 7+ years of hands-on experience with DevOps, MSA and the Kubernetes platform.
You must have 2+ years of strong Kubernetes experience and must hold the CKA (Certified Kubernetes Administrator) certification.
You should be proficient in the tools and technologies involved in a microservices architecture deployed in a multi-cloud Kubernetes environment.
You should have hands-on experience in architecting and building CI/CD pipelines from code check-in until production.
You should have strong knowledge of Docker, Jenkins pipeline scripting, Helm charts, Kubernetes objects and manifests, and Prometheus, and demonstrate hands-on knowledge of multi-cloud computing, storage and networking.
You should have experience in designing, provisioning and administering infrastructure using best practices across multiple public cloud offerings (Azure, AWS, GCP) and/or private cloud offerings (OpenStack, VMware, Nutanix, NIC, bare metal infra).
You should have experience in setting up logging, monitoring Cloud application performance, error rates, and error budgets while tracking adherence to SLOs and SLAs.
Job Title: DevOps SDE III
Job Summary
Porter seeks an experienced cloud and DevOps engineer to join our infrastructure platform team. This team is responsible for the organization's cloud platform, CI/CD, and observability infrastructure. As part of this team, you will be responsible for providing a scalable, developer-friendly cloud environment by participating in the design, creation, and implementation of automated processes and architectures to achieve our vision of an ideal cloud platform.
Responsibilities and Duties
In this role, you will
- Own and operate our application stack and AWS infrastructure to orchestrate and manage our applications.
- Support our application teams using AWS by provisioning new infrastructure and contributing to the maintenance and enhancement of existing infrastructure.
- Build out and improve our observability infrastructure.
- Set up automated auditing processes and improve our applications' security posture.
- Participate in troubleshooting infrastructure issues and preparing root cause analysis reports.
- Develop and maintain our internal tooling and automation to manage the lifecycle of our applications, from provisioning to deployment, zero-downtime and canary updates, service discovery, container orchestration, and general operational health.
- Continuously improve our build pipelines, automated deployments, and automated testing.
- Propose, participate in, and document proof of concept projects to improve our infrastructure, security, and observability.
Qualifications and Skills
Hard requirements for this role:
- 5+ years of experience as a DevOps / Infrastructure engineer on AWS.
- Experience with Git, CI/CD, and Docker. (We use GitHub, GitHub Actions, Jenkins, ECS and Kubernetes).
- Experience in working with infrastructure as code (Terraform/CloudFormation).
- Linux and networking administration experience.
- Strong Linux Shell scripting experience.
- Experience with one programming language and cloud provider SDKs. (Python + boto3 is preferred)
- Experience with configuration management tools like Ansible and Packer.
- Experience with container orchestration tools. (Kubernetes/ECS).
- Database administration experience and the ability to write intermediate-level SQL queries. (We use Postgres)
- AWS SysOps administrator + Developer certification or equivalent knowledge
Good to have:
- Experience working with ELK stack.
- Experience supporting JVM applications.
- Experience working with APM tools. (We use Datadog)
- Experience working in a XaaC environment. (Packer, Ansible/Chef, Terraform/CloudFormation, Helm/Kustomize, Open Policy Agent/Sentinel)
- Experience working with security tools. (AWS Security Hub/Inspector/GuardDuty)
- Experience with JIRA/Jira help desk.
About Hive
Hive is the leading provider of cloud-based AI solutions for content understanding, trusted by the world’s largest, fastest growing, and most innovative organizations. The company empowers developers with a portfolio of best-in-class, pre-trained AI models, serving billions of customer API requests every month. Hive also offers turnkey software applications powered by proprietary AI models and datasets, enabling breakthrough use cases across industries. Together, Hive’s solutions are transforming content moderation, brand protection, sponsorship measurement, context-based ad targeting, and more.
Hive has raised over $120M in capital from leading investors, including General Catalyst, 8VC, Glynn Capital, Bain & Company, Visa Ventures, and others. We have over 250 employees globally in our San Francisco, Seattle, and Delhi offices. Please reach out if you are interested in joining the future of AI!
About Role
Our unique machine learning needs led us to open our own data centers, with an emphasis on distributed high performance computing integrating GPUs. Even with these data centers, we maintain a hybrid infrastructure with public clouds when they are the right fit. As we continue to commercialize our machine learning models, we also need to grow our DevOps and Site Reliability team to maintain the reliability of our enterprise SaaS offering for our customers. Our ideal candidate is someone who is able to thrive in an unstructured environment and takes automation seriously. You believe there is no task that can’t be automated and no server scale too large. You take pride in optimizing performance at scale in every part of the stack and never manually performing the same task twice.
Responsibilities
● Create tools and processes for deploying and managing hardware for Private Cloud Infrastructure.
● Improve workflows of developer, data, and machine learning teams
● Manage integration and deployment tooling
● Create and maintain monitoring and alerting tools and dashboards for various services, and audit infrastructure
● Manage a diverse array of technology platforms, following best practices and procedures
● Participate in on-call rotation and root cause analysis
Requirements
● 5 to 10 years of previous experience working directly with Software Engineering teams as a developer, DevOps Engineer, or Site Reliability Engineer.
● Experience with infrastructure as a service, distributed systems, and software design at a high-level.
● Comfortable working on Linux infrastructures (Debian) via the CLI
● Able to learn quickly in a fast-paced environment.
● Able to debug, optimize, and automate routine tasks
● Able to multitask, prioritize, and manage time efficiently and independently
● Can communicate effectively across teams and management levels
● Degree in computer science, or similar, is an added plus!
Technology Stack
● Operating Systems - Linux/Debian Family/Ubuntu
● Configuration Management - Chef
● Containerization - Docker
● Container Orchestrators - Mesosphere/Kubernetes
● Scripting Languages - Python/Ruby/Node/Bash
● CI/CD Tools - Jenkins
● Network hardware - Arista/Cisco/Fortinet
● Hardware - HP/SuperMicro
● Storage - Ceph, S3
● Database - Scylla, Postgres, Pivotal GreenPlum
● Message Brokers: RabbitMQ
● Logging/Search - ELK Stack
● AWS: VPC/EC2/IAM/S3
● Networking: TCP/IP, ICMP, SSH, DNS, HTTP, SSL/TLS, Storage systems, RAID, distributed file systems, NFS/iSCSI/CIFS
Who we are
We are a group of ambitious individuals who are passionate about creating a revolutionary AI company. At Hive, you will have a steep learning curve and an opportunity to contribute to one of the fastest growing AI start-ups in San Francisco. The work you do here will have a noticeable and direct impact on the development of the company.
Thank you for your interest in Hive and we hope to meet you soon.
What you will do:
- Handling Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups and monitoring
- Logging, metrics and alerting management
- Creating Dockerfiles
- Performing root cause analysis for production errors
What you need to have:
- 12+ years of experience in Software Development/ QA/ Software Deployment with 5+ years of experience in managing high performing teams
- Proficiency in VMware, AWS & cloud applications development, deployment
- Good knowledge in Java, Node.js
- Experience working with RESTful APIs, JSON etc
- Experience with Unit/ Functional automation is a plus
- Experience with MySQL, MongoDB, Redis, RabbitMQ
- Proficiency in Jenkins, Ansible, Terraform/Chef/Ant
- Proficiency in Linux based Operating Systems
- Proficiency in cloud infrastructure tooling such as Docker and Kubernetes
- Strong problem solving and analytical skills
- Good written and oral communication skills
- Sound understanding in areas of Computer Science such as algorithms, data structures, object oriented design, databases
- Proficiency in monitoring and observability
We are looking for a DevOps Engineer to manage the interchange of data between the server and the users. Your primary responsibility will be the development of all server-side logic, definition and maintenance of the central database, and ensuring high performance and responsiveness to requests from the frontend. You will also be responsible for integrating the front-end elements built by your co-workers into the application. Therefore, a basic understanding of frontend technologies is necessary as well.
What we are looking for
- Must have strong knowledge of Kubernetes and Helm 3
- Should have previous experience in Dockerizing the applications.
- Should be able to automate manual tasks using Shell or Python
- Should have good working knowledge of AWS and GCP clouds
- Should have previous experience working on Bitbucket, Github, or any other VCS.
- Must be able to write Jenkins Pipelines and have working knowledge of GitOps and ArgoCD.
- Have hands-on experience in proactive monitoring using tools like New Relic, Prometheus, Grafana, Fluent Bit, etc.
- Should have a good understanding of ELK Stack.
- Exposure to Jira, Confluence, and sprints.
What you will do:
- Mentor junior DevOps engineers and raise the team’s bar
- Primary owner of tech best practices, tech processes, DevOps initiatives, and timelines
- Oversight of all server environments, from Dev through Production.
- Responsible for the automation and configuration management
- Provides stable environments for quality delivery
- Assist with day-to-day issue management.
- Take lead in containerising microservices
- Develop deployment strategies that allow DevOps engineers to successfully deploy code in any environment.
- Enables the automation of CI/CD
- Implement dashboards to monitor various systems and applications
- 1-3 years of experience in DevOps
- Experience in setting up front end best practices
- Working in high growth startups
- Ownership and Be Proactive.
- Mentorship & upskilling mindset.
What you’ll get:
- Health Benefits
- Innovation-driven culture
- Smart and fun team to work with
- Friends for life
Exposure to development and implementation practices in a modern systems environment together with exposure to working in a project team particularly with reference to industry methodologies, e.g. Agile, continuous delivery, etc
- At least 3-5 years of experience building and maintaining AWS infrastructure (VPC, EC2, Security Groups, IAM, ECS, CodeDeploy, CloudFront, S3)
- Strong understanding of how to secure AWS environments and meet compliance requirements
- Experience using DevOps methodology and Infrastructure as Code
- Automation / CI/CD tools – Bitbucket Pipelines, Jenkins
- Infrastructure as code – Terraform, CloudFormation, etc
- Strong experience deploying and managing infrastructure with Terraform
- Automated provisioning and configuration management – Ansible, Chef, Puppet
- Experience with Docker, GitHub, Jenkins, ELK and deploying applications on AWS
- Improve CI/CD processes, support software builds and CI/CD of the development departments
- Develop, maintain, and optimize automated deployment code for development, test, staging and production environments
• At least 4 years of hands-on experience with cloud infrastructure on GCP
• Hands-on experience with Kubernetes is mandatory
• Exposure to configuration management and orchestration tools at scale (e.g. Terraform, Ansible, Packer)
• Knowledge of and hands-on experience with DevOps tools (e.g. Jenkins, Groovy, and Gradle)
• Knowledge of and hands-on experience with various platforms (e.g. GitLab, CircleCI and Spinnaker)
• Familiarity with monitoring and alerting tools (e.g. CloudWatch, ELK stack, Prometheus)
• Proven ability to work independently or as an integral member of a team
Preferable Skills:
• Familiarity with standard IT security practices such as encryption, credentials and key management.
• Proven experience with various coding languages (Java, Python) to support DevOps operation and cloud transformation
• Familiarity and knowledge of the web standards (e.g. REST APIs, web security mechanisms)
• Hands on experience with GCP
• Experience in performance tuning, services outage management and troubleshooting.
Attributes:
• Good verbal and written communication skills
• Exceptional leadership, time management, and organizational skills
• Ability to operate independently and make decisions with little direct supervision

