

Role: Senior Engineer, Infrastructure
Key Responsibilities:
● Infrastructure Development and Management: Design, implement, and manage robust and scalable infrastructure solutions, ensuring optimal performance, security, and availability. Lead transition and migration projects, moving legacy systems to cloud-based solutions.
● Develop and maintain applications and services using Golang.
● Automation and Optimization: Implement automation tools and frameworks to optimize operational processes. Monitor system performance, optimizing and modifying systems as necessary.
● Security and Compliance: Ensure infrastructure security by implementing industry best practices and compliance requirements. Respond to and mitigate security incidents and vulnerabilities.
Qualifications:
● Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
● Good understanding of prominent backend languages like Golang, Python, Node.js, or others.
● In-depth knowledge of network architecture, system security, and infrastructure scalability.
● Proficiency with development tools, server management, and database systems.
● Strong experience with cloud services (AWS), deployment, scaling, and management.
● Knowledge of Azure is a plus.
● Familiarity with containers and orchestration services, such as Docker, Kubernetes, etc.
● Strong problem-solving skills and analytical thinking.
● Excellent verbal and written communication skills.
● Ability to thrive in a collaborative team environment.
● Genuine passion for backend development and keen interest in scalable systems.

Job Description
Position Title: Senior System Engineer
Position Type: Full Time
Department: RSG
Reports to: First Level Manager, Indian Development Centre
Company Background:
Cglia is a software development company building highly available, highly secure, cloud-based enterprise software products that help speed the research process, resulting in new drugs, new devices, and new treatments to improve the health and wellbeing of the world's population.
At Cglia, our work shows our dedication and passion for innovative, quality software products that are intuitive and easy to use and exceed every aspect of customer expectations.
Cglia is the place that develops world-class professionals who would like to be innovative, creative, learn continuously, and build a solid foundation to build products that are special and delight the customer.
Job Description:
The Senior System Engineer will have expertise in managing both Linux and Windows environments, along with hands-on experience in containerization technologies such as Kubernetes and Docker. Proficiency in Ansible for automation and configuration management is essential. This role is critical in ensuring the seamless operation, deployment, and maintenance of our IT infrastructure.
The ideal candidate will oversee and participate in the installation, monitoring, maintenance, support, optimization, and documentation of all network hardware and software. This includes managing multiple projects, planning network technology roadmaps, and configuring and optimizing network services, both internal and those integrated with Internet-based services.
Job Responsibilities:
· Manage, maintain, and monitor Linux and Windows servers to ensure high availability and performance.
· Perform system upgrades, patches, and performance tuning for both operating systems and DBA servers.
· Deploy, manage, and troubleshoot containerized applications using Kubernetes and Docker.
· Design and implement Kubernetes clusters to ensure scalability, security, and reliability.
· Develop and maintain Ansible playbooks for automation of repetitive tasks, configuration management, and system provisioning.
· Implement security best practices for both Linux and Windows environments.
· Set up and manage backup and disaster recovery solutions for critical systems and data.
· Work closely with development teams to support CI/CD pipelines and troubleshoot application issues.
· Manage VMware in a high availability environment with Disaster Recovery
· Good experience in RAID & Firewall
· Maintaining and managing SQL database server support
· Experience with scripting languages Unix/Shell, Bash or PowerShell
· Assist Quality Assurance with testing program changes, new releases or user documentation and support new product release activities that include testing customer flows
· Must have the ability to work a flexible schedule and is required to participate in on-call rotation, which includes different shift timings, weekends, and holidays
· Work across multiple time zones with remote team members
· Perform other duties as deemed necessary to provide quality service to the clients
Experience and Skills Required:
· Minimum 4 years of experience in Linux and Windows administration
· 3 years of experience in VMware in a high availability environment with Disaster Recovery
· Good experience in RAID & Firewall
· 2+ years of experience in SQL database server support
· Ability to quickly acquire an in-depth knowledge of multiple custom applications
· Experience in setting up IT policies based on best practices and monitoring them
· Experience in shell scripting and automating tasks
· Experience in hardware and software monitoring tools
· Experience in administration and best practices for Apache and Tomcat
· Experience in handling Cisco router and firewall configurations and management
· Working knowledge of SQL Server, Oracle and other RDBMS databases
· Must be proactive and possess strong interpersonal, communication and organization skills
· Must possess excellent written and verbal presentation skills
· Must be self-motivated
· Certification in Linux/Windows administration is preferable.
Academics:
· Bachelor's / Master's degree (or equivalent) in computer science or related field or equivalent experience.
About the Company:
Gruve is an innovative Software Services startup dedicated to empowering Enterprise Customers in managing their Data Life Cycle. We specialize in Cyber Security, Customer Experience, Infrastructure, and advanced technologies such as Machine Learning and Artificial Intelligence. Our mission is to assist our customers in their business strategies, utilizing their data to make more intelligent decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.
Why Gruve:
At Gruve, we foster a culture of innovation, collaboration, and continuous learning. We are committed to building a diverse and inclusive workplace where everyone can thrive and contribute their best work. If you’re passionate about technology and eager to make an impact, we’d love to hear from you.
Gruve is an equal opportunity employer. We welcome applicants from all backgrounds and thank all who apply; however, only those selected for an interview will be contacted.
Position summary:
We are seeking a Staff Engineer – DevOps with 8-12 years of experience in designing, implementing, and optimizing CI/CD pipelines, cloud infrastructure, and automation frameworks. The ideal candidate will have expertise in Kubernetes, Terraform, CI/CD, Security, Observability, and Cloud Platforms (AWS, Azure, GCP). You will play a key role in scaling and securing our infrastructure, improving developer productivity, and ensuring high availability and performance.
Key Roles & Responsibilities:
- Design, implement, and maintain CI/CD pipelines using tools like Jenkins, GitLab CI/CD, ArgoCD, and Tekton.
- Deploy and manage Kubernetes clusters (EKS, AKS, GKE) and containerized workloads.
- Automate infrastructure provisioning using Terraform, Ansible, Pulumi, or CloudFormation.
- Implement observability and monitoring solutions using Prometheus, Grafana, ELK, OpenTelemetry, or Datadog.
- Ensure security best practices in DevOps, including IAM, secrets management, container security, and vulnerability scanning.
- Optimize cloud infrastructure (AWS, Azure, GCP) for performance, cost efficiency, and scalability.
- Develop and manage GitOps workflows and infrastructure-as-code (IaC) automation.
- Implement zero-downtime deployment strategies, including blue-green deployments, canary releases, and feature flags.
- Work closely with development teams to optimize build pipelines, reduce deployment time, and improve system reliability.
Basic Qualifications:
- A bachelor’s or master’s degree in computer science, electronics engineering or a related field
- 8-12 years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Automation.
- Strong expertise in CI/CD pipelines, version control (Git), and release automation.
- Hands-on experience with Kubernetes (EKS, AKS, GKE) and container orchestration.
- Proficiency in Terraform, Ansible for infrastructure automation.
- Experience with AWS, Azure, or GCP services (EC2, S3, IAM, VPC, Lambda, API Gateway, etc.).
- Expertise in monitoring/logging tools such as Prometheus, Grafana, ELK, OpenTelemetry, or Datadog.
- Strong scripting and automation skills in Python, Bash, or Go.
Preferred Qualifications
- Experience in FinOps (Cloud Cost Optimization) and Kubernetes cluster scaling.
- Exposure to serverless architectures and event-driven workflows.
- Contributions to open-source DevOps projects.
Skills We Require: DevOps, AWS Admin, Terraform, Infrastructure as Code
SUMMARY:
- Implement integrations requested by customers
- Deploy updates and fixes
- Provide Level 2 technical support
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Have good hands-on experience with DevOps, AWS Admin, Terraform, Infrastructure as Code
Have knowledge of EC2, Lambda, S3, ELB, VPC, IAM, CloudWatch, CentOS, Server Hardening
Ability to understand business requirements and translate them into technical requirements
A knack for benchmarking and optimization
What you will do
We are looking for an exceptional engineering lead to join our team. You will be responsible for building and owning the systems that will have a critical impact on the business and the experience of our community from day one.
- Build and lead an agile engineering team
- Work closely with Founder on product development
- Collaborate with operations team to understand customer pain points and solve interesting problems
- Code, test, ship - manage the entire application cycle
- Build libraries and documentation for future references
- Research and develop best practices and tools to enable delivery of features
- Set up capabilities to track and report business and user metrics
- Design and improve architecture to ensure scalability
Requirements
- Proven experience at scaling tech companies, preferably in commerce or social network
- Keen to innovate, open-minded and collaborative
- Able to interpret product needs and suggest appropriate solutions
- Have led a team, also able to code hands-on
- Strong communication skills
- Strong work ethic: responsible, responsive, and detail-oriented.
Technologies we use
Go, Flutter, AWS, Google Cloud
Main tasks
- Supervision of the CI/CD process for the automated builds and deployments of web services and web applications as well as desktop tools in the cloud and container environment
- Responsibility for the operations part of a DevOps organization, especially for development in the environment of container technology and orchestration, e.g. with Kubernetes
- Installation, operation and monitoring of web applications in cloud data centers for development and test purposes as well as for the operation of our own production cloud
- Implementation of installations of the solution especially in the container context
- Introduction, maintenance and improvement of installation solutions for development in the desktop and server environment as well as in the cloud and with on-premise Kubernetes
- Maintenance of the system installation documentation and implementation of trainings
- Execution of internal software tests and support of involved teams and stakeholders
- Hands on Experience with Azure DevOps.
Qualification profile
- Bachelor’s or master’s degree in communications engineering, electrical engineering, physics or comparable qualification
- Experience in software
- Installation and administration of Linux and Windows systems including network and firewalling aspects
- Experience with build and deployment automation with tools like Jenkins, Gradle, Argo, AnangoDB or similar as well as system scripting (Bash, PowerShell, etc.)
- Interest in operation and monitoring of applications in virtualized and containerized environments in cloud and on-premise
- Server environments, especially application, web-and database servers
- Knowledge of VMware/K3D/Rancher is an advantage
- Good spoken and written knowledge of English
About Hive
Hive is the leading provider of cloud-based AI solutions for content understanding, trusted by the world’s largest, fastest growing, and most innovative organizations. The company empowers developers with a portfolio of best-in-class, pre-trained AI models, serving billions of customer API requests every month. Hive also offers turnkey software applications powered by proprietary AI models and datasets, enabling breakthrough use cases across industries. Together, Hive’s solutions are transforming content moderation, brand protection, sponsorship measurement, context-based ad targeting, and more.
Hive has raised over $120M in capital from leading investors, including General Catalyst, 8VC, Glynn Capital, Bain & Company, Visa Ventures, and others. We have over 250 employees globally in our San Francisco, Seattle, and Delhi offices. Please reach out if you are interested in joining the future of AI!
About Role
Our unique machine learning needs led us to open our own data centers, with an emphasis on distributed high performance computing integrating GPUs. Even with these data centers, we maintain a hybrid infrastructure with public clouds when they are the right fit. As we continue to commercialize our machine learning models, we also need to grow our DevOps and Site Reliability team to maintain the reliability of our enterprise SaaS offering for our customers. Our ideal candidate is someone who is able to thrive in an unstructured environment and takes automation seriously. You believe there is no task that can’t be automated and no server scale too large. You take pride in optimizing performance at scale in every part of the stack and never manually performing the same task twice.
Responsibilities
● Create tools and processes for deploying and managing hardware for Private Cloud Infrastructure.
● Improve workflows of developer, data, and machine learning teams
● Manage integration and deployment tooling
● Create and maintain monitoring and alerting tools and dashboards for various services, and audit infrastructure
● Manage a diverse array of technology platforms, following best practices and procedures
● Participate in on-call rotation and root cause analysis
Requirements
● 5-10 years of previous experience working directly with Software Engineering teams as a developer, DevOps Engineer, or Site Reliability Engineer.
● Experience with infrastructure as a service, distributed systems, and software design at a high-level.
● Comfortable working on Linux infrastructures (Debian) via the CLI
● Able to learn quickly in a fast-paced environment.
● Able to debug, optimize, and automate routine tasks
● Able to multitask, prioritize, and manage time efficiently independently
● Can communicate effectively across teams and management levels
● Degree in computer science, or similar, is an added plus!
Technology Stack
● Operating Systems - Linux/Debian Family/Ubuntu
● Configuration Management - Chef
● Containerization - Docker
● Container Orchestrators - Mesosphere/Kubernetes
● Scripting Languages - Python/Ruby/Node/Bash
● CI/CD Tools - Jenkins
● Network hardware - Arista/Cisco/Fortinet
● Hardware - HP/SuperMicro
● Storage - Ceph, S3
● Database - Scylla, Postgres, Pivotal GreenPlum
● Message Brokers: RabbitMQ
● Logging/Search - ELK Stack
● AWS: VPC/EC2/IAM/S3
● Networking: TCP/IP, ICMP, SSH, DNS, HTTP, SSL/TLS, Storage systems, RAID, distributed file systems, NFS/iSCSI/CIFS
Who we are
We are a group of ambitious individuals who are passionate about creating a revolutionary AI company. At Hive, you will have a steep learning curve and an opportunity to contribute to one of the fastest growing AI start-ups in San Francisco. The work you do here will have a noticeable and direct impact on the development of the company.
Thank you for your interest in Hive and we hope to meet you soon.
Job Description:
Responsibilities
· Having E2E responsibility for Azure landscape of our customers
· Managing code releases and operational tasks within a global team with a focus on automation, maintainability, security and customer satisfaction
· Make use of the CI/CD framework to rapidly support lifecycle management of the platform
· Acting as L2-L3 support for incidents, problems and service requests
· Work with various Atos and 3rd party teams to resolve incidents and implement changes
· Implement and drive automation and self-healing solutions to reduce toil
· Improve error budgets through hands-on design and development of solutions that address reliability issues and/or risks
· Support ITSM processes and collaborate with service management representatives
Job Requirements
· Azure Associate certification or equivalent knowledge level
· 5+ years of professional experience
· Experience with Terraform and/or native Azure automation
· Knowledge of CI/CD concepts and toolset (e.g. Jenkins, Azure DevOps, Git)
· Must be adaptable to work in a varied, fast-paced, exciting, ever-changing environment
· Good analytical and problem-solving skills to resolve technical issues
· Understanding of Agile development and SCRUM concepts a plus
· Experience with Kubernetes architecture and tools a plus
Job Brief:
We are looking for candidates who have development experience and have worked on CI/CD-based projects. Candidates should have good hands-on experience with Jenkins Master-Slave architecture and have used AWS native services like CodeCommit, CodeBuild, CodeDeploy and CodePipeline. They should also have experience in setting up cross-platform CI/CD pipelines that span different cloud platforms, or on-premise and cloud platforms.
Job Location:
Pune.
Job Description:
- Hands on with AWS (Amazon Web Services) Cloud with DevOps services and CloudFormation.
- Experience interacting with customers.
- Excellent communication.
- Hands-on experience in creating and managing Jenkins jobs and Groovy scripting.
- Experience in setting up Cloud Agnostic and Cloud Native CI/CD Pipelines.
- Experience in Maven.
- Experience in scripting languages like Bash, PowerShell, Python.
- Experience in automation tools like Terraform, Ansible, Chef, Puppet.
- Excellent troubleshooting skills.
- Experience in Docker and Kubernetes, including creating Dockerfiles.
- Hands on with version control systems like GitHub, GitLab, TFS, Bitbucket, etc.
About the Company
Blue Sky Analytics is a Climate Tech startup that combines the power of AI & Satellite data to aid in the creation of a global environmental data stack. Our funders include Beenext and Rainmatter. Over the next 12 months, we aim to expand to 10 environmental data-sets spanning water, land, heat, and more!
We are looking for a DevOps Engineer who can help us build the infrastructure required to handle huge datasets at scale. Primarily, you will work with AWS services like EC2, Lambda, ECS, Containers, etc. As part of our core development crew, you’ll be figuring out how to deploy applications ensuring high availability and fault tolerance along with a monitoring solution that has alerts for multiple microservices and pipelines. Come save the planet with us!
Your Role
- Build applications at scale that can go up and down on command.
- Manage a cluster of microservices talking to each other.
- Build pipelines for huge data ingestion, processing, and dissemination.
- Optimize services for low cost and high efficiency.
- Maintain a highly available and scalable PSQL database cluster.
- Maintain alert and monitoring system using Prometheus, Grafana, and Elasticsearch.
Requirements
- 1-4 years of work experience.
- Strong emphasis on Infrastructure as Code - CloudFormation, Terraform, Ansible.
- CI/CD concepts and implementation using CodePipeline, GitHub Actions.
- Advanced command of AWS services like IAM, EC2, ECS, Lambda, S3, etc.
- Advanced Containerization - Docker, Kubernetes, ECS.
- Experience with managed services like database cluster, distributed services on EC2.
- Self-starters and curious folks who don't need to be micromanaged.
- Passionate about Blue Sky Climate Action and working with data at scale.
Benefits
- Work from anywhere: Work by the beach or from the mountains.
- Open source at heart: We are building a community where you can use, contribute, and collaborate.
- Own a slice of the pie: Possibility of becoming an owner by investing in ESOPs.
- Flexible timings: Fit your work around your lifestyle.
- Comprehensive health cover: Health cover for you and your dependents to keep you tension free.
- Work Machine of choice: Buy a device and own it after completing a year at BSA.
- Quarterly Retreats: Yes there's work-but then there's all the non-work+fun aspect aka the retreat!
- Yearly vacations: Take time off to rest and get ready for the next big assignment by availing the paid leaves.

If you are looking for a good opportunity in Cloud Development/DevOps, here is the right opportunity.
EXP: 4-10 Yrs
Location: Pune
Job Type: Permanent
Minimum qualifications:
- Education: Bachelor's/Master's degree
- Proficient in English language.
Relevant experience:
- Should have been working for at least four years as a DevOps/Cloud Engineer
- Should have worked on AWS Cloud Environment in depth
- Should have worked in an Infrastructure as Code environment or understand it very clearly.
- Has done Infrastructure coding using CloudFormation/Terraform, Configuration Management using Chef/Ansible, and worked with an Enterprise Bus (RabbitMQ/Kafka)
- Deep understanding of microservice design and awareness of centralized caching (Redis) and centralized configuration (Consul/Zookeeper)

