


Roles & Responsibilities :
- Champion engineering and operational excellence.
- Establish a solid infrastructure framework and excellent development and deployment processes.
- Provide technical guidance to both your team members and your peers from the development team.
- Work closely with the development teams to gather system requirements, new service proposals, and large system improvements, and come up with the infrastructure architecture leading to stable, well-monitored, performant, and secure systems.
- Be part of and help create a positive work environment based on accountability.
- Communicate across functions and drive engineering initiatives.
- Initiate cross-team collaboration with product development teams to develop high-quality, polished products and services.
Required Skills :
- 5+ years of professional experience developing and launching software products on Cloud.
- Basic understanding of Java/Go programming
- Good understanding of container technologies/orchestration platforms (e.g., Docker, Kubernetes)
- Deep understanding of AWS or any other cloud platform.
- Good understanding of data stores like Postgres, Redis, Kafka, and Elasticsearch.
- Good Understanding of Operating systems
- Strong technical background with track record of individual technical accomplishments
- Ability to handle multiple competing priorities in a fast-paced environment
- Ability to establish credibility with smart engineers quickly.
- Most importantly, the ability and urge to learn new things.
- B.Tech/M.Tech in Computer Science or a related technical field.

About the Company:
Gruve is an innovative Software Services startup dedicated to empowering Enterprise Customers in managing their Data Life Cycle. We specialize in Cyber Security, Customer Experience, Infrastructure, and advanced technologies such as Machine Learning and Artificial Intelligence. Our mission is to assist our customers in their business strategies, utilizing their data to make more intelligent decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.
Why Gruve:
At Gruve, we foster a culture of innovation, collaboration, and continuous learning. We are committed to building a diverse and inclusive workplace where everyone can thrive and contribute their best work. If you’re passionate about technology and eager to make an impact, we’d love to hear from you.
Gruve is an equal opportunity employer. We welcome applicants from all backgrounds and thank all who apply; however, only those selected for an interview will be contacted.
Position summary:
We are seeking a Staff Engineer – DevOps with 8-12 years of experience in designing, implementing, and optimizing CI/CD pipelines, cloud infrastructure, and automation frameworks. The ideal candidate will have expertise in Kubernetes, Terraform, CI/CD, Security, Observability, and Cloud Platforms (AWS, Azure, GCP). You will play a key role in scaling and securing our infrastructure, improving developer productivity, and ensuring high availability and performance.
Key Roles & Responsibilities:
- Design, implement, and maintain CI/CD pipelines using tools like Jenkins, GitLab CI/CD, ArgoCD, and Tekton.
- Deploy and manage Kubernetes clusters (EKS, AKS, GKE) and containerized workloads.
- Automate infrastructure provisioning using Terraform, Ansible, Pulumi, or CloudFormation.
- Implement observability and monitoring solutions using Prometheus, Grafana, ELK, OpenTelemetry, or Datadog.
- Ensure security best practices in DevOps, including IAM, secrets management, container security, and vulnerability scanning.
- Optimize cloud infrastructure (AWS, Azure, GCP) for performance, cost efficiency, and scalability.
- Develop and manage GitOps workflows and infrastructure-as-code (IaC) automation.
- Implement zero-downtime deployment strategies, including blue-green deployments, canary releases, and feature flags.
- Work closely with development teams to optimize build pipelines, reduce deployment time, and improve system reliability.
Basic Qualifications:
- A bachelor’s or master’s degree in computer science, electronics engineering or a related field
- 8-12 years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Automation.
- Strong expertise in CI/CD pipelines, version control (Git), and release automation.
- Hands-on experience with Kubernetes (EKS, AKS, GKE) and container orchestration.
- Proficiency in Terraform and Ansible for infrastructure automation.
- Experience with AWS, Azure, or GCP services (EC2, S3, IAM, VPC, Lambda, API Gateway, etc.).
- Expertise in monitoring/logging tools such as Prometheus, Grafana, ELK, OpenTelemetry, or Datadog.
- Strong scripting and automation skills in Python, Bash, or Go.
Preferred Qualifications
- Experience in FinOps (Cloud Cost Optimization) and Kubernetes cluster scaling.
- Exposure to serverless architectures and event-driven workflows.
- Contributions to open-source DevOps projects.
Responsibilities
Provisioning and de-provisioning AWS accounts for internal customers
Work alongside systems and development teams to support the transition and operation of client websites/applications in and out of AWS.
Deploying, managing, and operating AWS environments
Identifying appropriate use of AWS operational best practices
Estimating AWS costs and identifying operational cost control mechanisms
Keep technical documentation up to date
Proactively keep up to date on AWS services and developments
Create (where appropriate) automation, in order to streamline provisioning and de-provisioning processes
Lead certain data/service migration projects
Job Requirements
Experience provisioning, operating, and maintaining systems running on AWS
Experience with Azure/AWS.
Capabilities to provide AWS operations and deployment guidance and best practices throughout the lifecycle of a project
Experience with application/data migration to/from AWS
Experience with NGINX and the HTTP protocol.
Experience with configuration and management software such as Git
Strong analytical and problem-solving skills
Deployment experience using common AWS technologies like VPC, regionally distributed EC2 instances, Docker, and more.
Ability to work in a collaborative environment
Detail-oriented, strong work ethic and high standard of excellence
A fast learner and achiever who sets high personal goals
Must be able to work on multiple projects and consistently meet project deadlines
Description
Do you dream about code every night? If so, we’d love to talk to you about a new product that we’re making to enable delightful testing experiences at scale for development teams who build modern software solutions.
What You'll Do
Troubleshooting and analyzing technical issues raised by internal and external users.
Working with Monitoring tools like Prometheus / Nagios / Zabbix.
Developing automation in one or more technologies such as Terraform, Ansible, CloudFormation, Puppet, or Chef (preferred).
Monitor infrastructure alerts and take proactive action to avoid downtime and customer impacts.
Working closely with the cross-functional teams to resolve issues.
Design, build, test, deploy, and maintain continuous integration and continuous delivery processes using tools like Jenkins, Maven, Git, etc.
Work in close coordination with the development and operations teams so that application performance meets the customer's expectations.
What you should have
Bachelor’s or Master’s degree in computer science or any related field.
3-6 years of experience in Linux/Unix and cloud computing techniques.
Familiar with working on cloud and datacenter for enterprise customers.
Hands-on experience with Linux, Windows, and macOS operating systems and Batch/Apple/Bash scripting.
Experience with various databases such as MongoDB, PostgreSQL, MySQL, MSSQL.
Familiar with AWS technologies like EC2, S3, Lambda, IAM, etc.
Must know how to choose the tools and technologies that best fit the business needs.
Experience in developing and maintaining CI/CD processes using tools like Git, GitHub, Jenkins etc.
Excellent organizational skills to adapt to a constantly changing technical environment

- Provides free and subscription-based website and email services hosted and operated at data centres in Mumbai and Hyderabad.
- Serves a global audience and customers through sophisticated content delivery networks.
- Operates a service infrastructure using the latest technologies for web services and a very large storage infrastructure.
- Provides virtualized infrastructure that allows seamless migration and the addition of services for scalability.
- Among the pioneers and earliest adopters of public cloud and NoSQL big data stores, for more than a decade.
- Provides innovative internet services, working with multiple technologies such as PHP, Java, Node.js, Python, and C++ to scale services as needed.
- Has Internet infrastructure peering arrangements with all the major and minor ISPs and telecom service providers.
- Has mail traffic exchange agreements with major Internet services.
Job Details :
- This position offers a competitive professional opportunity to both experienced and aspiring engineers. The company's technology and operations groups are managed by senior professionals with deep subject matter expertise.
- The company believes in having an open work environment that offers mentoring and learning opportunities, with an informal and flexible work culture that allows professionals to actively participate in and contribute to the success of our services and business.
- You will be part of a team that keeps the business running for cloud products and services that are used 24/7 by the company's consumers and enterprise customers around the world. You will be asked to help operate, maintain, and provide escalation support for the company's cloud infrastructure, which powers all of its cloud offerings.
Job Role :
- As a senior engineer, your role grows as you gain experience in our operations. We facilitate a hands-on learning experience after an induction program, to get you into the role as quickly as possible.
- The systems engineer role also requires candidates to research and recommend innovative and automated approaches for system administration tasks.
- The work culture allows a seamless integration with different product engineering teams. The teams work together and share responsibility to triage in complex operational situations. The candidate is expected to stay updated on best practices and help evolve processes both for resilience of services and compliance.
- You will be required to provide support for both production and non-production environments to ensure system updates and expected service levels. You will specifically be required to handle 24/7 L2 and L3 oversight for incident responses and have an excellent understanding of the end-to-end support process from client to different support escalation levels.
- The role also requires the discipline to create, update, and maintain process documents based on operational incidents and on the technologies and tools used in the processes to resolve issues.
QUALIFICATION AND EXPERIENCE :
- A graduate degree or senior diploma in engineering or technology with some or all of the following:
- Knowledge and work experience with KVM, AWS (Glacier, S3, EC2), RabbitMQ, Fluentd, Syslog, Nginx is preferred
- Installation and tuning of Web Servers, PHP, Java servlets, memory-based databases for scalability and performance
- Knowledge of email-related protocols such as SMTP, POP3, and IMAP, along with experience in maintenance and administration of MTAs such as Postfix, qmail, etc., will be an added advantage
- Must have knowledge of monitoring tools, trend analysis, networking technologies, security tools, and troubleshooting.
- Knowledge of analyzing and mitigating security related issues and threats is certainly desirable.
- Knowledge of agile development/SDLC processes and hands-on participation in planning sprints and managing daily scrum is desirable.
- Preferably, programming experience in Shell, Python, Perl or C.
Our client is a B2B2C Web3 tech startup founded by IITB graduates who are experienced in retail, e-commerce, and fintech.
Vision: The client aims to change the way that customers, creators, and retail investors interact and transact at brands of all shapes and sizes, essentially becoming the Web3 version of a brand-driven social e-commerce and investment platform. We have two broader development phases to achieve our mission.
The candidate will be responsible for automating the deployment of cloud infrastructure and services to support application development and hosting (architecting, engineering, deploying, and operationally managing the underlying logical and physical cloud computing infrastructure).
Location: Bangalore
Reporting Manager: VP, Engineering
Job Description:
● Collaborate with teams to build and deliver solutions implementing serverless,
microservice-based, IaaS, PaaS, and containerized architectures in GCP/AWS environments.
● Responsible for deploying highly complex, distributed transaction processing systems.
● Work on continuous improvement of the products through innovation and learning; suited to someone with a knack for benchmarking and optimization
● Hiring, developing, and cultivating a high-performing and reliable cloud support team
● Building and operating complex CI/CD pipelines at scale
● Work with GCP Services, Private Service Connect, Cloud Run, Cloud Functions, Pub/Sub, Cloud
Storage, Networking in general
● Collaborate with Product Management and Product Engineering teams to drive excellence in
Google Cloud products and features.
● Ensures efficient data storage and processing functions in accordance with company security
policies and best practices in cloud security.
● Ensuring scaled database setup/monitoring with near-zero downtime
Key Skills:
● Hands-on software development experience in Python, NodeJS, or Java
● 5+ years of Linux/Unix administration, covering monitoring, reliability, and security of Linux-based, online, high-traffic services and Web/eCommerce properties
● 5+ years of production experience in large-scale cloud-based Infrastructure (GCP preferred)
● Strong experience with Log Analysis and Monitoring tools such as CloudWatch, Splunk,
Dynatrace, Nagios, etc.
● Hands-on experience with AWS Cloud – EC2, S3 Buckets, RDS
● Hands-on experience with Infrastructure as Code (e.g., CloudFormation, ARM, Terraform, Ansible, Chef, Puppet) and version control tools
● Hands-on experience with configuration management (Chef/Ansible)
● Experience in designing High Availability infrastructure and planning for Disaster Recovery
solutions
● Knowledgeable in networking and security
● Familiar with GCP services (in Databases, Containers, Compute, stores, etc.) with comfort in GCP serverless technologies
● Exposure to Linkerd, Helm charts, and Ambassador is mandatory
● Experience with Big Data on GCP (BigQuery, Pub/Sub, Dataproc, and Dataflow) is a plus
● Experience with Groovy and writing Jenkinsfiles
● Experience with time-series and document databases (e.g. Elasticsearch, InfluxDB, Prometheus)
● Experience with message buses (e.g. Apache Kafka, NATS)
Regards
Team Merito
Our client is a call management solutions company that helps small to mid-sized businesses use its virtual call center to manage customer calls and queries. It is an AI and cloud-based call operating facility that is affordable as well as feature-optimized. Advanced features such as call recording, IVR, toll-free numbers, call tracking, etc. are based on automation and enhance call-handling quality and process for each client as per their requirements. They service over 6,000 business clients including large accounts like Flipkart and Uber.
- Being involved in Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups, and Monitoring.
- Creating and managing CI/CD pipelines for microservice architectures.
- Creating and managing application configuration.
- Researching and planning architectures and tools for smooth deployments.
- Logging, metrics and alerting management.
What you need to have:
- Proficient in the Linux command line and troubleshooting.
- Proficient in designing CI/CD pipelines using Jenkins. Experience in deployment using Ansible.
- Experience in microservices architecture deployment; hands-on experience with Docker, Kubernetes, and EKS.
- Knowledge of infrastructure management tools (Infrastructure as Code) such as Terraform, AWS CloudFormation, etc.
- Proficient in AWS services: deploying, monitoring, and troubleshooting applications in AWS.
- Configuration management tools like Ansible/Chef/Puppet.
- Proficient in deployment of applications behind load balancers and proxy servers such as Nginx and Apache.
- Proficient in Bash scripting; Python scripting is an advantage.
- Experience with logging, monitoring, and alerting tools like ELK (Elasticsearch, Logstash, Kibana), Nagios, Graylog, Splunk, Prometheus, and Grafana is a plus.
- Proficient in Configuration Management.
Job Description:
- Hands-on experience with Ansible & Terraform.
- A scripting language such as Python, Bash, or PowerShell is required, along with a willingness to learn and master others.
- Troubleshooting and resolving automation, build, and CI/CD-related issues (in a cloud environment like AWS or Azure).
- Experience with Kubernetes is mandatory.
- Develop and maintain tooling and environments for testing and production.
- Assist team members in the development and maintenance of tooling for integration testing, performance testing, and security testing, as well as source control systems (this includes working in CI systems like Azure DevOps and TeamCity, and orchestration tools like Octopus).
- Comfortable working in a Linux environment.
Roles and Responsibilities
- Primary stakeholder collaborating with Dir Engineering on software/infrastructure architecture, monitoring/alerting framework and all other architectural level technical issues
- Design and manage implementation of Silvermine’s high-performance, scalable, extensible, and resilient microservices application stack, based on the existing, partially migrated monolithic application and for new product development. Includes:
- Utilizing either ECS Fargate (no EC2 clusters) or EKS as the orchestration framework – to be tested up to a minimum of 100k concurrent users
- Exploring, designing and implementing use of on demand compute (Lambda) where appropriate
- Scalable and redundant data architecture supporting microservices design principles
- A scalable reverse proxy layer to isolate microservices from managing network connections
- Utilizing CDN capabilities to offload origin load via an intelligent caching strategy
- Leveraging best-of-breed AWS service offerings to enable the team to focus on the application stack instead of application scaffolding while minimizing operational complexity and cost
- Monitoring and optimizing of stack for
- Security and monitoring
- Leverage AWS and third-party services to monitor the application stack and data; secure them from DDoS attacks and security breaches; and alert the team in the event of an incident
- Using APM and logging tools:
- Monitor application stack and infrastructure component performance
- Proactively detect, triage and mitigate stack performance issues
- Alert upon exception events
- Provide triaging tools for debugging and Root Cause Analysis.
- Enhance the CI/CD pipeline to support automated testing, a resilient deployment model (e.g., blue-green, canary) and 100% rollback support (including the data layer)
- Develop a comprehensive, supportable, repeatable IaC implementation using CloudFormation or Terraform
- Take a leadership role and exhibit expertise in the development of standards, architectural governance, design patterns, best practices and optimization of existing architecture.
- Partner with teams and leaders to provide strategic consultation for business process design/optimization, creating strategic technology road maps, performing rapid prototyping and implementing technical solutions to accelerate the fulfillment of the business strategic vision.
- Staying up to date on emerging technologies (AI, Automation, Cloud etc.) and trends with a clear focus on productivity, ease of use and fit-for-purpose, by researching, testing, and evaluating.
- Providing POCs and product implementation guidelines.
- Applying imagination and innovation by creating, inventing, and implementing new or better approaches, alternatives and breakthrough ideas that are valued by customers within the function.
- Assessing current state of solutions, defining future state needs, identifying gaps and recommending new technology solutions and strategic business execution improvements.
- Overseeing and facilitating the evaluation and selection of technology and product standards, and the design of standard configurations/implementation patterns.
- Partnering with other architects and solution owners to create standards and set strategies for the enterprise.
- Communicating directly with business colleagues on applying digital workplace technologies to solve identified business challenges.
Skills Required:
- Good mentorship skills to coach and guide the team on AWS DevOps.
- Jenkins, Python, Pipeline as Code, CloudFormation templates, and Terraform.
- Experience with Docker, containers, Lambda, and Fargate is a must
- Experience with CI/CD and Release management
- Strong proficiency in PowerShell scripting
- Demonstrable expertise in Java
- Familiarity with REST APIs
Qualifications:
- Minimum of 5 years of relevant experience in DevOps.
- Bachelor's or Master's in Computer Science or equivalent degree.
- AWS certifications are an added advantage
We are hiring candidates who are looking to work in a cloud environment and ready to learn and adapt to the evolving technologies.
Linux Administrator Roles & Responsibilities:
- 5+ years of professional experience with strong working expertise in Agile environments
- Deep knowledge in managing Linux servers.
- Managing Windows servers (not mandatory).
- Manage Web servers (Apache, Nginx).
- Manage Application servers.
- Strong background & experience in any one scripting language (Bash, Python)
- Manage firewall rules.
- Perform root cause analysis for production errors.
- Basic administration of MySQL, MSSQL.
- Ready to learn and adapt to business requirements.
- Manage information security controls with best practices and processes.
- Support business requirements beyond working hours.
- Ensuring the highest uptime of services.
- Monitoring resource usage.
Skills/Requirements
- Bachelor’s Degree or Diploma in Computer Science, Engineering, Software Engineering or a relevant field.
- Experience with Linux-based infrastructures, Linux/Unix administration.
- Knowledge in managing databases such as MySQL, MSSQL.
- Knowledge of scripting languages such as Python, Bash.
- Knowledge in open-source technologies and cloud services like AWS, Azure is a plus. Candidates willing to learn will be preferred.
- Experience in managing web applications.
- Problem-solving attitude.
- 5+ years experience in the IT industry.
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for application health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CI/CD
- Observability (Prometheus/New Relic/ELK)
- Cloud Platforms (AWS/Azure)
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services and infrastructure, and working with stakeholders to fix them.
● Managing outages, performing detailed RCA with developers, and identifying ways to prevent recurrence.
● Managing/Automating upgrades of the infrastructure services.
● Automating toil.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and a good understanding of the various Linux kernel subsystems (memory, storage, network, etc.).
● Extensive experience in DNS, TCP/IP, UDP, gRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as Code tools such as Terraform, and configuration management tools such as Chef, Puppet, SaltStack, and Ansible.
● Expertise in Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo, etc.
● Experience in managing and deploying containerized environments using Docker, Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike, Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platform tech stacks like Hadoop, Hive, Presto, etc. is a plus

