
General Description:Owns all technical aspects of software development for assigned applications
Participates in the design and development of systems & application programs
Functions as Senior member of an agile team and helps drive consistent development practices – tools, common components, and documentation
Required Skills:
In depth experience configuring and administering EKS clusters in AWS.
In depth experience in configuring **Splunk SaaS** in AWS environments especially in **EKS**
In depth understanding of OpenTelemetry and configuration of **OpenTelemetry Collectors**
In depth knowledge of observability concepts and strong troubleshooting experience.
Experience in implementing comprehensive monitoring and logging solutions in AWS using **CloudWatch**.
Experience in **Terraform** and Infrastructure as code.
Experience in **Helm**Strong scripting skills in Shell and/or python.
Experience with large-scale distributed systems and architecture knowledge (Linux/UNIX and Windows operating systems, networking, storage) in a cloud computing or traditional IT infrastructure environment.
Must have a good understanding of cloud concepts (Storage /compute/network).
Experience collaborating with several cross functional teams to architect observability pipelines for various GCP services like GKE, cloud run Big Query etc.
Experience with Git and GitHub.Experience with code build and deployment using GitHub actions, and Artifact Registry.
Proficient in developing and maintaining technical documentation, ADRs, and runbooks.

Similar jobs
About the Role:
We are looking for a skilled AWS DevOps Engineer to join our Cloud Operations team in Bangalore. This hybrid role is ideal for someone with hands-on experience in AWS and a strong background in application migration from on-premises to cloud environments. You'll play a key role in driving cloud adoption, optimizing infrastructure, and ensuring seamless cloud operations.
Key Responsibilities:
- Manage and maintain AWS cloud infrastructure and services.
- Lead and support application migration projects from on-prem to cloud.
- Automate infrastructure provisioning using Infrastructure as Code (IaC) tools.
- Monitor cloud environments and optimize cost, performance, and reliability.
- Collaborate with development, operations, and security teams to implement DevOps best practices.
- Troubleshoot and resolve infrastructure and deployment issues.
Required Skills:
- 3–5 years of experience in AWS cloud environment.
- Proven experience with on-premises to cloud application migration.
- Strong understanding of AWS core services (EC2, VPC, S3, IAM, RDS, etc.).
- Solid scripting skills (Python, Bash, or similar).
Good to Have:
- Experience with Terraform for Infrastructure as Code.
- Familiarity with Kubernetes for container orchestration.
- Exposure to CI/CD tools like Jenkins, GitLab, or AWS CodePipeline.
What you will do:
- Handling Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups and monitoring
- Logging, metrics and alerting management
- Creating Docker files
- Performing root cause analysis for production errors
What you need to have:
- 12+ years of experience in Software Development/ QA/ Software Deployment with 5+ years of experience in managing high performing teams
- Proficiency in VMware, AWS & cloud applications development, deployment
- Good knowledge in Java, Node.js
- Experience working with RESTful APIs, JSON etc
- Experience with Unit/ Functional automation is a plus
- Experience with MySQL, Mango DB, Redis, Rabbit MQ
- Proficiency in Jenkins. Ansible, Terraform/Chef/Ant
- Proficiency in Linux based Operating Systems
- Proficiency of Cloud Infrastructure like Dockers, Kubernetes
- Strong problem solving and analytical skills
- Good written and oral communication skills
- Sound understanding in areas of Computer Science such as algorithms, data structures, object oriented design, databases
- Proficiency in monitoring and observability
- Work towards improving the following 4 verticals - scalability, availability, security, and cost, for company's workflows and products.
- Help in provisioning, managing, optimizing cloud infrastructure in AWS (IAM, EC2, RDS, CloudFront, S3, ECS, Lambda, ELK etc.)
- Work with the development teams to design scalable, robust systems using cloud architecture for both 0-to-1 and 1-to-100 products.
- Drive technical initiatives and architectural service improvements.
- Be able to predict problems and implement solutions that detect and prevent outages.
- Mentor/manage a team of engineers.
- Design solutions with failure scenarios in mind to ensure reliability.
- Document rigorously to keep track of all changes/upgrades to the infrastructure and as well share knowledge with the rest of the team
- Identify vulnerabilities during development with actionable information to empower developers to remediate vulnerabilities
- Automate the build and testing processes to consistently integrate code
- Manage changes to documents, software, images, large web sites, and other collections of code, configuration, and metadata among disparate teams
Roles and Responsibilities
- Primary stakeholder collaborating with Dir Engineering on software/infrastructure architecture, monitoring/alerting framework and all other architectural level technical issues
- Design and manage implementation of Silvermine’s high performance, scalable, extensible and resilient microservices application stack based of existing, partially migrated monolithic application and for new product development. Includes:
- Utilizing either ECS Fargate (no EC2 clusters) or EKS as the orchestration framework – to be tested up to a minimum of 100k concurrent users
- Exploring, designing and implementing use of on demand compute (Lambda) where appropriate
- Scalable and redundant data architecture supporting microservices design principles
- A scalable reverse proxy layer to isolate microservices from managing network connections
- Utilizing CDN capabilities to offload origin load via an intelligent caching strategy
- Leveraging best in breed AWS service offerings to enable team to focus on application stack instead of application scaffolding while minimizing operational complexity and cost
- Monitoring and optimizing of stack for
- Security and monitoring
- Leverage AWS and 3rd party services to monitor the application stack and data; secure them from DDOS attacks and security breaches; and alert the team in the vent of an incident
- Using APM and logging tools:
- Monitor application stack and infrastructure component performance
- Proactively detect, triage and mitigate stack performance issues
- Alert upon exception events
- Provide triaging tools for debugging and Root Cause Analysis.
- Enhance the CI/CD pipeline to support automated testing, a resilient deployment model (e.g., blue-green, canary) and 100% rollback support (including the data layer)
- Development a comprehensive, supportable, repeatable IAC implementation using CloudFormation or Terraform
- Take a leadership role and exhibit expertise in the development of standards, architectural governance, design patterns, best practices and optimization of existing architecture.
- Partner with teams and leaders to provide strategic consultation for business process design/optimization, creating strategic technology road maps, performing rapid prototyping and implementing technical solutions to accelerate the fulfillment of the business strategic vision.
- Staying up to date on emerging technologies (AI, Automation, Cloud etc.) and trends with a clear focus on productivity, ease of use and fit-for-purpose, by researching, testing, and evaluating.
- Providing POCs and product implementation guidelines.
- Applying imagination and innovation by creating, inventing, and implementing new or better approaches, alternatives and breakthrough ideas that are valued by customers within the function.
- Assessing current state of solutions, defining future state needs, identifying gaps and recommending new technology solutions and strategic business execution improvements.
- Overseeing and facilitating the evaluation and selection technology, product standards and the design of standard configurations/implementation patterns.
- Partnering with other architects and solution owners to create standards and set strategies for the enterprise.
- Communicating directly with business colleagues on applying digital workplace technologies to solve identified business challenges.
Skills Required:
- Good mentorship skills to coach and guide the team on AWS DevOps.
- Jenkins, Python, Pipeline as Code, Cloud Formation Templates and Terraform.
- Experience with Dockers, Containers, Lambda and Fargate is a must
- Experience with CI/CD and Release management
- Strong proficiency in PowerShell scripting
- Demonstrable expertise in Java
- Familiarity with REST APIs
Qualifications:
- Minimum of 5 years of relevant experience in Devops.
- Bachelors or Masters in Computer Science or equivalent degree.
- AWS Certifications is added advantage
Why you should join us
- You will join the mission to create positive impact on millions of peoples lives
- You get to work on the latest technologies in a culture which encourages experimentation - You get to work with super humans (Psst: Look up these super human1, super human2, super human3, super human4)
- You get to work in an accelerated learning environment
What you will do
- You will provide deep technical expertise to your team in building future ready systems.
- You will help develop a robust roadmap for ensuring operational excellence
- You will setup infrastructure on AWS that will be represented as code
- You will work on several automation projects that provide great developer experience
- You will setup secure, fault tolerant, reliable and performant systems
- You will establish clean and optimised coding standards for your team that are well documented
- You will set up systems in a way that are easy to maintain and provide a great developer experience
- You will actively mentor and participate in knowledge sharing forums
- You will work in an exciting startup environment where you can be ambitious and try new things :)
You should apply if
- You have a strong foundation in Computer Science concepts and programming fundamentals
- You have been working on cloud infrastructure setup, especially on AWS since 8+ years
- You have set up and maintained reliable systems that operate at high scale
- You have experience in hardening and securing cloud infrastructures
- You have a solid understanding of computer networking, network security and CDNs
- Extensive experience in AWS, Kubernetes and optionally Terraform
- Experience in building automation tools for code build and deployment (preferably in JS)
- You understand the hustle of a startup and are good with handling ambiguity
- You are curious, a quick learner and someone who loves to experiment
- You insist on highest standards of quality, maintainability and performance
- You work well in a team to enhance your impact
About the company:
Tathastu, the next-generation innovation labs is Future Group’s initiative to provide a new-age retail experience - combining the physical with digital and enhancing it with data. We are creating next-generation consumer interactions by combining AI/ML, Data Science, and emerging technologies with consumer platforms.
The E-Commerce vertical under Tathastu has developed online consumer platforms for Future Group’s portfolio of retail brands -Easy day, Big Bazaar, Central, Brand factory, aLL, Clarks, Coverstory. Backed by our network of offline stores we have built a new retail platform that merges our Online & Offline retail streams. We use data to power all our decisions across our products and build internal tools to help us scale our impact with a small closely-knit team.
Our widespread store network, robust logistics, and technology capabilities have made it possible to launch a ‘2-Hour Delivery Promise’ on every product across fashion, food, FMCG, and home products for orders placed online through the Big Bazaar mobile app and portal. This makes Big Bazaar the first retailer in the country to offer instant home delivery on almost every consumer product ordered online.
Job Responsibilities:
- You’ll streamline and automate the software development and infrastructure management processes and play a crucial role in executing high-impact initiatives and continuously improving processes to increase the effectiveness of our platforms.
- You’ll translate complex use cases into discrete technical solutions in platform architecture, design and coding, functionality, usability, and optimization.
- You will drive automation in repetitive tasks, configuration management, and deliver comprehensive automated tests to debug/troubleshoot Cloud AWS-based systems and BigData applications.
- You’ll continuously discover, evaluate, and implement new technologies to maximize the development and operational efficiency of the platforms.
- You’ll determine the metrics that will define technical and operational success and constantly track such metrics to fine-tune the technology stack of the organization.
Experience: 4 to 8 Yrs
Qualification: B.Tech / MCA
Required Skills:
- Experience with Linux/UNIX systems administration and Amazon Web Services (AWS).
- Infrastructure as Code (Terraform), Kubernetes and container orchestration, Web servers (Nginx, Apache), Application Servers(Tomcat,Node.js,..), document stores and relational databases (AWS RDS-MySQL).
- Site Reliability Engineering patterns and visibility /performance/availability monitoring (Cloudwatch, Prometheus)
- Background in and happy to work hands-on with technical troubleshooting and performance tuning.
- Supportive and collaborative personality - ability to influence and drive progress with your peers
Our Technology Stack:
- Docker/Kubernetes
- Cloud (AWS)
- Python/GoLang Programming
- Microservices
- Automation Tools
- GCP Cloud experience mandatory
- CICD - Azure DevOps
- IaC tools – Terraform
- Experience with IAM / Access Management within cloud
- Networking / Firewalls
- Kubernetes / Helm / Istio
KaiOS is a mobile operating system for smart feature phones that stormed the scene to become the 3rd largest mobile OS globally (2nd in India ahead of iOS). We are on 100M+ devices in 100+ countries. We recently closed a Series B round with Cathay Innovation, Google and TCL.
What we are looking for:
- BE/B-Tech in Computer Science or related discipline
- 3+ years of overall commercial DevOps / infrastructure experience, cross-functional delivery-focused agile environment; previous experience as a software engineer and ability to see into and reason about the internals of an application a major plus
- Extensive experience designing, delivering and operating highly scalable resilient mission-critical backend services serving millions of users; experience operating and evolving multi-regional services a major plus
- Strong engineering skills; proficiency in programming languages
- Experience in Infrastructure-as-code tools, for example Terraform, CloudFormation
- Proven ability to execute on a technical and product roadmap
- Extensive experience with iterative development and delivery
- Extensive experience with observability for operating robust distributed systems – instrumentation, log shipping, alerting, etc.
- Extensive experience with test and deployment automation, CI / CD
- Extensive experience with infrastructure-as-code and tooling, such as Terraform
- Extensive experience with AWS, including services such as ECS, Lambda, DynamoDB, Kinesis, EMR, Redshift, Elasticsearch, RDS, CloudWatch, CloudFront, IAM, etc.
- Strong knowledge of backend and web technology; infrastructure experience with at-scale modern data technology a major plus
- Deep expertise in server-side security concerns and best practices
- Fluency with at least one programming language. We mainly use Golang, Python and JavaScript so far
- Outstanding analytical thinking, problem solving skills and attention to detail
- Excellent verbal and written communication skills
Requirements
Designation: DevOps Engineer
Location: Bangalore OR Hongkong
Experience: 4 to 7 Years
Notice period: 30 days or Less
Requirements
- Design, write and build tools to improve the reliability, latency, availability and scalability of HealthifyMe application.
- Communicate, collaborate and work effectively across distributed teams in a global environment
- Optimize performance and solve issues across the entire stack: hardware, software, application, and network.
- Experienced in building infrastructure with terraform / cloudformation or equivalent.
- Experience with ansible or equivalent is beneficial
- Ability to use a wide variety of Open Source Tools
- Experience with AWS is a must.
- Minimum 5 years of running services in a large scale environment.
- Expert level understanding of Linux servers, specifically RHEL/CentOS.
- Practical, proven knowledge of shell scripting and at least one higher-level language (eg. Python, Ruby, GoLang).
- Experience with source code and binary repositories, build tools, and CI/CD (Git, Artifactory, Jenkins, etc)
- Demonstrable knowledge of TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures.
Look forward to
- Working with a world-class team.
- Fun & work at the same place with an amazing work culture and flexible timings.
- Get ready to transform yourself into a health junkie
Join HealthifyMe and make history!
- Mandatory: Docker, AWS, Linux, Kubernete or ECS
- Prior experience provisioning and spinning up AWS Clusters / Kubernetes
- Production experience to build scalable systems (load balancers, memcached, master/slave architectures)
- Experience supporting a managed cloud services infrastructure
- Ability to maintain, monitor and optimise production database servers
- Prior work with Cloud Monitoring tools (Nagios, Cacti, CloudWatch etc.)
- Experience with Docker, Kubernetes, Mesos, NoSQL databases (DynamoDB, Cassandra, MongoDB, etc)
- Other Open Source tools used in the infrastructure space (Packer, Terraform, Vagrant, etc.)
- In-depth knowledge on Linux Environment.
- Prior experience leading technical teams through the design and implementation of systems infrastructure projects.
- Working knowledge of Configuration Management (Chef, Puppet or Ansible preferred) Continuous Integration Tools (Jenkins preferred)
- Experience in handling large production deployments and infrastructure.
- DevOps based infrastructure and application deployments experience.
- Working knowledge of the AWS network architecture including designing VPN solutions between regions and subnets
- Hands-on knowledge with the AWS AMI architecture including the development of machine templates and blueprints
- He/she should be able to validate that the environment meets all security and compliance controls.
- Good working knowledge of AWS services such as Messaging, Application Services, Migration Services, Cost Management Platform.
- Proven written and verbal communication skills.
- Understands and can serve as the technical team lead to oversee the build of the Cloud environment based on customer requirements.
- Previous NOC experience.
- Client Facing Experience with excellent Customer Communication and Documentation Skills

