
Role: Principal DevOps Engineer
About the Client
The client is a product-based company building a platform that uses AI and ML technology for transportation and logistics. It also has a presence in the global market.
Responsibilities and Requirements
• Experience in designing and maintaining high volume and scalable micro-services architecture on cloud infrastructure
• Knowledge in Linux/Unix Administration and Python/Shell Scripting
• Experience working with cloud platforms like AWS (EC2, ELB, S3, Auto-scaling, VPC, Lambda), GCP, Azure
• Knowledge of deployment automation, Continuous Integration and Continuous Deployment (Jenkins, Maven, Puppet, Chef, GitLab) and monitoring tools like Zabbix, CloudWatch, Nagios
• Knowledge of Java Virtual Machines, Apache Tomcat, Nginx, Apache Kafka, Microservices architecture, Caching mechanisms
• Experience in enterprise application development, maintenance and operations
• Knowledge of best practices and IT operations in an always-up, always-available service
• Excellent written and oral communication skills, judgment and decision-making skills

Similar jobs
About Simbian
Simbian is at the forefront of cybersecurity innovation, leveraging purpose-built AI Agents to deliver 10x security outcomes for global enterprises and MSSPs. Our platform autonomously investigates and responds to alerts, freeing security teams from repetitive tasks. Simbian combines privacy-first technology, proven integration with 70+ enterprise tools, and rapid deployment for measurable value.
Role Overview
We are seeking a collaborative, innovative DevOps Engineer passionate about enabling secure, scalable operations for cutting-edge cybersecurity products. Join our team during a period of high growth and help architect the future of agentic AI security platforms.
Key Responsibilities
• Kubernetes Management:
o Manage and maintain production-grade Kubernetes clusters across multiple cloud providers (AWS is essential, Azure is valuable, GCP is a plus).
o Deploy, upgrade, troubleshoot, and scale stateful and stateless workloads (NGINX, Postgres, MongoDB, OpenCTI, OpenSearch, Kafka, Hadoop, Fluentd) in Kubernetes.
• Cloud Operations:
o Operate and optimize cloud environments, with strong expertise in AWS (AWS Certified Solutions Architect Professional or equivalent Azure cert preferred).
o Design, deploy, and manage infrastructure on AWS and Azure (GCP optional).
• SQL Database Management:
o Administer SQL databases, ideally Postgres, on Kubernetes clusters or cloud VMs.
o Perform routine maintenance, backups, upgrades, monitoring, and optimization.
• Infrastructure as Code:
o Build, install, upgrade, and maintain Helm charts with expertise.
o Use and understand Ansible for cloud automation (AWS/Azure), and Terraform for infrastructure provisioning.
• Monitoring, Logging, Observability:
o Implement and manage logging and metrics stacks using OpenSearch/Elasticsearch, Prometheus, Grafana, Thanos or similar open source tools.
• Programming & Scripting:
o Develop automation scripts in Bash (proficient with control structures).
o Produce scripts or microservices in Node.js (preferred) or Python/Django (bonus).
• CI/CD:
o Build and maintain CI/CD pipelines preferably using GitHub Actions (Jenkins or equivalent is acceptable).
• Containerization:
o Create, manage, and troubleshoot Docker/Podman containers, images, volumes, and use Docker Compose for local development.
• Customer-Facing On-Prem Deployments (Bonus):
o Install, configure, and support Kubernetes on customer premises.
o Demonstrate ownership, initiative, and strong customer communication skills.
o Solid knowledge of Linux administration, networking, and cloud environments.
What You’ll Bring:
• 4+ years’ experience in DevOps, SRE, or Production Engineering.
• Mastery of Kubernetes, AWS, infrastructure automation, and database management.
• Strong collaborative, curious, and growth-driven mindset.
• Ability to challenge ideas, drive innovation, and embrace rapid change.
• Excellent communication for technical customer interactions.
Why Join Simbian?
• Work with pioneering agentic AI security—impact global security teams.
• Shape infrastructure for privacy-first technology in a high-growth startup.
• Enjoy a dynamic remote-first work culture with opportunities for ownership and advancement.
Key Responsibilities
- Design, implement, and maintain CI/CD pipelines for backend, frontend, and mobile applications.
- Manage cloud infrastructure using AWS (EC2, Lambda, S3, VPC, RDS, CloudWatch, ECS/EKS).
- Configure and maintain Docker containers and/or Kubernetes clusters.
- Implement and maintain Infrastructure as Code (IaC) using Terraform / CloudFormation.
- Automate build, deployment, and monitoring processes.
- Manage code repositories using Git/GitHub/GitLab, enforce branching strategies.
- Implement monitoring and alerting using tools like Prometheus, Grafana, CloudWatch, ELK, Splunk.
- Ensure system scalability, reliability, and security.
- Troubleshoot production issues and perform root-cause analysis.
- Collaborate with engineering teams to improve deployment and development workflows.
- Optimize infrastructure costs and improve performance.
Required Skills & Qualifications
- 3+ years of experience in DevOps, SRE, or Cloud Engineering.
- Strong hands-on knowledge of AWS cloud services.
- Experience with Docker, containers, and orchestrators (ECS, EKS, Kubernetes).
- Strong understanding of CI/CD tools: GitHub Actions, Jenkins, GitLab CI, or AWS CodePipeline.
- Experience with Linux administration and shell scripting.
- Strong understanding of Networking, VPC, DNS, Load Balancers, Security Groups.
- Experience with monitoring/logging tools: CloudWatch, ELK, Prometheus, Grafana.
- Experience with Terraform or CloudFormation (IaC).
- Good understanding of Node.js or similar application deployments.
- Knowledge of NGINX/Apache and load balancing concepts.
- Strong problem-solving and communication skills.
Preferred/Good to Have
- Experience with Kubernetes (EKS).
- Experience with Serverless architectures (Lambda).
- Experience with Redis, MongoDB, RDS.
- Certification in AWS Solutions Architect / DevOps Engineer.
- Experience with security best practices, IAM policies, and DevSecOps.
- Understanding of cost optimization and cloud cost management.
Job Title: AWS DevOps Engineer
Experience Level: 5+ Years
Location: Bangalore, Pune, Hyderabad, Chennai and Gurgaon
Summary:
We are looking for a hands-on Platform Engineer with strong execution skills to provision and manage cloud infrastructure. The ideal candidate will have experience with Linux, AWS services, Kubernetes, and Terraform, and should be capable of troubleshooting complex issues in cloud and container environments.
Key Responsibilities:
- Provision AWS infrastructure using Terraform (IaC).
- Manage and troubleshoot Kubernetes (EKS) and ECS clusters.
- Work with core AWS services: VPC, EC2, S3, RDS, Lambda, ALB, WAF, and CloudFront.
- Support CI/CD pipelines using Jenkins and GitHub.
- Collaborate with teams to resolve infrastructure and deployment issues.
- Maintain documentation of infrastructure and operational procedures.
Required Skills:
- 3+ years of hands-on experience in AWS infrastructure provisioning using Terraform.
- Strong Linux administration and troubleshooting skills.
- Experience managing Kubernetes clusters.
- Basic experience with CI/CD tools like Jenkins and GitHub.
- Good communication skills and a positive, team-oriented attitude.
Preferred:
- AWS Certification (e.g., Solutions Architect, DevOps Engineer).
- Exposure to Agile and DevOps practices.
- Experience with monitoring and logging tools.
We are looking for an experienced DevOps professional. Be a part of a vibrant, rapidly growing tech enterprise with a great working environment. As a DevOps Engineer, you will be responsible for managing and building upon the infrastructure that supports our data intelligence platform. You'll also be involved in building tools and establishing processes to empower developers to deploy and release their code seamlessly.
Responsibilities
The ideal DevOps Engineer possesses a solid understanding of system internals and distributed systems.
Understanding of accessibility and security compliance (depending on the specific project)
User authentication and authorization between multiple systems, servers, and environments
Integration of multiple data sources and databases into one system
Understanding of the fundamental design principles behind a scalable application
Experience with configuration management tools (Ansible/Chef/Puppet), cloud service providers (AWS/DigitalOcean), and the Docker and Kubernetes ecosystem is a plus.
Should be able to make key decisions for our infrastructure, networking and security.
Write and modify shell scripts for migrations and database connections.
Monitor production server health across parameters such as CPU load, physical memory, and swap memory, and set up a monitoring tool (e.g., Nagios) to track production server health.
Create alerts and configure monitoring of specified metrics to manage cloud infrastructure efficiently.
Set up and manage VPCs and subnets; connect different zones; block suspicious IPs/subnets via ACLs.
Create and manage AMIs, snapshots, and volumes; upgrade or downgrade AWS resources (CPU, memory, EBS).
The candidate will be responsible for managing microservices at scale and maintaining the compute and storage infrastructure for various product teams.
Strong knowledge of configuration management tools such as Ansible, Chef, and Puppet.
Extensive experience with change-tracking tools like JIRA and with log analysis, including maintaining documentation of production server error log reports.
Experienced in troubleshooting, backup, and recovery.
Excellent knowledge of cloud service providers such as AWS and DigitalOcean.
Good knowledge of the Docker and Kubernetes ecosystem.
Proficient understanding of code versioning tools, such as Git
Must have experience working in an automated environment.
Good knowledge of Amazon Web Services offerings such as Amazon EC2, Amazon S3 (Amazon Glacier), Amazon VPC, and Amazon CloudWatch.
Scheduling jobs using crontab and creating swap memory.
Proficient knowledge of access management (IAM).
Must have expertise in Maven, Jenkins, Chef, SVN, GitHub, Tomcat, Linux, etc.
Candidate should have good knowledge of GCP.
Educational Qualifications
B.Tech (IT), M.Tech, MBA (IT), BCA, MCA, or any degree in the relevant field
Experience: 2-6 years
Job Description:
Responsibilities
· Hold end-to-end (E2E) responsibility for the Azure landscape of our customers
· Manage code releases and operational tasks within a global team with a focus on automation, maintainability, security and customer satisfaction
· Use the CI/CD framework to rapidly support lifecycle management of the platform
· Act as L2-L3 support for incidents, problems and service requests
· Work with various Atos and 3rd party teams to resolve incidents and implement changes
· Implement and drive automation and self-healing solutions to reduce toil
· Enhance error budgets through hands-on design and development of solutions that address reliability issues and/or risks
· Support ITSM processes and collaborate with service management representatives
Job Requirements
· Azure Associate certification or equivalent knowledge level
· 5+ years of professional experience
· Experience with Terraform and/or native Azure automation
· Knowledge of CI/CD concepts and toolset (e.g., Jenkins, Azure DevOps, Git)
· Must be adaptable to work in a varied, fast-paced, exciting, ever-changing environment
· Good analytical and problem-solving skills to resolve technical issues
· Understanding of Agile development and SCRUM concepts a plus
· Experience with Kubernetes architecture and tools a plus
- Cloud and virtualization-based technologies (Amazon Web Services (AWS), VMWare).
- Java Application Server Administration (WebLogic, WildFly, JBoss, Tomcat).
- Docker and Kubernetes (EKS).
- Linux/UNIX Administration (Amazon Linux and Red Hat).
- Developing and supporting cloud infrastructure designs and implementations and guiding application development teams.
- Configuration Management tools (Chef, Puppet, or Ansible).
- Log aggregation tools such as Elastic and/or Splunk.
- Automate infrastructure and application deployment-related tasks using Terraform.
- Automate repetitive tasks required to maintain a secure and up-to-date operational environment.
Responsibilities
- Build and support always-available private/public cloud-based software-as-a-service (SaaS) applications.
- Build AWS or other public cloud infrastructure using Terraform.
- Deploy and manage Kubernetes (EKS) based Docker applications in AWS.
- Create custom OS images using Packer.
- Create and revise infrastructure and architectural designs and implementation plans and guide the implementation with operations.
- Act as liaison between application development, infrastructure support, and tools (IT Services) teams.
- Develop and document Chef recipes and/or Ansible scripts. Provide support throughout the entire deployment lifecycle (development, quality assurance, and production).
- Help developers leverage infrastructure, application, and cloud platform features and functionality; participate in code and design reviews; and support developers by building CI/CD pipelines using Bamboo, Jenkins, or Spinnaker.
- Create knowledge-sharing presentations and documentation to help developers and operations teams understand and leverage the system's capabilities.
- Learn on the job and explore new technologies with little supervision.
- Leverage scripting (BASH, Perl, Ruby, Python) to build required automation and tools on an ad-hoc basis.
Who we have in mind:
- Solid experience in building a solution on AWS or other public cloud services using Terraform.
- Excellent problem-solving skills with a desire to take on responsibility.
- Extensive knowledge of containerized applications and their deployment on Kubernetes.
- Extensive knowledge of the Linux operating system, RHEL preferred.
- Proficiency with shell scripting.
- Experience with Java application servers.
- Experience with Git and Subversion.
- Excellent written and verbal communication skills with the ability to communicate technical issues to non-technical and technical audiences.
- Experience working in a large-scale operational environment.
- Internet and operating system security fundamentals.
- Extensive knowledge of massively scalable systems. Linux operating system/application development desirable.
- Programming in scripting languages such as Python. Other object-oriented languages (C++, Java) are a plus.
- Experience with Configuration Management Automation tools (Chef or Puppet).
- Experience with virtualization, preferably on multiple hypervisors.
- BS/MS in Computer Science or equivalent experience.
- Excellent written and verbal skills.
Education or Equivalent Experience:
- Bachelor's degree or equivalent education in related fields
- Certificates of training in associated fields/equipment
- Automate deployments of infrastructure components and repetitive tasks.
- Drive changes strictly via the infrastructure-as-code methodology.
- Promote the use of source control for all changes including application and system-level changes.
- Design and implement systems that recover automatically after failure events.
- Participate in system sizing and capacity planning of various components.
- Create and maintain technical documents such as installation/upgrade MOPs.
- Coordinate & collaborate with internal teams to facilitate installation & upgrades of systems.
- Support 24x7 availability for corporate sites & tools.
- Participate in rotating on-call schedules.
- Actively involved in researching, evaluating & selecting new tools & technologies.
- Cloud computing – AWS, OCI, OpenStack
- Automation/Configuration management tools such as Terraform & Chef
- Atlassian tools administration (JIRA, Confluence, Bamboo, Bitbucket)
- Scripting languages - Ruby, Python, Bash
- Systems administration experience – Linux (Red Hat), Mac, Windows
- SCM systems - Git
- Build tools - Maven, Gradle, Ant, Make
- Networking concepts - TCP/IP, Load balancing, Firewall
- High-Availability, Redundancy & Failover concepts
- SQL scripting & queries - DML, DDL, stored procedures
- Decisiveness and the ability to work under pressure
- Prioritizing workload and multi-tasking ability
- Excellent written and verbal communication skills
- Database systems – Postgres, Oracle, or other RDBMS
- Mac automation tools - JAMF or other
- Atlassian Datacenter products
- Project management skills
Qualifications
- 3+ years of hands-on experience in the field or related area
- Requires MS or BS in Computer Science or equivalent field
About the job
👉 TL;DR: We at Sarva Labs Inc. are looking for experienced Site Reliability Engineers to join our team. As a Protocol Developer, you will handle assets in data centers across Asia, Europe, and the Americas for the World’s First Context-Aware Peer-to-Peer Network enabling Web4.0. We are looking for someone who will take ownership of DevOps, establish proper deployment processes, work with engineering teams, and hustle through the Main Net launch.
About Us 🚀
Imagine if each user had their own chain with each transaction being settled by a dynamic group of nodes who come together and settle that interaction with near immediate finality without a volatile gas cost. That’s MOI for you, Anon.
Visit https://www.sarva.ai/ to know more about who we are as a company
Visit https://www.moi.technology/ to know more about the technology and team!
Visit https://www.moi-id.life/ , https://www.moibit.io/ , https://www.moiverse.io/ to know more
Read our developer documentation at https://apidocs.moinet.io/
What you'll do 🛠
- You will take over the ownership of DevOps, establish proper deployment processes and work with engineering teams to ensure an appropriate degree of automation for component assembly, deployment, and rollback strategies in medium to large scale environments
- Monitor components to proactively prevent system component failure, and inform the engineering team of system characteristics that require improvement
- You will ensure the uninterrupted operation of components through proactive resource management and activities such as security/OS/Storage/application upgrades
You'd fit in 💯 if you...
- Are familiar with any of these providers: AWS, GCP, DO, Azure, RedSwitches, Contabo, Hetzner, Server4you, Velia, Psychz, Tier and so on
- Have experience virtualizing bare-metal servers using OpenStack / VMware / similar (a PLUS)
- Are seasoned in building and managing VMs, containers and clusters across continents
- Are confident in making the best use of Docker and Kubernetes, including StatefulSet deployments, autoscaling, rolling updates, the UI dashboard, replication, persistent volumes, and ingress
- Have experience deploying in multi-cloud environments (a must)
- Have working knowledge of automation tools such as Terraform, Travis, Packer, Chef, etc.
- Have working knowledge of scalability in a distributed and decentralised environment
- Are familiar with Apache, Rancher, Nginx, SELinux/Ubuntu 18.04 LTS/CentOS 7 and RHEL
- Have used monitoring tools like PM2, Grafana and so on
- Are hands-on with the ELK stack or similar for log analytics
🌱 Join Us
- Flexible work timings
- We’ll set you up with your workspace. Work out of our Villa which has a lake view!
- Competitive salary/stipend
- Generous equity options (for full-time employees)
Your Role:
- Serve as the primary point of contact responsible for the overall health, performance, and capacity of one or more of our Internet-facing services
- Gain deep knowledge of our complex applications
- Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth
- Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment
- Work closely with development teams to ensure that platforms are designed with "operability" in mind.
- Function well in a fast-paced, rapidly-changing environment
- Should be able to lead a team of smart engineers
- Should be able to strategically guide the team to greater automation adoption
Must Have:
- Experience building/managing DevOps/SRE teams
- Strong in troubleshooting/debugging Systems, Network and Applications
- Strong in Unix/Linux operating systems and Networking
- Working knowledge of Open source technologies in Monitoring, Deployment and incident management
Good to Have:
- 3+ years of team management experience
- Experience in Containers and orchestration layers like Kubernetes, Mesos/Marathon
- Proven experience in programming & diagnostics in any languages like Go, Python, Java
- Experience in NoSQL/SQL technologies like Cassandra/MySQL/CouchBase etc.
- Experience in BigData technologies like Kafka/Hadoop/Airflow/Spark
- Is a die-hard sports fan











