
Site Reliability Engineer
at a client company in the computer software industry. (YB1)
- Design, develop, test, debug and maintain components of the cloud infrastructure
- Manage operational priorities of the DBaaS
- Establish a process for handling new security vulnerabilities and leading the response to them
- Lead certification efforts from the security perspective
- Participate in penetration testing efforts
- Design and build DBaaS processes for key management, rotation, storage, encryption, and password management
Requirements:
- Strong software design and implementation skills in building infrastructure frameworks
- Experience building and operating extensible, scalable, resilient data systems
- Working knowledge of Java and Python
- Experience using public cloud infrastructure (AWS, GCP, or Azure)
- Containerization tooling (Docker, EKS, Kubernetes)
- Infrastructure as Code Tooling (e.g., Terraform, CloudFormation, etc.)
- Configuration Management Tooling (Ansible, Chef, etc.)
- Automation Scripting (Python preferred)
- Solid understanding of basic systems operations (disk, network, etc.)
- Willingness and ability to learn new languages and concepts
- 5+ years of relevant experience

Similar jobs
Principal DevOps Engineer
Note - Screening Requirement: This position requires 8+ years of hands-on experience architecting DevOps/SRE/Platform Engineering solutions from scratch, specializing in AWS, EKS/Kubernetes, Terraform, Python, Jenkins, and AI workflows (mandatory skills).
We are looking for an absolute builder who has a proven track record of personally architecting, designing and setting up production-grade AWS EKS clusters entirely from scratch. If your experience is primarily limited to managing, maintaining, or optimizing pre-existing environments that were already stood up by another team, this is not the right opportunity for you.
Who are we?
Securin is an AI-driven cybersecurity company focused on proactive, adversarial exposure and vulnerability management. Our mission is to help organizations reduce cyber risk by identifying, prioritising, and remediating the issues that matter most. Powered by a seasoned team of threat researchers and status as a CVE Numbering Authority (CNA), Securin combines artificial intelligence / machine learning, threat intelligence, and deep vulnerability research (including the Dark Web) to deliver an adversarial approach to cyber defense. We help enterprises shift from reactive patching to strategic, risk-based exposure and vulnerability management – driving smarter security decisions and faster remediation.
What do we promise?
We are a highly effective tech-enabled cybersecurity solutions provider and promise continual security posture improvement, enhanced attack surface visibility, and proactive prioritized remediation for every one of our client businesses.
What do we provide?
● A chance to be on the leading edge of cybersecurity and AI
● Ability to have direct impact on company growth and revenue strategy
● An opportunity to mentor and be mentored by experts in multiple disciplines
What do we deliver?
Securin helps organizations to identify and remediate the most dangerous exposures, vulnerabilities, and risks in their environment. We deliver predictive and definitive intelligence and facilitate proactive remediation to help organizations stay a step ahead of attackers. By utilising our cybersecurity solutions, our clients can have a proactive and holistic view of their security posture and protect their assets from even the most advanced and dynamic attacks.
Securin has been recognized by national and international organizations for its role in accelerating innovation in offensive and proactive security. Our combination of domain expertise, cutting-edge technology, and advanced tech-enabled cybersecurity solutions has made Securin a leader in the industry.
Core Technology Stack
AWS, EKS / Kubernetes, Jenkins, Python, Terraform, CI/CD, AI
Key Responsibilities:
● Architect and manage the end-to-end SaaS platform infrastructure on AWS, including EKS cluster design, VPC networking, IAM, and multi-region availability.
● Build, maintain, and optimize Jenkins-based CI/CD pipelines and develop Python automation scripts for provisioning, deployments, and runbook automation.
● Define and enforce platform SLOs/SLAs; own the observability strategy across logging, metrics, and tracing.
● Manage and participate in the on-call rotation; act as escalation point for P1/P2 incidents and drive post-incident reviews.
● Drive Infrastructure-as-Code (IaC) practices with Terraform/CloudFormation and champion a culture of automation and operational excellence.
● Collaborate cross-functionally with product, security, and engineering teams to align infrastructure roadmap with business goals.
● Identify opportunities to leverage AI to automate operational and DevOps workflows.
● Design and implement AI-assisted solutions for incident triaging, root cause analysis, log analysis, and performance optimization.
● Drive the adoption of AI-powered tools for infrastructure management, deployment automation, monitoring, and troubleshooting.
● Build intelligent workflows that reduce manual effort in release management, capacity planning, and operational support.
● Integrate AI capabilities into CI/CD pipelines to improve code quality, deployment reliability, and operational efficiency.
● Collaborate with engineering teams to automate repetitive tasks and improve developer productivity.
● Define best practices and governance for the safe and effective use of AI across DevOps processes.
● Measure and report on productivity gains, operational improvements, and cost savings achieved through AI adoption.
Requirements:
● 8+ years of experience in DevOps, SRE, or cloud infrastructure engineering roles.
● Deep hands-on expertise with AWS services (EC2, EKS, RDS, S3, IAM, VPC, CloudFront, Route53, Lambda, etc).
● Strong Kubernetes experience: cluster management, Helm, autoscaling (HPA/KEDA).
● Proficiency with Jenkins for complex CI/CD pipeline design and maintenance.
● Solid Python scripting skills for automation, tooling, and infrastructure management tasks.
● Experience with Infrastructure-as-Code using Terraform and/or AWS CloudFormation.
● Proven track record of architecting and managing end-to-end SaaS products in a cloud-native environment.
● Strong understanding of networking fundamentals, security best practices, and compliance frameworks (SOC 2, ISO 27001 a plus).
● Hands-on experience with on-call processes and incident management frameworks.
Preferred Qualifications:
● AWS certifications: Solutions Architect Professional, DevOps Engineer Professional, or equivalent.
● Familiarity with service mesh, secrets management (Vault, AWS Secrets Manager), and zero-trust security models.
● Experience with multi-tenant SaaS architectures and tenant isolation strategies.
● Knowledge of FinOps principles and AWS cost management tooling.
● Experience with database DevOps: RDS, Aurora schema migrations, and backup strategies.
Core Competencies:
● Strategic Thinking – ability to translate business goals into scalable technical architecture.
● Operational Excellence – strong bias for reliability, automation, and continuous improvement.
● Communication – ability to clearly articulate complex technical topics to non-technical stakeholders.
● Ownership Mindset – proactively identifies and resolves risks without waiting to be asked.
● Resilience Under Pressure – calm and decisive during incidents; leads by example in high-stress situations.
Why should we connect?
We are a bunch of passionate cybersecurity professionals who are building a culture of security. Today, cybersecurity is no longer a luxury but a necessity, with a global market value of $150 billion.
At Securin, we live by a people-first approach. We firmly believe that our employees should enjoy what they do. For our employees, we provide a hybrid work environment with competitive best-in-industry pay, while providing them with an environment to learn, thrive, and grow. Our hybrid working environment allows employees to work from the comfort of their homes or the office if they choose to. For the right candidate, this will feel like your second home.
If you are as passionate about cybersecurity as we are, we would love to connect and share ideas.
- Strong hands-on experience in Microsoft Azure Cloud.
- Good understanding of Azure services such as Compute, Storage, Event Hub, Event Subscription, Storage Queue, and PaaS services.
- Basic understanding of Azure AI Foundry and AI-related Azure service setup.
- Good Azure networking basics: VNet, subnet, routing, and basic troubleshooting.
- Strong knowledge of Terraform, especially:
  - Terraform state
  - plan / apply
  - troubleshooting failures
  - migration risks
  - Terraform Enterprise concepts
- Strong Python coding capability, not just basic scripting.
- Experience using Python for API integration, automation, JSON/YAML handling, and internal tooling.
- Good understanding of CI/CD pipelines.
- Ability to troubleshoot pipeline failures.
- Comfortable with YAML and JSON.
- Ability to troubleshoot Azure infrastructure/platform issues.
- Ability to collect logs/evidence and coordinate with network/app/Microsoft support teams.
- Basic awareness of agentic AI / LLM concepts.
- Awareness of security and cost best practices.
Good to Have Skills
- Hands-on experience with Harness.
- Hands-on experience with Terraform Enterprise.
- Exposure to LangGraph / LangChain.
- Exposure to agentic AI workflows or skill creation.
- Exposure to Claude or enterprise LLM integrations.
- Knowledge of Azure ML Workspace, model registry, and managed endpoints.
- MLOps / LLMOps knowledge.
- FinOps / Azure cost optimization experience.
- Azure certifications: AZ-104, AZ-305, AZ-400, AZ-500.
Role: DevOps Engineer
Experience: 7+ Years
Location: Pune / Trivandrum
Work Mode: Hybrid
Key Responsibilities:
- Drive CI/CD pipelines for microservices and cloud architectures
- Design and operate cloud-native platforms (AWS/Azure)
- Manage Kubernetes/OpenShift clusters and containerized applications
- Develop automated pipelines and infrastructure scripts
- Collaborate with cross-functional teams on DevOps best practices
- Mentor development teams on continuous delivery and reliability
- Handle incident management, troubleshooting, and root cause analysis
Mandatory Skills:
- 7+ years in DevOps/SRE roles
- Strong experience with AWS or Azure
- Hands-on with Docker, Kubernetes, and/or OpenShift
- Proficiency in Jenkins, Git, Maven, JIRA
- Strong scripting skills (Shell, Python, Perl, Ruby, JavaScript)
- Solid networking knowledge and troubleshooting skills
- Excellent communication and collaboration abilities
Preferred Skills:
- Experience with Helm and monitoring tools (Splunk, Grafana, New Relic, Datadog)
- Knowledge of Microservices and SOA architectures
- Familiarity with database technologies

Job Title: DevOps - 3
Roles and Responsibilities:
- Develop deep understanding of the end-to-end configurations, dependencies, customer requirements, and overall characteristics of the production services as the accountable owner for overall service operations
- Implement best practices, challenge the status quo, and keep tabs on industry and technical trends, changes, and developments to ensure the team is always striving for best-in-class work
- Lead incident response efforts, working closely with cross-functional teams to resolve issues quickly and minimize downtime. Implement effective incident management processes and post-incident reviews
- Participate in on-call rotation responsibilities, ensuring timely identification and resolution of infrastructure issues
- Possess expertise in designing and implementing capacity plans, accurately estimating costs and efforts for infrastructure needs.
- Maintain and own systems and infrastructure for production environments, with a continued focus on improving efficiency, availability, and supportability through automation and well-defined runbooks
- Provide mentorship and guidance to a team of DevOps engineers, fostering a collaborative and high-performing work environment. Mentor team members in best practices, technologies, and methodologies.
- Design for Reliability - Architect and implement solutions that keep infrastructure running with always-on availability and ensure a high uptime SLA for the infrastructure
- Manage individual project priorities, deadlines, and deliverables related to your technical expertise and assigned domains
- Collaborate with Product & Information Security teams to ensure the integrity and security of Infrastructure and applications. Implement security best practices and compliance standards.
Must Haves
- 5-8 years of experience as a DevOps / SRE / Platform Engineer.
- Strong expertise in automating Infrastructure provisioning and configuration using tools like Ansible, Packer, Terraform, Docker, Helm Charts etc.
- Strong skills in network services such as DNS, TLS/SSL, HTTP, etc.
- Expertise in managing large-scale cloud infrastructure (preferably AWS and Oracle)
- Expertise in managing production grade Kubernetes clusters
- Experience in scripting using programming languages like Bash, Python, etc.
- Expertise in centralized logging systems, metrics, and tooling frameworks such as ELK, Prometheus/VictoriaMetrics, Grafana, etc.
- Experience in building and managing high-scale API gateways, service meshes, etc.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
- Have a working knowledge of a backend programming language
- Deep knowledge of and experience with Unix / Linux operating system internals (e.g., filesystems, user management, etc.)
- A working knowledge and deep understanding of cloud security concepts
- Proven track record of driving results and delivering high-quality solutions in a fast-paced environment
- Demonstrated ability to communicate clearly with both technical and non-technical project stakeholders, with the ability to work effectively in a cross-functional team environment.
- Working with Ruby, Python, Perl, and Java
- Troubleshooting and having working knowledge of various tools, open-source technologies, and cloud services
- Configuring and managing databases and cache layers such as MySQL, Mongo, Elasticsearch, Redis
- Setting up and optimising databases (sharding, replication, shell scripting, etc.)
- User creation, domain handling, service handling, backup management, port management, SSL services
- Planning, testing, and developing IT infrastructure (server configuration and databases) and handling technical issues related to server, Docker, and VM optimization
- Demonstrating awareness of DB management, server-related work, and Elasticsearch
- Selecting and deploying appropriate CI/CD tools
- Striving for continuous improvement and building continuous integration, continuous delivery, and continuous deployment pipelines (CI/CD pipelines)
- Experience working on Linux-based infrastructure
- Awareness of critical concepts in DevOps and Agile principles
- 6-8 years of experience
This company is a network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their cloud offering and capitalize on the cloud. We have our own IoT/AI platform, and we provide professional services on that platform to build custom clouds for clients' IoT devices. We also build mobile apps and run 24x7 DevOps/site reliability engineering for our clients.
We are looking for very hands-on SRE (Site Reliability Engineering) engineers with 3 to 6 years of experience. The person will be part of a team that is responsible for designing and implementing automation from scratch for medium to large scale cloud infrastructure and providing 24x7 services to our North American / European customers. This also includes ensuring ~100% uptime for 50+ internal sites. The person is expected to deliver with both high speed and high quality, and to work 40 hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ years of hands-on experience troubleshooting and setting up Linux environments, with the ability to write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ years of hands-on experience setting up CI/CD from SCRATCH in Jenkins & GitLab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Minimum 3 years of experience as an SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2 or Build & Deploy.
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality, and to work 40 hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We don't believe in locking in people with long notice periods. You will stay here because you love the company. We have only a 15-day notice period.
- 2+ years of demonstrable experience leading site reliability and performance in large-scale, high-traffic environments
- 2+ years of hands-on experience as a DevOps engineer
- Strong leadership, communication and interpersonal skills geared to getting things done
- Developing themselves and the talent within their charge – fostering and creating opportunity for the team
- Strong understanding of SRE concepts and the DevOps culture. Set the direction and strategy for your team, and help shape the overall SRE program for the company
- Be able to lead the resolution of complicated technical issues and communicate status updates/RCA to management and customers.
- Own site stability, performance, capacity planning, and DevOps recruitment.









