Senior Site Reliability Engineer
Posted by Sandeep Dhara
7 - 10 yrs
Best in industry
Remote, Bengaluru (Bangalore)
Skills
Docker
Kubernetes
DevOps
Amazon Web Services (AWS)
Windows Azure
Google Cloud Platform (GCP)
Ansible
Jenkins
Terraform

Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.


At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.


About this roll* (Responsibilities) 

  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Partner with development teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplift
  • Balance feature development speed and reliability with well-defined service level objectives


Troubleshooting and Supporting Escalations:

  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
  • Implement strategies to increase system reliability and performance through on-call rotation and process optimization
  • Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again


Do you have the right ingredients? (Requirements)


  • Extensive industry experience with at least 7 years in SRE and/or DevOps roles
  • Polyglot technologist/generalist with a thirst for learning
  • Deep understanding of cloud and microservice architecture and the JVM
  • Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
  • Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
  • Experience with cloud computing technologies (AWS cloud provider preferred)



*Bread puns are encouraged but not required

Users love Cutshort
Read about what our users have to say about finding their next opportunity on Cutshort.
Subodh Popalwar

Software Engineer, Memorres
For 2 years, I had trouble finding a company with good work culture and a role that will help me grow in my career. Soon after I started using Cutshort, I had access to information about the work culture, compensation and what each company was clearly offering.

About Toast

Founded: 2012
Stage: Profitable

Similar jobs

LogiNext
Posted by Rakhi Daga
Mumbai
11 - 15 yrs
₹1L - ₹15L / yr
Microservices
Linux/Unix
Python
Shell Scripting
Amazon Web Services (AWS)
+22 more

Only apply on this link - https://loginext.hire.trakstar.com/jobs/fk025uh?source=

LogiNext is looking for a technically savvy and passionate Associate Vice President - Product Engineering - DevOps or Senior Database Administrator to cater to the development and operations efforts in product. You will choose and deploy tools and technologies to build and support a robust infrastructure.

You have hands-on experience in building secure, high-performing and scalable infrastructure. You have experience automating and streamlining development operations and processes. You are a master at troubleshooting and resolving issues in dev, staging and production environments.

 

Responsibilities:

  • Design and implement scalable infrastructure for delivering and running web, mobile and big data applications on cloud
  • Scale and optimise a variety of SQL and NoSQL databases (especially MongoDB), web servers, application frameworks, caches, and distributed messaging systems
  • Automate the deployment and configuration of the virtualized infrastructure and the entire software stack
  • Plan, implement and maintain robust backup and restoration policies ensuring low RTO and RPO
  • Support several Linux servers running our SaaS platform stack on AWS, Azure, IBM Cloud, Ali Cloud
  • Define and build processes to identify performance bottlenecks and scaling pitfalls
  • Manage robust monitoring and alerting infrastructure 
  • Explore new tools to improve development operations to automate daily tasks
  • Ensure High Availability and Auto-failover with minimum or no manual interventions


Requirements:

  • Bachelor’s degree in Computer Science, Information Technology or a related field
  • 11 to 14 years of experience in designing and maintaining high volume and scalable micro-services architecture on cloud infrastructure
  • Strong background in Linux/Unix Administration and Python/Shell Scripting
  • Extensive experience working with cloud platforms like AWS (EC2, ELB, S3, Auto-scaling, VPC, Lambda), GCP, Azure
  • Experience in deployment automation, Continuous Integration and Continuous Deployment (Jenkins, Maven, Puppet, Chef, GitLab) and monitoring tools like Zabbix, Cloud Watch Monitoring, Nagios
  • Knowledge of Java Virtual Machines, Apache Tomcat, Nginx, Apache Kafka, Microservices architecture, Caching mechanisms
  • Experience in query analysis, performance tuning, and database redesign
  • Experience in enterprise application development, maintenance and operations
  • Knowledge of best practices and IT operations in an always-up, always-available service
  • Excellent written and oral communication skills, judgment and decision-making skills.
  • Excellent leadership skills.
Creating the Data observability space.
Bengaluru (Bangalore)
8 - 20 yrs
₹40L - ₹70L / yr
Amazon Web Services (AWS)
Snowflake schema
Google Cloud Platform (GCP)
Microsoft Windows Azure
● Architect and build cloud environments with a focus on AWS, including the design of production, staging, QA, and development cloud infrastructures running in 24×7 environments.
● Most of our deployments are in K8s. You will work with the team to run and manage multiple K8s environments 24/7.
● Implement and oversee all aspects of the cloud environment including provisioning, scale, monitoring, and security.
● Nurture cloud computing expertise internally and externally to drive cloud adoption.
● Implement systems, solutions, and processes needed to manage cloud cost, monitoring, scalability, and redundancy.
● Ensure all cloud solutions adhere to security and compliance best practices.
● Collaborate with Enterprise Architecture, Data Platform, DevOps, and Integration Teams to ensure cloud adoption follows standard best practices.
Requirements:
● Bachelor’s degree in Computer Science, Computer Engineering or Information Technology, or equivalent experience.
● Experience with Kubernetes on cloud and deployment technologies such as Helm is a major plus.
● Expert-level hands-on experience with AWS (Azure and GCP experience are a big plus).
● 10 or more years of experience.
● Minimum of 5 years’ experience building and supporting cloud solutions.
codersbrain
Posted by Tanuj Uppal
Remote only
5 - 14 yrs
Best in industry
DevOps
Kubernetes
Docker
Windows Azure
Google Cloud Platform (GCP)
+2 more
Title: Azure DevOps

Location: Remote

 

Job Description :

  • Strong hands-on knowledge of Azure DevOps.
  • Mandatory skills: Azure DevOps, Docker, Kubernetes
  • Skills required: Terraform, Git, Jenkins, CI/CD, pipelines, YAML, scripting, shell scripting, Python, Gradle, Maven
  • Only profiles with developer experience are required; admin-role profiles are not.
Mobile Programming LLC
Posted by Sukhdeep Singh
Bengaluru (Bangalore)
8 - 11 yrs
₹10L - ₹15L / yr
Docker
Kubernetes
DevOps
Amazon Web Services (AWS)
Windows Azure
+1 more


Role Introduction

• This role involves guiding the DevOps team towards successful delivery of governance and toolchain initiatives by removing manual tasks.
• Operate toolchain applications to empower engineering teams by providing reliable, governed self-service tools and supporting their adoption.
• Drive good practice for consumption and utilisation of the engineering toolchain, with a focus on DevOps practices.
• Drive good governance for cloud service consumption.
• Work in a collaborative environment, with a focus on leading the team and providing technical leadership to team members.
• Set up processes and improvements for teams supporting and governing various DevOps tooling.
• Coordinate with multiple teams within the organization.
• Lead on handovers from architecture teams to support major project rollouts which require the Toolchain Governance DevOps team to operationally support tooling.

What you will do

• Identify and implement best practices, process improvements, and automation initiatives for quicker delivery by removing manual tasks.
• Ensure best practices and processes are documented for reusability, and keep up to date on good practices and standards.
• Reusable automation and compliance services, tools, and processes.
• Support and manage the toolchain, toolchain changes, and tool selection.
• Identify and implement risk mitigation plans, avoid escalations, and resolve blockers for teams. Toolchain governance will involve operating and responding to alerts, enforcing good tooling governance by driving automation, remediating technical debt, and ensuring the latest tools are utilised and on the latest versions.
• Triage product pipelines, performance issues, SLA/SLO breaches, and service unavailability, along with ancillary actions such as providing access to logs, tools, and environments.
• Be involved in initial and detailed estimates during roadmap planning, or in feature estimation and planning of any automation identified for a given toolset.
• Develop, refine, and tune integrations between various tools.
• Discuss with the Product Owner and team any challenges from an implementation or deployment perspective, assist in arriving at a probable solution, and escalate any risks to get them resolved with respect to the DevOps toolchain.
• In consultation with the Head of DevOps and other stakeholders, prioritize items and break items down into tasks; accountable for squad deliverables for the sprint.
• Be involved in reviewing current components, plan for upgrades, and ensure this is communicated to a wider audience within the organization.
• Be involved in reviewing access and roles, and enhance and automate provisioning.
• Identify and encourage areas for growth and improvement within the team, e.g. conduct regular 1-2-1s with squad members to provide support, mentoring, and goal setting.
• Be involved in performance management, rewards and recognition of team members, and the hiring process.
• Plan for upskilling the team so members know the tools and can perform tasks. Ensure quicker onboarding of new joiners/freshers so they become productive.
• Review ticket metrics to measure the health of the project, including SLAs, and plan for improvement.
• On-call is required for critical incidents that happen out of hours, based on tooling SLA. This may include planning the standby schedule for the squad, carrying out a retrospective for every callout, and reviewing SLIs/SLOs.
• Own the tech/repair debt, risk, and compliance for the tooling with respect to infrastructure, pipelines, access, etc.
• Track optimum utilization of resources and monitor/track the delivery schedule.
• Review solution designs with the Architects / Principal DevOps Engineers as required.
• Provide monthly reporting which aligns to DevOps tooling KPIs.

What you will have

• 8+ years of experience, including hands-on DevOps experience and experience in team management.
• Strong communication and interpersonal skills; team player.
• Good working experience of CI/CD tools like Jenkins, SonarQube, FOSSA, Harness, Jira, JSM, ServiceNow, etc.
• Good hands-on knowledge of AWS services like EC2, ECS, S3, IAM, SNS, SQS, VPC, Lambda, API Gateway, CloudWatch, CloudFormation, etc.
• Experience in operating and governing a DevOps toolchain.
• Experience in operational monitoring and alerting, and in identifying and delivering on both repair and technical debt.
• Experience and background in ITIL/ITSM processes. The candidate will ensure development of the appropriate ITSM model and processes, based on the ITIL Service Management framework. This includes the strategic, design, transition, and operation services and continuous service improvement.
• Provide ITSM leadership and process coaching.
• Experience with various tools like Jenkins, Harness, and FOSSA.
• Experience of hosting and managing applications on AWS/Azure.
• Experience in CI/CD pipelines (Jenkins build pipelines).
• Experience in containerization (Docker/Kubernetes).
• Experience in any programming language (Node.js or Python is preferred).
• Experience in architecting and supporting cloud-based products will be a plus.
• Experience in PowerShell and Bash will be a plus.
• Able to self-manage multiple concurrent small projects, including managing priorities between projects.
• Able to quickly learn new tools.
• Should be able to mentor and drive junior team members to achieve the desired outcome of the roadmap.
• Ability to analyse information to identify problems and issues, and make effective decisions within a short span of time.
• Excellent problem-solving and critical thinking skills.
• Experience in integrating various components, including unit testing and CI/CD configuration.
• Experience reviewing the current toolset and planning for upgrades.
• Experience with Agile frameworks and the Jira/JSM tools.
• Good communication skills and the ability to communicate and work independently with external teams.
• Highly motivated, able to work proficiently both independently and in a team environment.

Good knowledge and experience with security constructs –


Chennai
5 - 8 yrs
₹5L - ₹20L / yr
Ansible
Docker
DevOps
Kubernetes
Amazon Web Services (AWS)
+1 more
Requirements
  • Bachelor's degree in information security, computer science, or a related field.
  • Strong DevOps experience of at least 4 years
  • Strong experience in Unix/Linux/Python scripting
  • Strong networking knowledge; vSphere networking stack knowledge desired.
  • Experience with Docker and Kubernetes
  • Experience with cloud technologies (AWS/Azure)
  • Exposure to Continuous Development Tools such as Jenkins or Spinnaker
  • Exposure to configuration management systems such as Ansible
  • Knowledge of resource monitoring systems
  • Ability to scope and estimate
  • Strong verbal and communication skills
  • Advanced knowledge of Docker and Kubernetes.
  • Exposure to Blockchain as a Service (BaaS) platforms such as Chainstack, IBM Blockchain Platform, Oracle Blockchain Cloud, Rubix, VMware, etc.
  • Capable of provisioning and maintaining local enterprise blockchain platforms for Development and QA (Hyperledger Fabric/BaaS/Corda/ETH).
About Navis
Intuitive Technology Partners
Posted by Aakriti Gupta
Remote, Ahmedabad, Pune, Gurugram, Chennai, Bengaluru (Bangalore), India
6 - 12 yrs
Best in industry
DevOps
Kubernetes
Docker
Terraform
Linux/Unix
+10 more

Intuitive is the fastest growing top-tier Cloud Solutions and Services company supporting global enterprise customers across the Americas, Europe, and the Middle East.

Intuitive is looking for highly talented hands-on Cloud Infrastructure Architects to help accelerate our growing Professional Services consulting Cloud & DevOps practice. This is an excellent opportunity to join Intuitive’s global world class technology teams, working with some of the best and brightest engineers while also developing your skills and furthering your career working with some of the largest customers.

Job Description :

  • Extensive experience with K8s (EKS/GKE) and K8s ecosystem tooling, e.g., Prometheus, ArgoCD, Grafana, Istio, etc.
  • Extensive AWS/GCP Core Infrastructure skills
  • Infrastructure/ IAC Automation, Integration - Terraform
  • Kubernetes resources engineering and management
  • Experience with DevOps tools, CICD pipelines and release management
  • Good at creating documentation (runbooks, design documents, implementation plans)

Linux Experience :

  1. Namespace
  2. Virtualization
  3. Containers

 

Networking Experience

  1. Virtual networking
  2. Overlay networks
  3. VXLANs, GRE

 

Kubernetes Experience :

Should have experience in bringing up a Kubernetes cluster manually, without using the kubeadm tool.

 

Observability :

Experience in observability is a plus

 

Cloud automation :

Familiarity with cloud platforms (specifically AWS) and DevOps tools like Jenkins, Terraform, etc.

 

Planet Spark
Posted by Maneesh Dhooper
NCR (Delhi | Gurgaon | Noida)
1 - 5 yrs
₹4L - ₹12L / yr
DevOps
Docker
Chef
Terraform
Amazon Web Services (AWS)

The AWS Cloud/DevOps Engineer will work with the engineering team, focusing on AWS infrastructure and automation. A key part of the role is championing and leading infrastructure as code. The Engineer will work closely with the Manager of Operations and DevOps to build, manage, and automate our AWS infrastructure.

Duties & Responsibilities:

  • Design cloud infrastructure that is secure, scalable, and highly available on AWS
  • Work collaboratively with software engineering to define infrastructure and deployment requirements
  • Provision, configure and maintain AWS cloud infrastructure defined as code
  • Ensure configuration and compliance with configuration management tools
  • Administer and troubleshoot Linux based systems
  • Troubleshoot problems across a wide array of services and functional areas
  • Build and maintain operational tools for deployment, monitoring, and analysis of AWS infrastructure and systems
  • Perform infrastructure cost analysis and optimization

Qualifications:

  • 1-5 years of experience building and maintaining AWS infrastructure (VPC, EC2, Security Groups, IAM, ECS, CodeDeploy, CloudFront, S3)
  • Strong understanding of how to secure AWS environments and meet compliance requirements
  • Expertise using Chef for configuration management
  • Hands-on experience deploying and managing infrastructure with Terraform
  • Solid foundation of networking and Linux administration
  • Experience with CI/CD, Docker, GitLab, Jenkins, ELK and deploying applications on AWS
  • Ability to learn/use a wide variety of open source technologies and tools
  • Strong bias for action and ownership
Agiletech Info Solutions pvt ltd
Chennai
5 - 8 yrs
₹5L - ₹15L / yr
Docker
Kubernetes
DevOps
Amazon Web Services (AWS)
Windows Azure
+1 more

DevOps Engineer

Job Description:

 

The position requires a broad set of technical and interpersonal skills, including deployment technologies, monitoring, and scripting, from networking to infrastructure. The candidate should be well versed in troubleshooting production issues and able to drive them through to RCA.

 

Skills:

 

  • Manage VMs across multiple datacenters and AWS to support dev/test and production workloads.
  • Strong hands-on experience with Ansible is preferred.
  • Strong knowledge and hands-on experience in Kubernetes Architecture and administration.
  • Should have core knowledge in Linux and System operations.
  • Proactively and reactively resolve incidents as escalated from monitoring solutions and end users.
  • Conduct and automate audits for network and systems infrastructure.
  • Do software deployments, per documented processes, with no impact to customers.
  • Follow existing devops processes while having flexibility to create and tweak processes to gain efficiency.
  • Troubleshoot connectivity problems across network, systems or applications.
  • Follow security guidelines, both policy and technical to protect our customers.
  • Ability to automate recurring tasks to increase velocity and quality.
  • Should have worked with at least one of these databases: Postgres, Mongo, Cockroach, Cassandra.
  • Should have knowledge and hands-on experience in managing ELK clusters.
  • Scripting knowledge in Shell/Python is an added advantage.
  • Hands-on experience with K8s-based microservice architecture is an added advantage.
Quark Software
Posted by Tarun M
Hyderabad
3 - 4 yrs
₹3L - ₹5L / yr
Docker
Kubernetes
DevOps
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Roles & Responsibilities
Implementing various development, testing, automation tools, and IT infrastructure
Planning the team structure, activities, and involvement in project management activities.
Managing stakeholders and external interfaces
Setting up tools and required infrastructure
Defining and setting development, test, release, update, and support processes for DevOps operation
Have the technical skill to review, verify, and validate the software code developed in the project.
Troubleshooting and fixing code bugs
Monitoring processes across the entire lifecycle for adherence, and updating or creating processes to improve them and minimize waste
Encouraging and building automated processes wherever possible
Identifying and deploying cybersecurity measures by continuously performing vulnerability assessment and risk management
Incident management and root cause analysis
Coordination and communication within the team and with customers
Selecting and deploying appropriate CI/CD tools
Striving for continuous improvement and building a continuous integration, continuous delivery, and continuous deployment pipeline (CI/CD pipeline)
Mentoring and guiding the team members
Monitoring and measuring customer experience and KPIs
Managing periodic reporting on the progress to the management and the customer
Bengaluru (Bangalore)
9 - 10 yrs
₹15L - ₹20L / yr
Amazon Web Services (AWS)
Public Cloud
VM
Managing Operations
Cloud Native
  • The candidate has to perform architectural analysis and should know how to design enterprise-level systems.
  • The candidate should know how to design and simulate tools for the reliable delivery of systems.
  • The candidate should know how to design, develop, and maintain systems, processes, and procedures to deliver a high-quality service design.
  • The candidate has to work with other team members and other departments to establish healthy communication and information flow.
  • The candidate should know how to deliver a high-performing solution architecture that can support the development efforts of a business.
  • The candidate has to plan, design, and configure the most typical business solutions as needed.
  • The candidate has to prepare technical documents and other presentations for multiple solution areas.
  • The candidate has to ensure that best practices for configuration management are carried out as needed.
  • The candidate has to work on customer specifications, analyze them, and make the best product recommendations associated with the platform.

Requirements

  • AWS Solution Architect with 9-10 years of experience
  • Responsible for managing applications on public cloud (AWS) infrastructure.
  • Responsible for larger migrations of applications from VM to cloud/cloud-native.
  • Responsible for setting up monitoring for cloud/cloud-native-based infrastructure and applications.
  • MUST: AWS Solution Architect Professional certification.

 

Why apply to jobs via Cutshort
Personalized job matches
Stop wasting time. Get matched with jobs that meet your skills, aspirations and preferences.
Verified hiring teams
See actual hiring teams, find common social connections or connect with them directly. No 3rd party agencies here.
Move faster with AI
We use AI to get you faster responses, recommendations and unmatched user experience.
Matches delivered: 21,01,133
Network size: 37,12,187
Companies hiring: 15,000