
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with at least 7+ years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies ( AWS cloud provider preferred)
Bread puns are encouraged but not required

About Toast
About
Toast empowers restaurants of all sizes to build great teams, increase revenue, improve operations, and delight guests.
We are a NYSE-listed Boston-based public company. We are also series F funded and have raised 400M USD in the last round in 2020.
We pair our deep understanding of the restaurant industry with powerful cloud based software and restaurant-grade hardware to deliver an intuitive, all-in-one platform, across point of sale, guest marketing, digital ordering & delivery, and payroll & HR.
Tech stack
Company video


Candid answers by the company
Toast helps restaurants of all sizes streamline operations, boost revenue, enhance team management, and deliver exceptional guest experiences.
Similar jobs
Required Skills
● Experience: Minimum of 5 years of professional experience in a DevOps Engineer
role
● Cloud Proficiency: Proven experience with at least one major cloud provider (AWS,
Azure, or GCP).
● Scripting & Programming: Strong scripting skills in languages such as Bash, Python,
or Go.
● IaC Tools: Hands-on experience with Terraform.
● Container Technology: Expertise in Docker and Kubernetes.
● CI/CD Tools: Proficient with CI/CD platforms like Jenkins, GitLab CI, or Travis CI.
● Configuration Management: Experience with configuration management tools like
Ansible, Chef, or Puppet.
● Version Control: Strong knowledge of Git and branching strategies.
● Problem-Solving:Excellent problem-solving abilities and a commitment to automation
and continuous improvement.
Job Description:
• Drive end-to-end automation from GitHub/GitLab/BitBucket to Deployment,
Observability and Enabling the SRE activities
• Guide operations support (setup, configuration, management, troubleshooting) of
digital platforms and applications
• Solid understanding of DevSecOps Workflows that support CI, CS, CD, CM, CT.
• Deploy, configure, and manage SaaS and PaaS cloud platform and applications
• Provide Level 1 (OS, patching) and Level 2 (app server instance troubleshooting)
• DevOps programming: writing scripts, building operations/server instance/app/DB
monitoring tools Set up / manage continuous build and dev project management
environment: JenkinX/GitHub Actions/Tekton, Git, Jira Designing secure networks,
systems, and application architectures
• Collaborating with cross-functional teams to ensure secure product development
• Disaster recovery, network forensics analysis, and pen-testing solutions
• Planning, researching, and developing security policies, standards, and procedures
• Awareness training of the workforce on information security standards, policies, and
best practices
• Installation and use of firewalls, data encryption and other security products and
procedures
• Maturity in understanding compliance, policy and cloud governance and ability to
identify and execute automation.
• At Wesco, we discuss more about solutions than problems. We celebrate innovation
and creativity.
Intuitive is the fastest growing top-tier Cloud Solutions and Services company supporting Global Enterprise Customer across Americas, Europe and Middle East.
Intuitive is looking for highly talented hands-on Cloud Infrastructure Architects to help accelerate our growing Professional Services consulting Cloud & DevOps practice. This is an excellent opportunity to join Intuitive’ s global world class technology teams, working with some of the best and brightest engineers while also developing your skills and furthering your career working with some of the largest customers.
Lead the pre-sales (25%) to post-sales (75%) efforts building Public/Hybrid Cloud solutions working collaboratively with Intuitive and client technical and business stakeholders
Be a customer advocate with obsession for excellence delivering measurable success for Intuitive’s customers with secure, scalable, highly available cloud architecture that leverage AWS Cloud services
Experience in analyzing customer's business and technical requirements, assessing existing environment for Cloud enablement, advising on Cloud models, technologies, and risk management strategies
Apply creative thinking/approach to determine technical solutions that further business goals and align with corporate technology strategies
Extensive experience building Well Architected solutions in-line with AWS cloud adoption framework (DevOps/DevSecOps, Database/Data Warehouse/Data Lake, App Modernization/Containers, Security, Governance, Risk, Compliance, Cost Management and Operational Excellence)
Experience with application discovery prefera bly with tools like Cloudscape, to discover application configurations , databases, filesystems, and application dependencies
Experience with Well Architected Review, Cloud Readiness Assessments and defining migration patterns (MRA/MRP) for application migration e.g. Re-host, Re-platform, Re-architect etc
Experience in architecting and deploying AWS Landing Zone architecture with CI/CD pipeline
Experience on architecture, design of AWS cloud services to address scalability, performance, HA, security, availability, compliance, backup and DR, automation, alerting and monitoring and cost
Hands-on experience in migrating applications to AWS leveraging proven tools and processes including migration, implementation, cutover and rollback plans and execution
Hands-on experience in deploying various AWS services e.g. EC2, S3, VPC, RDS, Security Groups etc. using either manual or IaC, IaC is preferred
Hands-on Experience in writing cloud automation scripts/code such as Ansible, Terraform, CloudFormation Template (AWS CFT) etc.
Hands-on Experience with application build/release processes CI/CD pipelines
Deep understanding of Agile processes (planning/stand-ups/retros etc), and interact with cross-functional teams i.e. Development, Infrastructure, Security, Performance Engineering, and QA
Additional Requirements:
Work with Technology leadership to grow the Cloud & DevOps practice. Create cloud practice collateral
Work directly with sales teams to improve and help them drive the sales for Cloud & DevOps practice
Assist Sales and Marketing team in creating sales and marketing collateral
Write whitepapers and technology blogs to be published on social media and Intuitive website
Create case studies for projects successfully executed by Intuitive delivery team
Conduct sales enablement sessions to coach sales team on new offerings
Flexibility with work hours supporting customer’s requirement and collaboration with global delivery teams
Flexibility with Travel as required for Pre-sales/Post-sales, Design workshops, War-room Migration events and customer meetings
Strong passion for modern technology exploration and development
Excellent written, verbal communication skills, presentation, and collaboration skills - Team leadership skills
Experience with Multi-cloud (Azure, GCP, OCI) is a big plus
Experience with VMware Cloud Foundation as well as Advanced Windows and Linux Engineering is a big plus
Experience with On-prem Data Engineering (Database, Data Warehouse, Data Lake) is a big plus
- Candidate should be able to write the sample programs using the Tools (Bash, PowerShell, Python or Shell scripting)
- Analytical/logical reasoning
- GitHub Actions
- Should have good working experience with GitHub Actions
- Repository/Workflow Dispatch, writing reusable workflows, etc
- AZ CLI commands
- Hands-on experience with AZ CLI commands
- Hands-on knowledge on various CI-CD tools (Jenkins/TeamCity, Artifactory, UCD, Bitbucket/Github, SonarQube) including setting up of build-deployment automated pipelines.
- Very good knowledge in scripting tools and languages such as Shell, Perl or Python , YAML/Groovy, build tools such as Maven/Gradle.
- Hands-on knowledge in containerization and orchestration tools such as Docker, OpenShift and Kubernetes.
- Good knowledge in configuration management tools such as Ansible, Puppet/Chef and have worked on setting up of monitoring tools (Splunk/Geneos/New Relic/Elk).
- Expertise in job schedulers/workload automation tools such as Control-M or AutoSys is good to have.
- Hands-on knowledge on Cloud technology (preferably GCP) including various computing services and infrastructure setup using Terraform.
- Should have basic understanding on networking, certificate management, Identity and Access Management and Information security/encryption concepts.
- • Should support day-to-day tasks related to platform and environments upkeep such as upgrades, patching, migration and system/interfaces integration.
- • Should have experience in working in Agile based SDLC delivery model, multi-task and support multiple systems/apps.
- • Big-data and Hadoop ecosystem knowledge is good to have but not mandatory.
- Should have worked on standard release, change and incident management tools such as ServiceNow/Remedy or similar
Hands on Experience with Linux administration
Experience using Python or Shell scripting (for Automation)
Hands-on experience with Implementation of CI/CD Processes
Experience working with one cloud platforms (AWS or Azure or Google)
Experience working with configuration management tools such as Ansible & Chef
Experience working with Containerization tool Docker.
Experience working with Container Orchestration tool Kubernetes.
Experience in source Control Management including SVN and/or Bitbucket
& GitHub
Experience with setup & management of monitoring tools like Nagios, Sensu & Prometheus or any other popular tools
Hands-on experience in Linux, Scripting Language & AWS is mandatory
Troubleshoot and Triage development, Production issues
As an Infrastructure Engineer at Navi, you will be building a resilient infrastructure platform, using modern Infrastructure engineering practices.
You will be responsible for the availability, scaling, security, performance and monitoring of the navi Cloud platform. You’ll be joining a team that follows best practices in infrastructure as code
Your Key Responsibilities
- Build out the Infrastructure components like API Gateway, Service Mesh, Service Discovery, container orchestration platform like kubernetes.
- Developing reusable Infrastructure code and testing frameworks
- Build meaningful abstractions to hide the complexities of provisioning modern infrastructure components
- Design a scalable Centralized Logging and Metrics platform
- Drive solutions to reduce Mean Time To Recovery(MTTR), enable High Availability.
What to Bring
- Good to have experience in managing large scale cloud infrastructure, preferable AWS and Kubernetes
- Experience in developing applications using programming languages like Java, Python and Go
- Experience in handling logs and metrics at a high scale.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
Requirements:-
- Bachelor’s Degree or Master’s in Computer Science, Engineering,Software Engineering or a relevant field.
- Strong experience with Windows/Linux-based infrastructures, Linux/Unix administration.
- knowledge of Jira, Bitbucket, Jenkins, Xray, Ansible, Windows and .Net. as their Core Skill.
- Strong experience with databases such as SQL, MS SQL, MySQL, NoSQL.
- Knowledge of scripting languages such as Shell Scripting /Python/ PHP/Groovy, Bash.
- Experience with project management and workflow tools such as Agile, Jira / WorkFront etc.
- Experience with open-source technologies and cloud services.
- Experience in working with Puppet or Chef for automation and configuration.
- Strong communication skills and ability to explain protocol and processes with team and management.
- Experience in a DevOps Engineer role (or similar role)
- AExperience in software development and infrastructure development is a plus
Job Specifications:-
- Building and maintaining tools, solutions and micro services associated with deployment and our operations platform, ensuring that all meet our customer service standards and reduce errors.
- Actively troubleshoot any issues that arise during testing and production, catching and solving issues before launch.
- Test our system integrity, implemented designs, application developments and other processes related to infrastructure, making improvements as needed
- Deploy product updates as required while implementing integrations when they arise.
- Automate our operational processes as needed, with accuracy and in compliance with our security requirements.
- Specifying, documenting and developing new product features, and writing automating scripts. Manage code deployments, fixes, updates and related processes.
- Work with open-source technologies as needed.
- Work with CI and CD tools, and source control such as GIT and SVN.
- Lead the team through development and operations.
- Offer technical support where needed, developing software for our back-end systems.
Minimum 4 years exp
Skillsets:
- Build automation/CI: Jenkins
- Secure repositories: Artifactory, Nexus
- Build technologies: Maven, Gradle
- Development Languages: Python, Java, C#, Node, Angular, React/Redux
- SCM systems: Git, Github, Bitbucket
- Code Quality: Fisheye, Crucible, SonarQube
- Configuration Management: Packer, Ansible, Puppet, Chef
- Deployment: uDeploy, XLDeploy
- Containerization: Kubernetes, Docker, PCF, OpenShift
- Automation frameworks: Selenium, TestNG, Robot
- Work Management: JAMA, Jira
- Strong problem solving skills, Good verbal and written communication skills
- Good knowledge of Linux environment: RedHat etc.
- Good in shell scripting
- Good to have Cloud Technology : AWS, GCP and Azure









