
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with at least 7+ years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies ( AWS cloud provider preferred)
Bread puns are encouraged but not required

About Toast
About
Toast empowers restaurants of all sizes to build great teams, increase revenue, improve operations, and delight guests.
We are a NYSE-listed Boston-based public company. We are also series F funded and have raised 400M USD in the last round in 2020.
We pair our deep understanding of the restaurant industry with powerful cloud based software and restaurant-grade hardware to deliver an intuitive, all-in-one platform, across point of sale, guest marketing, digital ordering & delivery, and payroll & HR.
Tech stack



Company video


Candid answers by the company
Toast helps restaurants of all sizes streamline operations, boost revenue, enhance team management, and deliver exceptional guest experiences.
Similar jobs
Role Overview:
As a DevOps Engineer (L2), you will play a key role in designing, implementing, and optimizing infrastructure. You will take ownership of automating processes, improving system reliability, and supporting the development lifecycle.
Key Responsibilities:
- Design and manage scalable, secure, and highly available cloud infrastructure.
- Lead efforts in implementing and optimizing CI/CD pipelines.
- Automate repetitive tasks and develop robust monitoring solutions.
- Ensure the security and compliance of systems, including IAM, VPCs, and network configurations.
- Troubleshoot complex issues across development, staging, and production environments.
- Mentor and guide L1 engineers on best practices.
- Stay updated on emerging DevOps tools and technologies.
- Manage cloud resources efficiently using Infrastructure as Code (IaC) tools like Terraform and AWS CloudFormation.
Qualifications:
- Bachelor’s degree in Computer Science, IT, or a related field.
- Proven experience with CI/CD pipelines and tools like Jenkins, GitLab, or Azure DevOps.
- Advanced knowledge of cloud platforms (AWS, Azure, or GCP) with hands-on experience in deployments, migrations, and optimizations.
- Strong expertise in containerization (Docker) and orchestration tools (Kubernetes).
- Proficiency in scripting languages like Python, Bash, or PowerShell.
- Deep understanding of system security, networking, and load balancing.
- Strong analytical skills and problem-solving mindset.
- Certifications (e.g., AWS Certified Solutions Architect, Kubernetes Administrator) are a plus.
What We Offer:
- Opportunity to work with a cutting-edge tech stack in a product-first company.
- Collaborative and growth-oriented environment.
- Competitive salary and benefits.
- Freedom to innovate and contribute to impactful projects.

Skills Required:
- Good experience with programming language Python
- Strong experience in Docker.
- Good knowledge with any of the Cloud Platform like Azure.
- Must be comfortable working in a Linux environment.
- Must have exposure into IOT domain and its protocols ((Zigbee & BLE ,LoRa,Modbus)
- Must be a good team player.
- Strong Communication Skills
- Seeking an Individual carrying around 5+ yrs of experience.
- Must have skills - Jenkins, Groovy, Ansible, Shell Scripting, Python, Linux Admin
- Terraform, AWS deep knowledge to automate and provision EC2, EBS, SQL Server, cost optimization, CI/CD pipeline using Jenkins, Server less automation is plus.
- Excellent writing and communication skills in English. Enjoy writing crisp and understandable documentation
- Comfortable programming in one or more scripting languages
- Enjoys tinkering with tooling. Find easier ways to handle systems by doing some research. Strong awareness around build vs buy.
We are global expert in cloud consulting and service management, focusing exclusively on the Cloud DevOps Space. In short, we strive to be at the forefront in this era of digital disruption by being dynamic, agile and cohesive in providing businesses the solutions needed to leverage it to the next level. Our expert team of Engineers, Programmers, Designers and Business development professionals are the foundations of our firm with the fusion of cutting-edge technology.Nimble IT Consulting is vested in Research and Analysis of Current and Upcoming trends, be it Technology, Business Values and User Experience, we dedicate our efforts tirelessly to be at the pinnacle of the Quality Standards. Devising solutions that are just not only being approved or followed by industry leaders in fact they depend on it. Read more about us: https://nimbleitconsulting.com/" target="_blank">https://nimbleitconsulting.com
What we are looking for
A DevOps Engineer to join our team and provide consulting services to our clients, below is the technology stack we are interested in
Technical skills
- Expertise in implementing and managing Devops CI/CD pipeline. ( either using Jenkins or Azure DevOps )
- At least one AWS or Azure Certification
- Terraform Scripting
- Hands-on experience with git and source code management and release management.
- Experience in DevOps automation tools. And Very well versed with DevOps principles and the Agile Frameworks.
- Working knowledge of scripting using shell, Python, Gradle, Yaml, Ansible or puppet or chef.
- Working knowledge of build systems for various technologies like npm, maven etc.
- Experience and good understanding in any of Cloud platforms like AWS, Azure or Google cloud.
- Hands on Knowledge of Docker and Kubernetes is required.
- Proficient in troubleshooting skills with proven abilities in resolving complex technical issues. Experience with working with ticketing tools (Jira & Service now)
- A programming language like Java, Go , NodeJS is a nice to have.
- Work Permit for United Kingdom ( tier 2 visa ) total duration of visa will be 5 years ( first 2 years and then 3 year extension)
- At the end of the 5 years you will be eligible for British Citizenship by applying for Indefinite leave to remain in the UK
- Learn new technologies - We won’t ever expect you to do the same thing day in day out; we want to
- give you the chance to explore the latest techniques to solve challenging technical problems and help
- you become the best developer you can be.
- Join a growing agile team that are consistently delivering.
- Technical Development Program
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the
basis of race, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
A.P.T Portfolio, a high frequency trading firm that specialises in Quantitative Trading & Investment Strategies.Founded in November 2009, it has been a major liquidity provider in global Stock markets.
As a manager, you would be incharge of managing the devops team and your remit shall include the following
- Private Cloud - Design & maintain a high performance and reliable network architecture to support HPC applications
- Scheduling Tool - Implement and maintain a HPC scheduling technology like Kubernetes, Hadoop YARN Mesos, HTCondor or Nomad for processing & scheduling analytical jobs. Implement controls which allow analytical jobs to seamlessly utilize ideal capacity on the private cloud.
- Security - Implementing best security practices and implementing data isolation policy between different divisions internally.
- Capacity Sizing - Monitor private cloud usage and share details with different teams. Plan capacity enhancements on a quarterly basis.
- Storage solution - Optimize storage solutions like NetApp, EMC, Quobyte for analytical jobs. Monitor their performance on a daily basis to identify issues early.
- NFS - Implement and optimize latest version of NFS for our use case.
- Public Cloud - Drive AWS/Google-Cloud utilization in the firm for increasing efficiency, improving collaboration and for reducing cost. Maintain the environment for our existing use cases. Further explore potential areas of using public cloud within the firm.
- BackUps - Identify and automate back up of all crucial data/binary/code etc in a secured manner at such duration warranted by the use case. Ensure that recovery from back-up is tested and seamless.
- Access Control - Maintain password less access control and improve security over time. Minimize failures for automated job due to unsuccessful logins.
- Operating System -Plan, test and roll out new operating system for all production, simulation and desktop environments. Work closely with developers to highlight new performance enhancements capabilities of new versions.
- Configuration management -Work closely with DevOps/ development team to freeze configurations/playbook for various teams & internal applications. Deploy and maintain standard tools such as Ansible, Puppet, chef etc for the same.
- Data Storage & Security Planning - Maintain a tight control of root access on various devices. Ensure root access is rolled back as soon the desired objective is achieved.
- Audit access logs on devices. Use third party tools to put in a monitoring mechanism for early detection of any suspicious activity.
- Maintaining all third party tools used for development and collaboration - This shall include maintaining a fault tolerant environment for GIT/Perforce, productivity tools such as Slack/Microsoft team, build tools like Jenkins/Bamboo etc
Qualifications
- Bachelors or Masters Level Degree, preferably in CSE/IT
- 10+ years of relevant experience in sys-admin function
- Must have strong knowledge of IT Infrastructure, Linux, Networking and grid.
- Must have strong grasp of automation & Data management tools.
- Efficient in scripting languages and python
Desirables
- Professional attitude, co-operative and mature approach to work, must be focused, structured and well considered, troubleshooting skills.
- Exhibit a high level of individual initiative and ownership, effectively collaborate with other team members.
APT Portfolio is an equal opportunity employer
At Neurosensum we are committed to make customer feedback more actionable. We have developed a platform called SurveySensum which breaks the conventional market research turnaround time.
SurveySensum is becoming a great tool to not only capture the feedbacks but also to extract some useful insights with the quick workflow setups and dashboards. We have more than 7 channels through which we can collect the feedbacks. This makes us challenge the conventional software development design principles. The team likes to grind and helps each other to lift in tough situations.
Day to day responsibilities include:
- Work on the deployment of code via Bitbucket, AWS CodeDeploy and manual
- Work on Linux/Unix OS and Multi tech application patching
- Manage, coordinate, and implement software upgrades, patches, and hotfixes on servers.
- Create and modify scripts or applications to perform tasks
- Provide input on ways to improve the stability, security, efficiency, and scalability of the environment
- Easing developers’ life so that they can focus on the business logic rather than deploying and maintaining it.
- Managing release of the sprint.
- Educating team of the best practices.
- Finding ways to avoid human error and save time by automating the processes using Terraform, CloudFormation, Bitbucket pipelines, CodeDeploy, scripting
- Implementing cost effective measure on cloud and minimizing existing costs.
Skills and prerequisites
- OOPS knowledge
- Problem solving nature
- Willing to do the R&D
- Works with the team and support their queries patiently
- Bringing new things on the table - staying updated
- Pushing solution above a problem.
- Willing to learn and experiment
- Techie at heart
- Git basics
- Basic AWS or any cloud platform – creating and managing ec2, lambdas, IAM, S3 etc
- Basic Linux handling
- Docker and orchestration (Great to have)
- Scripting – python (preferably)/bash
Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
Improvement of monitoring systems
- Improvement of monitoring systems
- Creation and support of development automation processes (CI / CD)
- Participation in infrastructure development
- Detection of problems in architecture and proposing of solutions for solving them
- Creation of tasks for system improvements for system scalability, performance and monitoring
- Analysis of product requirements in the aspect of devops
- Incident analysis and fixing
Skills and Experience
- Understanding of the distributed systems principles
- Understanding of principles for building a resistant network infrastructure
- Experience of Ubuntu Linux administration (Debian-like will be a plus)
- Strong knowledge of Bash
- Experience of working with LXC-containers
- Understanding and experience with infrastructure as a code approach
- Experience of development idempotent Ansible roles
- Experience of working with git
Preferred experience
Experience with relational databases (PostgeSQL), ability to create simple SQL queries
- Experience with monitoring and metric collect systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
- Knowledge and experience of working with network equipment Cisco
- Experience of working with Cisco NX-OS
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge of principles of multicast protocols IGMP, PIM
- Experience of setting multicast on Cisco equipment
- Experience administering Atlassian products
About the client :
Asia’s largest global sports media property in history with a global broadcast to 150+ countries. As the world’s largest martial arts organization, they are a celebration of Asia’s greatest cultural treasure, and its deep-rooted Asian values of integrity, humility, honor, respect, courage, discipline, and compassion. Has achieved some of the highest TV ratings and social media engagement metrics across Asia with its unique brand of Asian values, world-class athletes, and world-class production. Broadcast partners include Turner Sports, Star India, TV Tokyo, Fox Sports, ABS-CBN, Astro, ClaroSports, Bandsports, Startimes, Premier Sports, Thairath TV, Skynet, Mediacorp, OSN, and more. Institutional investors include Sequoia Capital, Temasek Holdings, GIC, Iconiq Capital, Greenoaks Capital, and Mission Holdings. Currently has offices in Singapore, Tokyo, Los Angeles, Shanghai, Milan, Beijing, Bangkok, Manila, Jakarta, and Bangalore.
Position : Devops Engineer – SDE3
As part of the engineering team, you would be expected to have deep technology expertise with a passion for building highly scalable products. This is a unique opportunity where you can impact the lives of people across 150+ countries!
Responsibilities
• Develop Collaborate in large-scale systems design discussions.
• Deploying and maintaining in-house/customer systems ensuring high availability, performance and optimal cost.
• Automate build pipelines. Ensuring right architecture for CI/CD
• Work with engineering leaders to ensure cloud security
• Develop standard operating procedures for various facets of Infrastructure services (CI/CD, Git Branching, SAST, Quality gates, Auto Scaling)
• Perform & automate regular backups of servers & databases. Ensure rollback and restore capabilities are Realtime and with zero-downtime.
• Lead the entire DevOps charter for ONE Championship. Mentor other DevOps engineers. Ensure industry standards are followed.
Requirements
• Overall 5+ years of experience in as DevOps Engineer/Site Reliability Engineer
• B.E/B.Tech in CS or equivalent streams from institute of repute
• Experience in Azure is a must. AWS experience is a plus
• Experience in Kubernetes, Docker, and containers
• Proficiency in developing and deploying fully automated environments using Puppet/Ansible and Terraform
• Experience with monitoring tools like Nagios/Icinga, Prometheus, AlertManager, Newrelic
• Good knowledge of source code control (git)
• Expertise in Continuous Integration and Continuous Deployment setup using Azure Pipeline or Jenkins
• Strong experience in programming languages. Python is preferred
• Experience in scripting and unit testing
• Basic knowledge of SQL & NoSQL databases
• Strong Linux fundamentals
• Experience in SonarQube, Locust & Browserstack is a plus
- Proficient in Java, Node or Python
- Experience with NewRelic, Splunk, SignalFx, DataDog etc.
- Monitoring and alerting experience
- Full stack development experience
- Hands-on with building and deploying micro services in Cloud (AWS/Azure)
- Experience with terraform w.r.t Infrastructure As Code
- Should have experience troubleshooting live production systems using monitoring/log analytics tools
- Should have experience leading a team (2 or more engineers)
- Experienced using Jenkins or similar deployment pipeline tools
- Understanding of distributed architectures
Opening for a Java Developer with Devops experience
Experience required: 5 yrs to 10 yrs
Essential Required Skills:
Familiarity with Version Control such as GitHub, BitBucket
- Java programmer(Liferay, Alfresco will add plus point)
- AWS
- OPs(ansible, apache, python, terraform)
- Effective communication skills
- An analytical bent of mind and problem-solving aptitude
- Good time management skills
- Curiosity for learning
- Patience
Roles & Responsibilities:
- Candidate with good hand on exposure on AWS, Cloud, Devops, Ansible, Docker, Jekins.
- Strong proficiency in Linux, Open Source, Web based and Cloud based environments (ability to use open source technologies and tools)
- Strong scripting and automation (bash, Perl, common Linux utils), strong grasp of automation tools a plus.
- Strong debugging skills (OS, scripting, Web based technologies), SQL, Java and Database concepts are a plus
- Apache, nginx, git, svn, GNU tools
- Must have exposure on Grep, awk, sed, Git, svn
- Scripting (bash, python)
- API related skills (REST, and any other like google, aws, atlassian)
- Web based technology
- Strong Unix Skills
- Java programmer, Coding (Springboot, Microservices, Liferay, Alfresco will add plus point)
- Proficient in AWS
- Ops (ansible, apache, python, terraform)
Benefits
- Cash Rewards & Recognition on Monthly Basis
- Work-Life Balance (Flexible Working Hours)
- Five-Day Work Week

