
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with at least 7+ years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies ( AWS cloud provider preferred)
Bread puns are encouraged but not required

About Toast
About
Toast empowers restaurants of all sizes to build great teams, increase revenue, improve operations, and delight guests.
We are a NYSE-listed Boston-based public company. We are also series F funded and have raised 400M USD in the last round in 2020.
We pair our deep understanding of the restaurant industry with powerful cloud based software and restaurant-grade hardware to deliver an intuitive, all-in-one platform, across point of sale, guest marketing, digital ordering & delivery, and payroll & HR.
Tech stack
Company video


Candid answers by the company
Toast helps restaurants of all sizes streamline operations, boost revenue, enhance team management, and deliver exceptional guest experiences.
Similar jobs
Primary Skills:
Linux – Ubuntu Administration, Git, Gerrit, Jenkins Administration, Cloud services (Preferred AWS) Apache, Ansible, Python, Postgresql, Rabbit MQ, CloudWatch AWS, CFT in AWS
Additional Skills Required:
- Should have experience working with Jenkins, Git, Gerrit
- Should have Good understanding of AWS Security and execution.
- Should have Good python skills
- Should have experience of working with GIT, Gerrit, Jira, Confluence,
- Exposure to messaging systems Rabbit MQ
- Exposure to Html, Groovy, Javascript, shell scripting
- Exposure to Kibana, Provisioning, capacity planning and performance analysis at various levels
- Exposure to Android skills.
- Should have experience in working with cloud-native architecture.
- Experience with log stash and elastic search
- Expert in Full Stack design technique as well as experience working across large environments with multiple operating systems/infrastructure for large-scale programs
- May be recognized as a leader in Agile and cultivating teams working in Agile frameworks
- Strong understanding of techniques such as Continuous Integration, Continuous Delivery, Test Driven Development, Cloud Development, resiliency, security
- Stays abreast of cutting edge technologies/trends and uses experience to influence application of those technologies/trends to support the business
- Experience on Modelling and Provisioning cloud infrastructure using AWS CloudFormation
Key Responsibilities:
- Perform a Technical Lead role for DevOPs development and support teams.
- Need to communicate & coordinate with both offshore and onsite teams
- Should translate business requirements into project plans and workable item/activities
- Have a thorough understanding of software development lifecycle and the ability to implement software following the structured approach.
- Need to perform in-depth technical reviews of project deliverables and ensure it should be defect free (minimize post release defects).
- Understand the current applications and technical architecture and improvise them as needed.
- Stay abreast of new technologies, methods to optimize development process and latest SDKs, testing tools etc
About The Role:
The products/services of Eclat Engineering Pvt. Ltd. are being used by some of the leading institutions in India and abroad. Our services/Products are rapidly growing in demand. We are looking for a capable and dynamic Senior DevOps engineer to help setup, maintain and scale the infrastructure operations. This Individual will have the challenging responsibility of channelling our IT infrastructure and offering customer services with stringent international standard levels of service quality. This individual will leverage the latest IT tools to automate and streamline the delivery of our services while implementing industry-standard processes and knowledge management.
Roles & Responsibilities:
- Infrastructure and Deployment Automation: Design, implement, and maintain automation for infrastructure
provisioning and application deployment. Own the CI/CD pipelines and ensure they are efficient, reliable, and
scalable.
- System Monitoring and Performance: -Take ownership of monitoring systems and ensure the health and
performance of the infrastructure. Proactively identify and address performance bottlenecks and system issues.
- Cloud Infrastructure Management: Manage cloud infrastructure (e.g., AWS, Azure, GCP) and optimize resource
usage. Implement cost-saving measures while maintaining scalability and reliability.
- Configuration Management: Manage configuration management tools (e.g., Ansible, Puppet, Chef) to ensure
consistency across environments. Automate configuration changes and updates.
- Security and Compliance: Own security policies, implement best practices, and ensure compliance with industry
standards. Lead efforts to secure infrastructure and applications, including patch management and access controls.
- Collaboration with Development and Operations Teams: Foster collaboration between development and
operations teams, promoting a DevOps culture. Be the go-to person for resolving cross-functional infrastructure
issues and improving the development process.
- Disaster Recovery and Business Continuity: Develop and maintain disaster recovery plans and procedures. Ensure
business continuity in the event of system failures or other disruptions.
- Documentation and Knowledge Sharing: Create and maintain comprehensive documentation for configurations,
processes, and best practices. Share knowledge and mentor junior team members.
- Technical Leadership and Innovation: Stay up-to-date with industry trends and emerging technologies. Lead efforts
to introduce new tools and technologies that enhance DevOps practices.
- Problem Resolution and Troubleshooting: Be responsible for diagnosing and resolving complex issues related to
infrastructure and deployments. Implement preventive measures to reduce recurring problems.
Requirements:
● B.E / B.Tech / M.E / M.Tech / MCA / M.Sc.IT (if not should be able to demonstrate required skills)
● Overall 3+ years of experience in DevOps and Cloud operations specifically in AWS.
● Experience with Linux Administrator
● Experience with microservice architecture, containers, Kubernetes, and Helm is a must
● Experience in Configuration Management preferably Ansible
● Experience in Shell Scripting is a must
● Experience in developing and maintaining CI/CD processes using tools like Gitlab, Jenkins
● Experience in logging, monitoring and analytics
● An Understanding of writing Infrastructure as a Code using tools like Terraform
● Preferences - AWS, Kubernetes, Ansible
Must Have:
● Knowledge of AWS Cloud Platform.
● Good experience with microservice architecture, Kubernetes, helm and container-based technologies
● Hands-on experience with Ansible.
● Should have experience in working and maintaining CI/CD Processes.
● Hands-on experience in version control tools like GIT.
● Experience with monitoring tools such as Cloudwatch/Sysdig etc.
● Sound experience in administering Linux servers and Shell Scripting.
● Should have a good understanding of IT security and have the knowledge to secure production environments (OS and server software).
Job Overview:
You will work in engineering and development teams to integrate and develop cloud solutions and virtualized deployment of software as a service product. This will require understanding the software system architecture function as well as performance and security requirements. The DevOps Engineer is also expected to have expertise in available cloud solutions and services, administration of virtual machine clusters, performance tuning and configuration of cloud computing resources, the configuration of security, scripting and automation of monitoring functions. This position requires the deployment and management of multiple virtual clusters and working with compliance organizations to support security audits. The design and selection of cloud computing solutions that are reliable, robust, extensible, and easy to migrate are also important.
Experience:
- Experience working on billing and budgets for a GCP project - MUST
- Experience working on optimizations on GCP based on vendor recommendations - NICE TO HAVE
- Experience in implementing the recommendations on GCP
- Architect Certifications on GCP - MUST
- Excellent communication skills (both verbal & written) - MUST
- Excellent documentation skills on processes and steps and instructions- MUST
- At least 2 years of experience on GCP.
Basic Qualifications:
- Bachelor’s/Master’s Degree in Engineering OR Equivalent.
- Extensive scripting or programming experience (Shell Script, Python).
- Extensive experience working with CI/CD (e.g. Jenkins).
- Extensive experience working with GCP, Azure, or Cloud Foundry.
- Experience working with databases (PostgreSQL, elastic search).
- Must have 2 years of minimum experience with GCP certification.
Benefits :
- Competitive salary.
- Work from anywhere.
- Learning and gaining experience rapidly.
- Reimbursement for basic working set up at home.
- Insurance (including top-up insurance for COVID).
Location :
Remote - work from anywhere.
Ideal joining preferences:
Immediate or 15 days
Position: Senior DevOps Engineer (Azure Cloud Infra & Application deployments)
Location:
Hyderabad
Hiring a Senior
DevOps engineer having 2 to 5 years of experience.
Primary Responsibilities
Strong Programming experience in PowerShell and batch
scripts.
Strong expertise in Azure DevOps, GitLab, CI/CD, Jenkins and
Git Actions, Azure Infrastructure.
Strong
experience in configuring infrastructure and Deployments application in
Kubernetes, docker & Helm charts, App Services, Server less, SQL database, Cloud
services and Container deployment.
Continues
Integration, deployment, and version control (Git/ADO).
Strong experience in managing and configuring RBAC, managed identity and
security best practices for cloud environments.
Strong verbal and written communication skills.
Experience with agile development process.
·
Good analytical skills
Additional Responsibility.
Familiar with
various design and architecture patterns.
Work with modern frameworks and design patterns.
Experience with cloud applications Azure/AWS. Should have experience in developing solutions and plugins and should have used XRM Toolbox/ToolKit.
Exp in Customer Portal and Fetchxml and Power Apps and Power Automate is good to have.
About RaRa Delivery
Not just a delivery company…
RaRa Delivery is revolutionising instant delivery for e-commerce in Indonesia through data driven logistics.
RaRa Delivery is making instant and same-day deliveries scalable and cost-effective by leveraging a differentiated operating model and real-time optimisation technology. RaRa makes it possible for anyone, anywhere to get same day delivery in Indonesia. While others are focusing on ‘one-to-one’ deliveries, the company has developed proprietary, real-time batching tech to do ‘many-to-many’ deliveries within a few hours.. RaRa is already in partnership with some of the top eCommerce players in Indonesia like Blibli, Sayurbox, Kopi Kenangan and many more.
We are a distributed team with the company headquartered in Singapore 🇸🇬 , core operations in Indonesia 🇮🇩 and technology team based out of India 🇮🇳
Future of eCommerce Logistics.
- Datadriven logistics company that is bringing in same day delivery revolution in Indonesia 🇮🇩
- Revolutionising delivery as an experience
- Empowering D2C Sellers with logistics as the core technology
About the Role
- Build and maintain CI/CD tools and pipelines.
- Designing and managing highly scalable, reliable, and fault-tolerant infrastructure & networking that forms the backbone of distributed systems at RaRa Delivery.
- Continuously improve code quality, product execution, and customer delight.
- Communicate, collaborate and work effectively across distributed teams in a global environment.
- Operate to strengthen teams across their product with their knowledge base
- Contribute to improving team relatedness, and help build a culture of camaraderie.
- Continuously refactor applications to ensure high-quality design
- Pair with team members on functional and non-functional requirements and spread design philosophy and goals across the team
- Excellent bash, and scripting fundamentals and hands-on with scripting in programming languages such as Python, Ruby, Golang, etc.
- Good understanding of distributed system fundamentals and ability to troubleshoot issues in a larger distributed infrastructure
- Working knowledge of the TCP/IP stack, internet routing, and load balancing
- Basic understanding of cluster orchestrators and schedulers (Kubernetes)
- Deep knowledge of Linux as a production environment, container technologies. e.g. Docker, Infrastructure As Code such as Terraform, K8s administration at large scale.
- Have worked on production distributed systems and have an understanding of microservices architecture, RESTful services, CI/CD.
Azure DeVops
On premises to Azure Migration
Docker, Kubernetes
Terraform CI CD pipeline
9+ Location
Budget – BG, Hyderabad, Remote , Hybrid –
Budget – up to 30 LPA
- Design, Develop, deploy, and run operations of infrastructure services in the Acqueon AWS cloud environment
- Manage uptime of Infra & SaaS Application
- Implement application performance monitoring to ensure platform uptime and performance
- Building scripts for operational automation and incident response
- Handle schedule and processes surrounding cloud application deployment
- Define, measure, and meet key operational metrics including performance, incidents and chronic problems, capacity, and availability
- Lead the deployment, monitoring, maintenance, and support of operating systems (Windows, Linux)
- Build out lifecycle processes to mitigate risk and ensure platforms remain current, in accordance with industry standard methodologies
- Run incident resolution within the environment, facilitating teamwork with other departments as required
- Automate the deployment of new software to cloud environment in coordination with DevOps engineers
- Work closely with Presales, understand customer requirement to deploy in Production
- Lead and mentor a team of operations engineers
- Drive the strategy to evolve and modernize existing tools and processes to enable highly secure and scalable operations
- AWS infrastructure management, provisioning, cost management and planning
- Prepare RCA incident reports for internal and external customers
- Participate in product engineering meetings to ensure product features and patches comply with cloud deployment standards
- Troubleshoot and analyse performance issues and customer reported incidents working to restore services within the SLA
- Monthly SLA Performance reports
As a Cloud Operations Manager in Acqueon you will need….
- 8 years’ progressive experience managing IT infrastructure and global cloud environments such as AWS, GCP (must)
- 3-5 years management experience leading a Cloud Operations / Site Reliability / Production Engineering team working with globally distributed teams in a fast-paced environment
- 3-5 years’ experience in IAC (Terraform, K8)
- 3+ years end-to-end incident management experience
- Experience with communicating and presenting to all stakeholders
- Experience with Cloud Security compliance and audits
- Detail-oriented. The ideal candidate is one who naturally digs as deep as they need to understand the why
- Knowledge on GCP will be added advantage
- Manage and monitor customer instances for uptime and reliability
- Staff scheduling and planning to ensure 24x7x365 coverage for cloud operations
- Customer facing, excellent communication skills, team management, troubleshooting
Job Brief:
We are looking for candidates that have experience in development and have performed CI/CD based projects. Should have a good hands-on Jenkins Master-Slave architecture, used AWS native services like CodeCommit, CodeBuild, CodeDeploy and CodePipeline. Should have experience in setting up cross platform CI/CD pipelines which can be across different cloud platforms or on-premise and cloud platform.
Job Location:
Pune.
Job Description:
- Hands on with AWS (Amazon Web Services) Cloud with DevOps services and CloudFormation.
- Experience interacting with customer.
- Excellent communication.
- Hands-on in creating and managing Jenkins job, Groovy scripting.
- Experience in setting up Cloud Agnostic and Cloud Native CI/CD Pipelines.
- Experience in Maven.
- Experience in scripting languages like Bash, Powershell, Python.
- Experience in automation tools like Terraform, Ansible, Chef, Puppet.
- Excellent troubleshooting skills.
- Experience in Docker and Kuberneties with creating docker files.
- Hands on with version control systems like GitHub, Gitlab, TFS, BitBucket, etc.
At Karza technologies, we take pride in building one of the most comprehensive digital onboarding & due-diligence platforms by profiling millions of entities and trillions of associations amongst them using data collated from more than 700 publicly available government sources. Primarily in the B2B Fintech Enterprise space, we are headquartered in Mumbai in Lower Parel with 100+ strong workforce. We are truly furthering the cause of Digital India by providing the entire BFSI ecosystem with tech products and services that aid onboarding customers, automating processes and mitigating risks seamlessly, in real-time and at fraction of the current cost.
A few recognitions:
- Recognized as Top25 startups in India to work with 2019 by LinkedIn
- Winner of HDFC Bank's Digital Innovation Summit 2020
- Super Winners (Won every category) at Tecnoviti 2020 by Banking Frontiers
- Winner of Amazon AI Award 2019 for Fintech
- Winner of FinTech Spot Pitches at Fintegrate Zone 2018 held at BSE
- Winner of FinShare 2018 challenge held by ShareKhan
- Only startup in Yes Bank Global Fintech Accelerator to win the account during the Cohort
- 2nd place Citi India FinTech Challenge 2018 by Citibank
- Top 3 in Viacom18's Startup Engagement Programme VStEP
What your average day would look like:
- Deploy and maintain mission-critical information extraction, analysis, and management systems
- Manage low cost, scalable streaming data pipelines
- Provide direct and responsive support for urgent production issues
- Contribute ideas towards secure and reliable Cloud architecture
- Use open source technologies and tools to accomplish specific use cases encountered within the project
- Use coding languages or scripting methodologies to solve automation problems
- Collaborate with others on the project to brainstorm about the best way to tackle a complex infrastructure, security, or deployment problem
- Identify processes and practices to streamline development & deployment to minimize downtime and maximize turnaround time
What you need to work with us:
- Proficiency in at least one of the general-purpose programming languages like Python, Java, etc.
- Experience in managing the IAAS and PAAS components on popular public Cloud Service Providers like AWS, Azure, GCP etc.
- Proficiency in Unix Operating systems and comfortable with Networking concepts
- Experience with developing/deploying a scalable system
- Experience with the Distributed Database & Message Queues (like Cassandra, ElasticSearch, MongoDB, Kafka, etc.)
- Experience in managing Hadoop clusters
- Understanding of containers and have managed them in production using container orchestration services.
- Solid understanding of data structures and algorithms.
- Applied exposure to continuous delivery pipelines (CI/CD).
- Keen interest and proven track record in automation and cost optimization.
Experience:
- 1-4 years of relevant experience
- BE in Computer Science / Information Technology
Job Location: Jaipur
Experience Required: Minimum 3 years
About the role:
As a DevOps Engineer for Punchh, you will be working with our developers, SRE, and DevOps teams implementing our next generation infrastructure. We are looking for a self-motivated, responsible, team player who love designing systems that scale. Punchh provides a rich engineering environment where you can be creative, learn new technologies, solve engineering problems, all while delivering business objectives. The DevOps culture here is one with immense trust and responsibility. You will be given the opportunity to make an impact as there are no silos here.
Responsibilities:
- Deliver SLA and business objectives through whole lifecycle design of services through inception to implementation.
- Ensuring availability, performance, security, and scalability of AWS production systems
- Scale our systems and services through continuous integration, infrastructure as code, and gradual refactoring in an agile environment.
- Maintain services once a project is live by monitoring and measuring availability, latency, and overall system and application health.
- Write and maintain software that runs the infrastructure that powers the Loyalty and Data platform for some of the world’s largest brands.
- 24x7 in shifts on call for Level 2 and higher escalations
- Respond to incidents and write blameless RCA’s/postmortems
- Implement and practice proper security controls and processes
- Providing recommendations for architecture and process improvements.
- Definition and deployment of systems for metrics, logging, and monitoring on platform.
Must have:
- Minimum 3 Years of Experience in DevOps.
- BS degree in Computer Science, Mathematics, Engineering, or equivalent practical experience.
- Strong inter-personal skills.
- Must have experience in CI/CD tooling such as Jenkins, CircleCI, TravisCI
- Must have experience in Docker, Kubernetes, Amazon ECS or Mesos
- Experience in code development in at least one high-level programming language fromthis list: python, ruby, golang, groovy
- Proficient in shell scripting, and most importantly, know when to stop scripting and start developing.
- Experience in creation of highly automated infrastructures with any Configuration Management tools like: Terraform, Cloudformation or Ansible.
- In-depth knowledge of the Linux operating system and administration.
- Production experience with a major cloud provider such Amazon AWS.
- Knowledge of web server technologies such as Nginx or Apache.
- Knowledge of Redis, Memcache, or one of the many in-memory data stores.
- Experience with various load balancing technologies such as Amazon ALB/ELB, HA Proxy, F5.
- Comfortable with large-scale, highly-available distributed systems.
Good to have:
- Understanding of Web Standards (REST, SOAP APIs, OWASP, HTTP, TLS)
- Production experience with Hashicorp products such as Vault or Consul
- Expertise in designing, analyzing troubleshooting large-scale distributed systems.
- Experience in an PCI environment
- Experience with Big Data distributions from Cloudera, MapR, or Hortonworks
- Experience maintaining and scaling database applications
- Knowledge of fundamental systems engineering principles such as CAP Theorem, Concurrency Control, etc.
- Understanding of the network fundamentals: OSI, TCI/IP, topologies, etc.
- Understanding of Auditing of Infrastructure and help org. to control Infrastructure costs.
- Experience in Kafka, RabbitMQ or any messaging bus.










