
- Provide consultation and review all outgoing critical customer communications.
- Apply DevOps thinking to bring development and IT Ops processes, people, and tools together within the company in order to increase speed, efficiency, and quality.
- Perform architecture and security reviews for different projects, and work with leads to develop the strategy and roadmap for client requirements. Take part in designing the overall system architecture with other leads and architects.
- Develop and grow engineers in DevOps technology to meet the incoming requirements from the business team.
- Work with the senior technical team to bring in new technologies and tools for use within the company. Develop and promote best practices and emerging concepts for DevSecOps and secure CI/CD. Participate in solution strategy, innovation areas, and the technology roadmap.
Key Skills:
- Deals positively with high levels of uncertainty, ambiguity, and shifting priorities.
- Ability to influence stakeholders as a trusted advisor across all levels, including teams outside of shared services.
- Ability to think outside of the box and be innovative by keeping abreast of new trends, identifying opportunities to bring in change for business benefit.
- Implementing CI (Continuous Integration) and CD (Continuous Deployment), with good exposure to CI and build management tools like Jenkins, Azure DevOps, GitHub Actions, Maven, Gradle, etc.
- Deployment and provisioning tools (Chef, Ansible, Terraform, AWS CDK, etc.)
- Docker orchestration tools like Kubernetes, Swarm, etc.
- Good hands-on knowledge of automation scripting (Python, Shell, Ruby, etc.)
- Version control and source code management (SCM) tools: Git, Bitbucket, etc.
- Expertise in Linux-based systems (Unix, Linux, Ubuntu), including managing security and Linux file system permissions
- Container orchestration tools: Kubernetes, Swarm, Mesos, Marathon; Docker, including writing Dockerfiles and Docker Compose files
- Expertise in managing Cloud resources and good exposure to Docker
- Public/private/hybrid cloud: AWS, Microsoft Azure, Google Cloud Platform, etc.
- Extensive experience with cloud services, elastic capacity administration, and cloud deployment and migration.
- Good to have knowledge of tools like Splunk, New Relic, PagerDuty, VictorOps
- Familiarity with network protocols and elements: TCP/IP, HTTP(S), SSL, DNS, firewalls, routers, load balancers, proxies.
- Excellent at creating new workflows and improving existing ones within the agile software development lifecycle.
- Familiar with incident and change management processes.
- Ability to effectively prioritize work with fast-changing requirements.
- Troubleshoot and debug infrastructure, network, and operating system issues.
- Resolve complex issues in areas like resource consumption, server performance, backup strategy, and scaling.
- Investigate and perform Root Cause Analysis on users' reported issues and provide a workaround before implementing a final fix.
- Monitor servers and applications to ensure the smooth running of the IT architecture (applications, services, schedulers, server performance, etc.)
Design Skills:
- Interpret and implement the designs of others adhering to standards and guidelines
- Design solutions within their area of expertise using technologies that already exist within Tesco
- Understand the roadmaps for their area of Technology
- Design secure solutions
- Design solutions that can be consumed in a self-service manner by the engineering teams
- Understand the impact of technologies at an enterprise scale
Innovation:
- Demonstrate knowledge of the latest technology trends related to Infrastructure
- Understand how Industry trends impact their own area
- Identify opportunities to automate work and deliver against them

Roles & Responsibilities:
- Bachelor’s degree in Computer Science, Information Technology or a related field
- Experience in designing and maintaining high volume and scalable micro-services architecture on cloud infrastructure
- Knowledge in Linux/Unix Administration and Python/Shell Scripting
- Experience working with cloud platforms like AWS (EC2, ELB, S3, Auto-scaling, VPC, Lambda), GCP, Azure
- Knowledge of deployment automation, Continuous Integration and Continuous Deployment (Jenkins, Maven, Puppet, Chef, GitLab), and monitoring tools like Zabbix, CloudWatch, Nagios
- Knowledge of Java Virtual Machines, Apache Tomcat, Nginx, Apache Kafka, microservices architecture, caching mechanisms
- Experience in enterprise application development, maintenance and operations
- Knowledge of best practices and IT operations in an always-up, always-available service
- Excellent written and oral communication skills, judgment and decision-making skills
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience, with 7+ years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies (AWS preferred)
*Bread puns are encouraged but not required
Objectives:
- Building and setting up new development tools and infrastructure
- Working on ways to automate and improve development and release processes
- Testing code written by others and analyzing results
- Ensuring that systems are safe and secure against cybersecurity threats
- Identifying technical problems and developing software updates and ‘fixes’
- Working with software developers and software engineers to ensure that development follows established processes and works as intended
- Planning out projects and being involved in project management decisions
Daily and Monthly Responsibilities:
- Deploy updates and fixes
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Skills and Qualifications:
- Degree (BSc or equivalent) in Computer Science, Software Engineering, or a relevant field
- 3+ years of experience as a DevOps Engineer or similar software engineering role
- Proficient with Git and Git workflows
- Good logical skills and knowledge of programming concepts (OOP, data structures)
- Working knowledge of databases and SQL
- Problem-solving attitude
- Collaborative team spirit
About us:
HappyFox is a software-as-a-service (SaaS) support platform. We offer an enterprise-grade help desk ticketing system and intuitively designed live chat software.
We serve over 12,000 companies in 70+ countries. HappyFox is used by companies that span across education, media, e-commerce, retail, information technology, manufacturing, non-profit, government and many other verticals that have an internal or external support function.
To learn more, visit https://www.happyfox.com/
Responsibilities:
- Build and scale production infrastructure in AWS for the HappyFox platform and its products.
- Research, build, and implement systems, services, and tooling to improve the uptime, reliability, and maintainability of our backend infrastructure, and to meet our internal SLOs and customer-facing SLAs.
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts or building infrastructure tools using Python/Ruby/Bash/Golang
- Implement consistent observability, deployment and IaC setups
- Patch production systems to fix security/performance issues
- Actively respond to escalations/incidents in the production environment from customers or the support team
- Mentor other Infrastructure engineers, review their work and continuously ship improvements to production infrastructure.
- Build and manage development infrastructure, and CI/CD pipelines for our teams to ship & test code faster.
- Participate in infrastructure security audits
Requirements:
- At least 5 years of experience in handling/building Production environments in AWS.
- At least 2 years of programming experience in building API/backend services for customer-facing applications in production.
- Demonstrable knowledge of TCP/IP, HTTP and DNS fundamentals.
- Experience in deploying and managing production Python/NodeJS/Golang applications to AWS EC2, ECS or EKS.
- Proficient in containerised environments such as Docker, Docker Compose, Kubernetes
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts using any scripting language such as Python, Ruby, Bash, etc.
- Experience in setting up and managing test/staging environments, and CI/CD pipelines.
- Experience in IaC tools such as Terraform or AWS CDK
- Passion for making systems reliable, maintainable, scalable and secure.
- Excellent verbal and written communication skills to address, escalate and express technical ideas clearly
- Bonus points – if you have experience with Nginx, Postgres, Redis, and Mongo systems in production.
Azure DevOps
On-premises to Azure migration
Docker, Kubernetes
Terraform, CI/CD pipeline
9+
Location – BG, Hyderabad, Remote, Hybrid
Budget – up to 30 LPA
- Hands-on experience provisioning AWS services like EC2, S3, EBS, AMI, VPC, ELB, RDS, Auto Scaling groups, and CloudFormation.
- Good experience with build and release processes, and extensively involved in CI/CD using Jenkins.
- Experienced with configuration management tools like Ansible.
- Designing, implementing, and supporting fully automated Jenkins CI/CD.
- Extensively worked on Jenkins for continuous integration and for end-to-end automation of all builds and deployments.
- Proficient with Docker based container deployments to create shelf environments for dev teams and containerization of environment delivery for releases.
- Experience working on Docker hub, creating Docker images and handling multiple images primarily for middleware installations and domain configuration.
- Good knowledge of version control with Git and GitHub.
- Good experience with build tools.
- Implemented CI/CD pipelines using Jenkins, Ansible, Docker, Kubernetes, YAML, and manifests.
Job description
The role requires you to design development pipelines from the ground up, create Dockerfiles, and design and operate highly available systems in AWS Cloud environments. It also involves configuration management, web services architectures, DevOps implementation, database management, backups, and monitoring.
Key responsibility area
- Ensure reliable operation of CI/CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creation of Bash/Python scripts for automation
- Performing root cause analysis for production errors.
Requirement
- 2 years experience as Team Lead.
- Good command of Kubernetes.
- Proficient with the Linux command line and troubleshooting.
- Proficient in AWS services, including deploying, monitoring, and troubleshooting applications in AWS.
- Hands-on experience with CI tooling, preferably Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as Code) such as Terraform, AWS CloudFormation, etc.
- Proficient in deploying applications behind load balancers and proxy servers such as Nginx and Apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with logging, monitoring, and alerting tools like ELK (Elasticsearch, Logstash, Kibana) and Nagios. Graylog, Splunk, Prometheus, and Grafana are a plus.
Must Have:
Linux, CI/CD (Jenkins), AWS, Scripting (Bash, Shell, Python, Go), Nginx, Docker.
Good to have:
Configuration Management (Ansible or similar tool), Logging tool (ELK or similar), Monitoring tool (Nagios or similar), IaC (Terraform, CloudFormation).








