- Design, Develop, deploy, and run operations of infrastructure services in the Acqueon AWS cloud environment
- Manage uptime of Infra & SaaS Application
- Implement application performance monitoring to ensure platform uptime and performance
- Building scripts for operational automation and incident response
- Handle schedule and processes surrounding cloud application deployment
- Define, measure, and meet key operational metrics including performance, incidents and chronic problems, capacity, and availability
- Lead the deployment, monitoring, maintenance, and support of operating systems (Windows, Linux)
- Build out lifecycle processes to mitigate risk and ensure platforms remain current, in accordance with industry standard methodologies
- Run incident resolution within the environment, facilitating teamwork with other departments as required
- Automate the deployment of new software to cloud environment in coordination with DevOps engineers
- Work closely with Presales, understand customer requirement to deploy in Production
- Lead and mentor a team of operations engineers
- Drive the strategy to evolve and modernize existing tools and processes to enable highly secure and scalable operations
- AWS infrastructure management, provisioning, cost management and planning
- Prepare RCA incident reports for internal and external customers
- Participate in product engineering meetings to ensure product features and patches comply with cloud deployment standards
- Troubleshoot and analyse performance issues and customer reported incidents working to restore services within the SLA
- Monthly SLA Performance reports
As a Cloud Operations Manager in Acqueon you will need….
- 8 years’ progressive experience managing IT infrastructure and global cloud environments such as AWS, GCP (must)
- 3-5 years management experience leading a Cloud Operations / Site Reliability / Production Engineering team working with globally distributed teams in a fast-paced environment
- 3-5 years’ experience in IAC (Terraform, K8)
- 3+ years end-to-end incident management experience
- Experience with communicating and presenting to all stakeholders
- Experience with Cloud Security compliance and audits
- Detail-oriented. The ideal candidate is one who naturally digs as deep as they need to understand the why
- Knowledge on GCP will be added advantage
- Manage and monitor customer instances for uptime and reliability
- Staff scheduling and planning to ensure 24x7x365 coverage for cloud operations
- Customer facing, excellent communication skills, team management, troubleshooting

Similar jobs
Job Title : Senior DevOps Engineer
Location : Remote
Experience Level : 5+ Years
Role Overview :
We are a funded AI startup seeking a Senior DevOps Engineer to design, implement, and maintain a secure, scalable, and efficient infrastructure. In this role, you will focus on automating operations, optimizing deployment processes, and enabling engineering teams to deliver high-quality products seamlessly.
Key Responsibilities:
Infrastructure Scalability & Reliability :
- Architect and manage cloud infrastructure on AWS, GCP, or Azure for high availability, reliability, and cost-efficiency.
- Implement container orchestration using Kubernetes or Docker Compose.
- Utilize Infrastructure as Code (IaC) tools like Pulumi or Terraform to manage and configure infrastructure.
Deployment Automation :
- Design and maintain CI/CD pipelines using GitHub Actions, Jenkins, or similar tools.
- Implement deployment strategies such as canary or blue-green deployments, and create rollback mechanisms to ensure seamless updates.
Monitoring & Observability :
- Leverage tools like OpenTelemetry, Grafana, and Datadog to monitor system health and performance.
- Establish centralized logging systems and create real-time dashboards for actionable insights.
Security & Compliance :
- Securely manage secrets using tools like HashiCorp Vault or Doppler.
- Conduct static code analysis with tools such as SonarQube or Snyk to ensure compliance with security standards.
Collaboration & Team Enablement :
- Mentor and guide team members on DevOps best practices and workflows.
- Document infrastructure setups, incident runbooks, and troubleshooting workflows to enhance team efficiency.
Required Skills :
- Expertise in managing cloud platforms like AWS, GCP, or Azure.
- In-depth knowledge of Kubernetes, Docker, and IaC tools like Terraform or Pulumi.
- Advanced scripting capabilities in Python or Bash.
- Proficiency in CI/CD tools such as GitHub Actions, Jenkins, or similar.
- Experience with observability tools like Grafana, OpenTelemetry, and Datadog.
- Strong troubleshooting skills for debugging production systems and optimizing performance.
Preferred Qualifications :
- Experience in scaling AI or ML-based applications.
- Familiarity with distributed systems and microservices architecture.
- Understanding of agile methodologies and DevSecOps practices.
- Certifications in AWS, Azure, or Kubernetes.
What We Offer :
- Opportunity to work in a fast-paced AI startup environment.
- Flexible remote work culture.
- Competitive salary and equity options.
- Professional growth through challenging projects and learning opportunities.

Roles & Responsibilities:
- Bachelor’s degree in Computer Science, Information Technology or a related field
- Experience in designing and maintaining high volume and scalable micro-services architecture on cloud infrastructure
- Knowledge in Linux/Unix Administration and Python/Shell Scripting
- Experience working with cloud platforms like AWS (EC2, ELB, S3, Auto-scaling, VPC, Lambda), GCP, Azure
- Knowledge in deployment automation, Continuous Integration and Continuous Deployment (Jenkins, Maven, Puppet, Chef, GitLab) and monitoring tools like Zabbix, Cloud Watch Monitoring, Nagios Knowledge of Java Virtual Machines, Apache Tomcat, Nginx, Apache Kafka, Microservices architecture, Caching mechanisms
- Experience in enterprise application development, maintenance and operations
- Knowledge of best practices and IT operations in an always-up, always-available service
- Excellent written and oral communication skills, judgment and decision-making skills
Must have | Proficient exp of minimum 4 years into DevOps with at least one devops end to end project implementation. Strong expertise on DevOps concepts like Continuous Integration (CI), Continuous delivery (CD) and Infrastructure as Code, Cloud deployments. Minimum exp of 2.5-3 years of Configuration, development and deployment with their underlying technologies including Docker/Kubernetes and Prometheus. Should have implemented an end to end devops pipeline using Jenkins or any similar framework. Experience with Microservices architecture. Sould have sound knowledge in branching and merging strategies. Experience working with cloud computing technologies like Oracle Cloud *(preferred) /GCP/AWS/OpenStack Strong experience in AWS/Azure/GCP/open stack , deployment process, dockerization. Good experience in release management tools like JIRA or similar tools. |
Good to have | Knowledge of Infra automation tools Terraform/CHEF/ANSIBLE (Preferred) Experience in test automation tools like selenium/cucumber/postman Good communication skills to present devops solutions to the client and drive the implementation. Experience in creating and managing custom operational and monitoring scripts. Good knowledge in source control tools like Subversion, Git,bitbucket, clearcase. Experience in system architecture design |

-
Job Title - DevOps Engineer
-
Reports Into - Lead DevOps Engineer
-
Location - India
A Little Bit about Kwalee….
Kwalee is one of the world’s leading multiplatform game developers and publishers, with well over 900 million downloads worldwide for mobile hits such as Draw It, Teacher Simulator, Let’s Be Cops 3D, Airport Security and Makeover Studio 3D. We also have a growing PC and Console team of incredible pedigree that is on the hunt for great new titles to join TENS!, Eternal Hope, Die by the Blade and Scathe.
What’s In It For You?
-
Hybrid working - 3 days in the office, 2 days remote/ WFH is the norm
-
Flexible working hours - we trust you to choose how and when you work best
-
Profit sharing scheme - we win, you win
-
Private medical cover - delivered through BUPA
-
Life Assurance - for long term peace of mind
-
On site gym - take care of yourself
-
Relocation support - available
-
Quarterly Team Building days - we’ve done Paintballing, Go Karting & even Robot Wars
-
Pitch and make your own games on https://www.kwalee.com/blog/inside-kwalee/what-are-creative-wednesdays/">Creative Wednesdays!
Are You Up To The Challenge?
As a DevOps Engineer you have a passion for automation, security and building reliable expandable systems. You develop scripts and tools to automate deployment tasks and monitor critical aspects of the operation, resolve engineering problems and incidents. Collaborate with architects and developers to help create platforms for the future.
Your Team Mates
The DevOps team works closely with game developers, front-end and back-end server developers making, updating and monitoring application stacks in the cloud.Each team member has specific responsibilities with their own projects to manage and bring their own ideas to how the projects should work. Everyone strives for the most efficient, secure and automated delivery of application code and supporting infrastructure.
What Does The Job Actually Involve?
-
Find ways to automate tasks and monitoring systems to continuously improve our systems.
-
Develop scripts and tools to make our infrastructure resilient and efficient.
-
Understand our applications and services and keep them running smoothly.
Your Hard Skills
-
Minimum 1 years of experience on a dev ops engineering role
-
Deep experience with Linux and Unix systems
-
Networking basics knowledge (named, nginx, etc)
-
Some coding experience (Python, Ruby, Perl, etc.)
-
Experience with common automation tools (Ex. Chef, Terraform, etc)
-
AWS experience is a plus
-
A creative mindset motivated by challenges and constantly striving for the best
Your Soft Skills
Kwalee has grown fast in recent years but we’re very much a family of colleagues. We welcome people of all ages, races, colours, beliefs, sexual orientations, genders and circumstances, and all we ask is that you collaborate, work hard, ask questions and have fun with your team and colleagues.
We don’t like egos or arrogance and we love playing games and celebrating success together. If that sounds like you, then please apply.
A Little More About Kwalee
Founded in 2011 by David Darling CBE, a key architect of the UK games industry who previously co-founded and led Codemasters, our team also includes legends such as Andrew Graham (creator of Micro Machines series) and Jason Falcus (programmer of classics including NBA Jam) alongside a growing and diverse team of global gaming experts.
Everyone contributes creatively to Kwalee’s success, with all employees eligible to pitch their own game ideas on Creative Wednesdays, and we’re proud to have built our success on this inclusive principle.
We have an amazing team of experts collaborating daily between our studios in Leamington Spa, Lisbon, Bangalore and Beijing, or on a remote basis from Turkey, Brazil, Cyprus, the Philippines and many more places around the world. We’ve recently acquired our first external studio, TicTales, which is based in France.
We have a truly global team making games for a global audience, and it’s paying off: - Kwalee has been voted the Best Large Studio and Best Leadership Team at the TIGA Awards (Independent Game Developers’ Association) and our games have been downloaded in every country on earth - including Antarctica!
Key Responsibilities:
- Work with the development team to plan, execute and monitor deployments
- Capacity planning for product deployments
- Adopt best practices for deployment and monitoring systems
- Ensure the SLAs for performance, up time are met
- Constantly monitor systems, suggest changes to improve performance and decrease costs.
- Ensure the highest standards of security
Key Competencies (Functional):
- Proficiency in coding in atleast one scripting language - bash, Python, etc
- Has personally managed a fleet of servers (> 15)
- Understand different environments production, deployment and staging
- Worked in micro service / Service oriented architecture systems
- Has worked with automated deployment systems – Ansible / Chef / Puppet.
- Can write MySQL queries
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with at least 7+ years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies ( AWS cloud provider preferred)
Bread puns are encouraged but not required
Looking for an experienced candidate with strong development and programming experience, knowledge preferred-
- Cloud computing (i.e. Kubernetes, AWS, Google Cloud, Azure)
- Coming from a strong development background and has programming experience with Java and/or NodeJS (other programming languages such as Groovy/python are a big bonus)
- Proficient with Unix systems and bash
- Proficient with git/GitHub/GitLab/bitbucket
Desired skills-
- Docker
- Kubernetes
- Jenkins
- Experience in any scripting language (Phyton, Shell Scripting, Java Script)
- NGINX / Load Balancer
- Splunk / ETL tools
The DevOps Engineer's core responsibilities include automated configuration and management
of infrastructure, continuous integration and delivery of distributed systems at scale in a Hybrid
environment.
Must-Have:
● You have 4-10 years of experience in DevOps
● You have experience in managing IT infrastructure at scale
● You have experience in automation of deployment of distributed systems and in
infrastructure provisioning at scale.
● You have in-depth hands-on experience on Linux and Linux-based systems, Linux
scripting
● You have experience in Server hardware, Networking, firewalls
● You have experience in source code management, configuration management,
continuous integration, continuous testing, continuous monitoring
● You have experience with CI/CD and related tools
* You have experience with Monitoring tools like ELK, Grafana, Prometheus
● You have experience with containerization, container orchestration, management
● Have a penchant for solving complex and interesting problems.
● Worked in startup-like environments with high levels of ownership and commitment.
● BTech, MTech or Ph.D. in Computer Science or related Technical Discipline

Skills required:
Strong knowledge and experience of cloud infrastructure (AWS, Azure or GCP), systems, network design, and cloud migration projects.
Strong knowledge and understanding of CI/CD processes tools (Jenkins/Azure DevOps) is a must.
Strong knowledge and understanding of Docker & Kubernetes is a must.
Strong knowledge of Python, along with one more language (Shell, Groovy, or Java).
Strong prior experience using automation tools like Ansible, Terraform.
Architect systems, infrastructure & platforms using Cloud Services.
Strong communication skills. Should have demonstrated the ability to collaborate across teams and organizations.
Benefits of working with OpsTree Solutions:
Opportunity to work on the latest cutting edge tools/technologies in DevOps
Knowledge focused work culture
Collaboration with very enthusiastic DevOps experts
High growth trajectory
Opportunity to work with big shots in the IT industry


