
Cloud DevOps Architect
· Practices self-leadership and promotes learning in others by building relationships with cross- functional stakeholders; communicating information and providing advice to drive projects forward; adapting to competing demands and new responsibilities; providing feedback to others; mentoring junior team members; creating and executing plans to capitalize on strengths and improve opportunity areas; and adapting to and learning from change, difficulties, and feedback.
· Ensure appropriate translation of business requirements and functional specifications into physical program designs, code modules, stable application systems, and software solutions by partnering with Business Analysts and other team members to understand business needs and functional specifications.
· Build use cases/scenarios and reference architectures to enable rapid adoption of cloud services
in Product’s cloud journey.
· Provide insight into recommendations for technical solutions that meet design and functional needs.
· Experience or familiarity with Firewall/NGFW deployed in a variety of form factors (Checkpoint, Imperva, Palo Alto, Azure Firewall).
· Establish credibility & build deep relationships with senior technical individuals to enable them to be cloud advocates.
· Participate in deep architectural discussions to build confidence and ensure engineering success when building new and migrating existing applications, software. and services to AWS and GCP.
· Conduct deep-dive hands-on education/training sessions to transfer knowledge to DevOps and engineering teams considering or already using public cloud services.
· Be a cloud (Amazon Web Services, Google Cloud Platform) and DevOps evangelist and advise the stakeholders on cloud readiness, workload identification, migration and identifying the right multi cloud mix to effectively accomplish business objectives.
· Understands engineering requirements and architect scalable solutions adopting DevOps and leveraging advanced technologies such as AWS CodePipelines, AWS Code-Commit, ECS containers, API Gateway, CloudFormation Templates, AWS Kinesis, Splunk, Dome9, AWS-SQS, AWS-SNS, SonarCube, Microservices, and Kubernetes to realize stronger benefits and future proof outcomes for customer-facing applications.
· Build use cases/scenarios and reference architectures to enable rapid adoption of cloud services in product’s cloud journey.
· Be an integral part of the technology and architecture community in the public cloud partners (AWS, GCP, Azure) and bring in new services launched by cloud providers into 8K Miles Product Platform scope.
· Capture and share best-practice knowledge amongst the DevOps and Cloud community.
· Act as a technical liaison between product management, service engineering, and support teams.
· Qualification:
o Master’s Degree in Computer Science/Engineering with 12+ years’ experience in information technology (networking, infrastructure, database).
o Strong and recent exposure to AWS/GCP/Azure Cloud platforms and designing hybrid multi cloud solutions. Preferred to be Certified AWS Architect professional or similar
· Working knowledge of UNIX shell scripting.
· Strong hands-on programming experience in Python
· Working knowledge of data visualization tools – Tableau.
· Experience working in cloud environment — AWS.
· Experience working with modern tools in the Agile Software Development Life Cycle.
· Version Control Systems (Ex. Git, Github, Stash/Bitbucket), Knowledge Management (Ex. Confluence, Google Docs), Development Workflow (Ex. Jira), Continuous Integration (Ex. Bamboo), Real Time Collaboration (Ex. Hipchat, Slack).

Similar jobs

Location: Bangalore
Experience: 2–5 years
Type: Full-time | On-site
Start: Immediate
Why this role exists
Most systems don’t fail because of one big outage.
They fail because reliability is treated as an afterthought.
Right now, uptime depends too much on individual heroics.
That doesn’t scale.
This role exists to build a reliability system where:
- Uptime is predictable
- Failures are contained
- Escalations don’t depend on leadership
What you’ll do
You will not just monitor systems.
You will own reliability as a product.
1. Drive uptime to production-grade reliability
- Improve system uptime to 99.9% customer-facing SLA within 4 months
- Define and track:
- SLAs / SLOs / error budgets
- Ensure reliability is measured from the customer’s perspective, not internal metrics
2. Build incident response as a system
- Set up a 24/7 incident response rotation across 3 engineers
- Eliminate dependency on leadership (no single escalation point)
- Define:
- Incident severity levels
- Response playbooks
- Escalation protocols
- Ensure fast detection → containment → resolution
3. Contain and fix erratic system behavior
- Identify and resolve:
- Latency spikes
- Downtime incidents
- Integration failures
- Build guardrails to prevent recurrence
- Focus on root cause elimination, not temporary fixes
4. Create continuous reliability feedback loops
- Work closely with engineering teams to:
- Surface recurring failure patterns
- Improve build quality
- Reduce production bugs
- Ensure learnings from incidents directly improve future releases
5. Improve observability and monitoring
- Build dashboards and alerts for:
- System health
- Performance metrics
- Failure signals
- Ensure issues are detected before customers report them
6. Reduce operational fragility
- Remove single points of failure (people, systems, workflows)
- Improve system resilience across:
- Deployments
- Integrations
- Runtime environments
What success looks like
- Uptime reaches 99.9%+ reliably
- Incidents are:
- Detected early
- Contained quickly
- Resolved permanently
- No dependency on a single individual for escalation
- System behavior becomes predictable and stable
- Engineering teams ship with higher reliability confidence
Who you are
- You have 2-5 years of experience in SRE / DevOps / backend systems
- You have worked on production systems with real uptime expectations
- You think in:
- Systems
- Failure modes
- Trade-offs
- You are comfortable debugging live, high-pressure environments
What will make you stand out
- Experience with:
- Distributed systems
- Cloud infrastructure (AWS / Azure / GCP)
- Monitoring & alerting tools
- Have built or improved:
- Incident response systems
- Reliability frameworks
- Strong debugging skills across:
- Infra
- Application
- Integrations
Compensation
₹60,000/month (fixed)
(Aligned with role scope and impact expectations)
Why join
- You will define reliability standards for a production AI platform
- Your work directly impacts:
- Customer trust
- Product performance
- Enterprise readiness
- You will move the system from reactive → predictable
What this role is not
- Not just monitoring dashboards
- Not limited to handling tickets
- Not dependent on escalation to leadership
What this role is
- A builder of reliability systems
- A guardian of uptime and performance
- A multiplier of engineering quality
One question to self-evaluate
Can you build a system where downtime is rare, predictable, and never dependent on a single person?
Key Responsibilities
DevOps Strategy & Leadership
- Define and execute the end-to-end DevOps strategy for high-frequency trading and fintech platforms.
- Lead, mentor, and scale a high-performing DevOps team focused on automation, reliability, and performance.
- Partner closely with engineering and product leaders to ensure infrastructure strategy supports business and technical goals.
CI/CD & Infrastructure Automation
- Architect, implement, and optimize enterprise-grade CI/CD pipelines for ultra-low-latency trading systems.
- Drive Infrastructure as Code (IaC) adoption using Terraform, Helm, Kubernetes, and advanced automation toolsets.
- Establish robust release management, deployment workflows, and versioning best practices for mission‑critical environments.
Cloud & On‑Prem Infrastructure Management
- Design and manage hybrid infrastructures across AWS, GCP, and on-premise data centers ensuring high availability and fault tolerance.
- Implement sophisticated networking strategies for low-latency workloads including routing optimization and performance tuning.
- Lead multi‑cloud scalability, cost optimization, and environment standardization initiatives.
Performance Monitoring & Optimization
- Oversee large-scale monitoring systems using Prometheus, Grafana, ELK, and related observability tools.
- Implement predictive alerting, automated remediation, and system‑wide health checks for zero‑downtime operations.
- Conduct root-cause analyses and performance tuning for systems processing millions of transactions per second.
Security & Compliance
- Champion DevSecOps practices and embed security across the entire development and deployment lifecycle.
- Ensure adherence to financial regulatory standards (SEBI and global frameworks) with strong audit and compliance mechanisms.
- Lead security automation efforts, vulnerability management, and advanced IAM policy implementation.
Required Skills & Qualifications
- 10+ years of DevOps experience, with 5+ years in a leadership capacity.
- Deep hands-on expertise in CI/CD tools such as Jenkins, GitLab CI/CD, and ArgoCD.
- Strong command of AWS, GCP, and hybrid cloud infrastructures.
- Expert-level knowledge of Kubernetes, Docker, and large-scale container orchestration.
- Advanced proficiency in Terraform, Helm, and overall IaC workflows.
- Strong Linux administration, networking fundamentals (TCP/IP, DNS, Firewalls), and system internals.
- Experience with monitoring and observability platforms (Prometheus, Grafana, ELK).
- Excellent scripting skills in Python, Bash, or Go for automation and tooling.
- Deep understanding of security principles, encryption, IAM, and compliance frameworks.
Good to Have
- Experience with ultra-low-latency or high-frequency trading systems.
- Knowledge of FIX protocol, FPGA acceleration, or network‑level optimizations.
- Familiarity with Redis, Nginx, or other high‑throughput systems.
- Exposure to micro‑second‑level performance tuning or network acceleration technologies.
Why Join Us?
- Be part of a team that consistently raises the bar and delivers exceptional engineering outcomes.
- A culture where innovation, ownership, and bold thinking are valued.
- Exceptional growth opportunities—ideal for someone who thrives in fast-paced, high-impact environments.
- Build systems that influence markets and redefine the fintech landscape.
This isn’t just a role—it’s a challenge, a platform, and a proving ground.
Ready to step up? Apply now.
- Candidate should be able to write the sample programs using the Tools (Bash, PowerShell, Python or Shell scripting)
- Analytical/logical reasoning
- GitHub Actions
- Should have good working experience with GitHub Actions
- Repository/Workflow Dispatch, writing reusable workflows, etc
- AZ CLI commands
- Hands-on experience with AZ CLI commands
Objectives :
- Building and setting up new development tools and infrastructure
- Working on ways to automate and improve development and release processes
- Testing code written by others and analyzing results
- Ensuring that systems are safe and secure against cybersecurity threats
- Identifying technical problems and developing software updates and ‘fixes’
- Working with software developers and software engineers to ensure that development follows established processes and works as intended
- Planning out projects and being involved in project management decisions
Daily and Monthly Responsibilities :
- Deploy updates and fixes
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Skills and Qualifications :
- Degree in Computer Science or Software Engineering or BSc in Computer Science, Engineering or relevant field
- 3+ years of experience as a DevOps Engineer or similar software engineering role
- Proficient with git and git workflows
- Good logical skills and knowledge of programming concepts(OOPS,Data Structures)
- Working knowledge of databases and SQL
- Problem-solving attitude
- Collaborative team spirit
About BootLabs
https://www.google.com/url?q=https://www.bootlabs.in/&sa=D&source=calendar&ust=1667803146567128&usg=AOvVaw1r5g0R_vYM07k6qpoNvvh6" target="_blank">https://www.bootlabs.in/
-We are a Boutique Tech Consulting partner, specializing in Cloud Native Solutions.
-We are obsessed with anything “CLOUD”. Our goal is to seamlessly automate the development lifecycle, and modernize infrastructure and its associated applications.
-With a product mindset, we enable start-ups and enterprises on the cloud
transformation, cloud migration, end-to-end automation and managed cloud services.
-We are eager to research, discover, automate, adapt, empower and deliver quality solutions on time.
-We are passionate about customer success. With the right blend of experience and exuberant youth in our in-house team, we have significantly impacted customers.
Technical Skills:
• Expertise in any one hyper scaler (AWS/AZURE/GCP), including basic services like networking,
data and workload management.
- AWS
Networking: VPC, VPC Peering, Transit Gateway, Route Tables, Security Groups, etc.
Data: RDS, DynamoDB, Elastic Search
Workload: EC2, EKS, Lambda, etc.
- Azure
Data: Azure MySQL, Azure MSSQL, etc.
Workload: AKS, Virtual Machines, Azure Functions
- GCP
Data: Cloud Storage, DataFlow, Cloud SQL, Firestore, BigTable, BigQuery
Workload: GKE, Instances, App Engine, Batch, etc.
• Experience in any one of the CI/CD tools (Gitlab/Github/Jenkins) including runner setup,
templating and configuration.
• Kubernetes experience or Ansible Experience (EKS/AKS/GKE), basics like pod, deployment,
networking, service mesh. Used any package manager like helm.
• Scripting experience (Bash/python), automation in pipelines when required, system service.
• Infrastructure automation (Terraform/pulumi/cloud formation), write modules, setup pipeline and version the code.
Optional:
• Experience in any programming language is not required but is appreciated.
• Good experience in GIT, SVN or any other code management tool is required.
• DevSecops tools like (Qualys/SonarQube/BlackDuck) for security scanning of artifacts, infrastructure and code.
• Observability tools (Opensource: Prometheus, Elasticsearch, Open Telemetry; Paid: Datadog,
24/7, etc)
Client: Sony Corporation
Position: 3
Exp: 5-8Years
DevOps Engineer
Location: Bangalore
Budget: 16.5 LPA Max
Gerrit ,Jenkins, Rabbit MQ AWS Linux
Python ,Ansible, Tomcat ,Postgresql
Grafana ,Groovy,HTML,Shell,Apache,Git , ELK
Requirements
You will make an ideal candidate if you have:
-
Experience of building a range of Services in a Cloud Service provider
-
Expert understanding of DevOps principles and Infrastructure as a Code concepts and techniques
-
Strong understanding of CI/CD tools (Jenkins, Ansible, GitHub)
-
Managed an infrastructure that involved 50+ hosts/network
-
3+ years of Kubernetes experience & 5+ years of experience in Native services such as Compute (virtual machines), Containers (AKS), Databases, DevOps, Identity, Storage & Security
-
Experience in engineering solutions on cloud foundation platform using Infrastructure As Code methods (eg. Terraform)
-
Security and Compliance, e.g. IAM and cloud compliance/auditing/monitoring tools
-
Customer/stakeholder focus. Ability to build strong relationships with Application teams, cross functional IT and global/local IT teams
-
Good leadership and teamwork skills - Works collaboratively in an agile environment
-
Operational effectiveness - delivers solutions that align to approved design patterns and security standards
-
Excellent skills in at least one of following: Python, Ruby, Java, JavaScript, Go, Node.JS
-
Experienced in full automation and configuration management
-
A track record of constantly looking for ways to do things better and an excellent understanding of the mechanism necessary to successfully implement change
-
Set and achieved challenging short, medium and long term goals which exceeded the standards in their field
-
Excellent written and spoken communication skills; an ability to communicate with impact, ensuring complex information is articulated in a meaningful way to wide and varied audiences
-
Built effective networks across business areas, developing relationships based on mutual trust and encouraging others to do the same
-
A successful track record of delivering complex projects and/or programmes, utilizing appropriate techniques and tools to ensure and measure success
-
A comprehensive understanding of risk management and proven experience of ensuring own/others' compliance with relevant regulatory processes
Essential Skills :
-
Demonstrable Cloud service provider experience - infrastructure build and configurations of a variety of services including compute, devops, databases, storage & security
-
Demonstrable experience of Linux administration and scripting preferably Red Hat
-
Experience of working with Continuous Integration (CI), Continuous Delivery (CD) and continuous testing tools
-
Experience working within an Agile environment
-
Programming experience in one or more of the following languages: Python, Ruby, Java, JavaScript, Go, Node.JS
-
Server administration (either Linux or Windows)
-
Automation scripting (using scripting languages such as Terraform, Ansible etc.)
-
Ability to quickly acquire new skills and tools
Required Skills :
-
Linux & Windows Server Certification

Total Experience: 6 – 12 Years
Required Skills and Experience
- 3+ years of relevant experience with DevOps tools Jenkins, Ansible, Chef etc
- 3+ years of experience in continuous integration/deployment and software tools development experience with Python and shell scripts etc
- Building and running Docker images and deployment on Amazon ECS
- Working with AWS services (EC2, S3, ELB, VPC, RDS, Cloudwatch, ECS, ECR, EKS)
- Knowledge and experience working with container technologies such as Docker and Amazon ECS, EKS, Kubernetes
- Experience with source code and configuration management tools such as Git, Bitbucket, and Maven
- Ability to work with and support Linux environments (Ubuntu, Amazon Linux, CentOS)
- Knowledge and experience in cloud orchestration tools such as AWS Cloudformation/Terraform etc
- Experience with implementing "infrastructure as code", “pipeline as code” and "security as code" to enable continuous integration and delivery
- Understanding of IAM, RBAC, NACLs, and KMS
- Good communication skills
Good to have:
- Strong understanding of security concepts, methodologies and apply them such as SSH, public key encryption, access credentials, certificates etc.
- Knowledge of database administration such as MongoDB.
- Knowledge of maintaining and using tools such as Jira, Bitbucket, Confluence.
Responsibilities
- Work with Leads and Architects in designing and implementation of technical infrastructure, platform, and tools to support modern best practices and facilitate the efficiency of our development teams through automation, CI/CD pipelines, and ease of access and performance.
- Establish and promote DevOps thinking, guidelines, best practices, and standards.
- Contribute to architectural discussions, Agile software development process improvement, and DevOps best practices.

Must Have Skills
- AWS Solutions Architect and/or DevOps certification, Professional preferred
- BS level technical degree or equivalent experience; Computer Science or Engineering background preferred
- Hands-on technical expertise on Amazon Web Services (AWS) to include but not limited to EC2, VPC, IAM, Security groups, ELB/NLB/ALB, Internet gateway, S3, EBS and EFS.
- Experience in migration-deployment of applications to Cloud, re-engineering of application for cloud, setting up OS and app environments in virtualized cloud
- DevOps automation, CI/CD, infrastructure/services provisioning, application deployment and configuration
- DevOps toolsets to include Ansible, Jenkins and XLdeploy, XLrelease
- Deploy and configuration of Java/Wildfly, Springboot, JavaScript/Node.js and Ruby applications and middleware
- Scripting in Shell (bash) and Extensive Python
- Agile software development
- Excellent written, verbal communication skills, presentation, and collaboration skills - Team leadership skills
- Linux (RHEL) administration/engineering
Experience:
- Deploying, configuring, and supporting large scale monolithic and microservices based SaaS applications
- Working as both an infrastructure and application migration specialist
- Identifying and documenting application requirements for network, F5, IAM, and security groups
- Implementing DevOps practices such as infrastructure as code, continuous integration, and automated deployment
- Work with Technology leadership to understand business goals and requirements
- Experience with continuous integration tools
- Experience with configuration management platforms
- Writing and diagnosing issues with complex shell scripts
- Strong practical application development experience on Linux and Windows-based systems








