Please Apply - https://zrec.in/7EYKe?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adopt a DevOps culture by focusing on a long-term DevOps roadmap. We identify the technical and cultural issues that stand in the way of implementing DevOps practices and work with the respective teams to fix them and increase overall productivity. We also run training sessions for developers on the importance of DevOps. We provide these services: DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We assess the technology architecture, security, governance, compliance, and DevOps maturity of technology companies and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their websites and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer / SRE
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediate
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems. The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability. This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP, Azure), along with scripting skills in Python or Go. The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (Datadog, New Relic, OpenTelemetry).
Strong communication, incident management skills, and a collaborative approach are essential. Experience in team leadership and multi-client engagement is a plus.
Ideal Candidate Profile
- 4-6 years of solid experience as an SRE or DevOps engineer, with a proven track record of handling large-scale production environments
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong hands-on experience managing large-scale production systems
- Strong production troubleshooting skills and the ability to handle high-pressure situations
- Strong experience with data stores and streaming systems (PostgreSQL, MongoDB, Elasticsearch, Kafka)
- Worked on making production systems more Scalable, Highly Available and Fault-tolerant
- Hands-on experience with ELK or other logging and observability tools
- Hands-on experience with Prometheus, Grafana, and Alertmanager, and with on-call tooling such as PagerDuty
- Problem-Solving Mindset
- Strong skills in K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure, etc.
- Good at scripting and automation in Python or Go
- Strong with fundamentals like DNS, Networking, Linux
- Experience with APM tools such as New Relic, Datadog, and OpenTelemetry
- Good experience with incident response, incident management, and writing detailed RCAs
- Experience with application best practices for making apps more reliable and fault-tolerant
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
Good to have
- Team-leading Experience
- Multiple Client Handling
- Requirements gathering from clients
- Good Communication
Key Responsibilities
- Design and Development:
  - Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
  - Collaborate with product and engineering teams to translate business requirements into technical specifications.
  - Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
  - Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
  - Implement and manage CI/CD pipelines for automated deployment and testing.
  - Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
  - Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
  - Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
  - Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
  - Identify, diagnose, and resolve complex software and infrastructure issues.
  - Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
  - Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
  - Contribute to the continuous improvement of development processes, tools, and methodologies.
  - Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
  - Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
  - Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
  - Serve as a direct point of contact for multiple clients.
  - Handle the unique technical needs and challenges of two or more clients concurrently.
  - Combine direct interaction with clients and internal team coordination.
- Production Systems Management:
  - Bring extensive experience in managing, monitoring, and debugging production environments.
  - Troubleshoot complex issues and ensure that production systems run smoothly with minimal downtime.
Similar jobs
We are looking for a seasoned DevOps Engineer with a strong background in solution architecture, ideally from the Banking or BFSI (Banking, Financial Services, and Insurance) domain. This role is crucial for implementing scalable, secure infrastructure and CI/CD practices tailored to the needs of high-compliance, high-availability environments. The ideal candidate will have deep expertise in Docker, Kubernetes, cloud platforms, and solution architecture, with knowledge of ML/AI and database management as a plus.
Key Responsibilities:
● Infrastructure & Solution Architecture: Design secure, compliant, and high-performance cloud infrastructures (AWS, Azure, or GCP) optimized for BFSI-specific applications.
● Containerization & Orchestration: Lead Docker and Kubernetes initiatives, deploying applications with a focus on security, compliance, and resilience.
● CI/CD Pipelines: Build and maintain CI/CD pipelines suited to BFSI workflows, incorporating automated testing, security checks, and rollback mechanisms.
● Cloud Infrastructure & Database Management: Manage cloud resources and automate provisioning using Terraform, ensuring security standards. Optimize relational and NoSQL databases for BFSI application needs.
● Monitoring & Incident Response: Implement monitoring and alerting (e.g., Prometheus, Grafana) for rapid incident response, ensuring uptime and reliability.
● Collaboration: Work closely with compliance, security, and development teams, aligning infrastructure with BFSI standards and regulations.
Qualifications:
● Education: Bachelor’s or Master’s degree in Computer Science, Engineering, Information Technology, or a related field.
● Experience: 5+ years of experience in DevOps with cloud infrastructure and solution architecture expertise, ideally in ML/AI environments.
● Technical Skills:
○ Cloud Platforms: Proficient in AWS, Azure, or GCP; certifications (e.g., AWS Solutions Architect, Azure Solutions Architect) are a plus.
○ Containerization & Orchestration: Expertise with Docker and Kubernetes, including experience deploying and managing clusters at scale.
○ CI/CD Pipelines: Hands-on experience with CI/CD tools like Jenkins, GitLab CI, or GitHub Actions, with automation and integration for ML/AI workflows preferred.
○ Infrastructure as Code: Strong knowledge of Terraform and/or CloudFormation for infrastructure provisioning.
○ Database Management: Proficiency in relational databases (PostgreSQL, MySQL) and NoSQL databases (MongoDB, DynamoDB), with a focus on optimization and scalability.
○ ML/AI Infrastructure: Experience supporting ML/AI pipelines, model serving, and data processing within cloud or hybrid environments.
○ Monitoring and Logging: Proficient in monitoring tools like Prometheus and Grafana, and log management solutions like ELK Stack or Splunk.
○ Scripting and Automation: Strong skills in Python, Bash, or PowerShell for scripting and automating processes.
Company - Apptware Solutions
Location - Baner, Pune
Team Size - 130+
Job Description -
Cloud Engineer with 8+ years of experience
Roles and Responsibilities
● Have 8+ years of strong experience in deployment, management and maintenance of large systems on-premise or cloud
● Experience maintaining and deploying highly-available, fault-tolerant systems at scale
● A drive towards automating repetitive tasks (e.g. scripting via Bash, Python, Ruby, etc)
● Practical experience with Docker containerization and clustering (Kubernetes/ECS)
● Expertise with AWS (e.g. IAM, EC2, VPC, ELB, ALB, Autoscaling, Lambda, VPN)
● Version control system experience (e.g. Git)
● Experience implementing CI/CD (e.g. Jenkins, TravisCI, CodePipeline)
● Operational experience (e.g. HA/backups) with NoSQL databases (e.g. MongoDB, Redis) and SQL databases (e.g. MySQL)
● Experience with configuration management tools (e.g. Ansible, Chef)
● Experience with infrastructure-as-code (e.g. Terraform, CloudFormation)
● Bachelor's or master’s degree in CS, or equivalent practical experience
● Effective communication skills
● Hands-on experience with cloud providers like MS Azure and GC
● A sense of ownership and ability to operate independently
● Experience with Jira and one or more Agile SDLC methodologies
● Nice to Have:
○ Sensu and Graphite
○ Ruby or Java
○ Python or Groovy
○ Java Performance Analysis
Role: Cloud Engineer
Industry Type: IT-Software, Software Services
Functional Area: IT Software - Application Programming, Maintenance
Employment Type: Full Time, Permanent
Role Category: Programming & Design
• Bachelor’s or master’s degree in Computer Engineering, Computer Science, Computer Applications, Mathematics, Statistics or related technical field or equivalent practical experience. Relevant experience of at least 3 years in lieu of above if from a different stream of education.
• Well-versed in DevOps principles & practices and hands-on DevOps tool-chain integration experience: Release Orchestration & Automation, Source Code & Build Management, Code Quality & Security Management, Behavior Driven Development, Test Driven Development, Continuous Integration, Continuous Delivery, Continuous Deployment, and Operational Monitoring & Management; extra points if you can demonstrate your knowledge with working examples.
• Demonstrable hands-on working experience with DevOps tools and platforms, viz. Slack, Jira, Git, Jenkins, Code Quality & Security Plugins, Maven, Artifactory, Terraform, Ansible/Chef/Puppet, Spinnaker, Tekton, StackStorm, Prometheus, Grafana, ELK, PagerDuty, VictorOps, etc.
• Well-versed in Virtualization & Containerization; must demonstrate experience in technologies such as Kubernetes, Istio, Docker, OpenShift, Anthos, Oracle VirtualBox, Vagrant, etc.
• Well-versed in AWS, Azure, and/or Google Cloud; must demonstrate experience in at least FIVE (5) services offered under AWS, Azure, and/or Google Cloud in any of these categories: Compute or Storage, Database, Networking & Content Delivery, Management & Governance, Analytics, Security, Identity, & Compliance (or) equivalent demonstrable Cloud Platform experience.
• Well-versed, with demonstrable working experience, in API Management, API Gateway, Service Mesh, Identity & Access Management, and Data Protection & Encryption tools & platforms.
• Hands-on programming experience in core Java and/or Python and/or JavaScript and/or Scala; freshers passing out of college or lateral movers into IT must be able to code in languages they have studied.
• Well-versed with Storage, Networks and Storage Networking basics which will enable you to work in a Cloud environment.
• Well-versed with Network, Data, and Application Security basics which will enable you to work in a Cloud as well as Business Applications / API services environment.
• Extra points if you are certified in AWS and/or Azure and/or Google Cloud.
- Building and setting up new development tools and infrastructure
- Understanding the needs of stakeholders and conveying this to developers
- Working on ways to automate and improve development and release processes
- Ensuring that systems are safe and secure against cybersecurity threats
- Identifying technical problems and developing software updates and 'fixes'
- Working with software developers and software engineers to ensure that development follows established processes and works as intended
Daily and Monthly Responsibilities :
- Deploy updates and fixes
- Provide Level 2 technical support
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Skills and Qualifications :
- Bachelor's degree in Computer Science, Engineering or relevant field
- Experience as a DevOps Engineer or similar software engineering role
- Proficient with Git and Git workflows
- Good knowledge of Python
- Working knowledge of SQL databases such as MySQL and PostgreSQL
- Problem solving attitude
- Collaborative team spirit
- Detailed knowledge of Linux systems (Ubuntu)
- Proficient in AWS console and should have handled the infrastructure of any product (Including dev and prod environments)
Mandatory hands on experience in the following :
- Python based application deployment and maintenance
- NGINX web server
- AWS modules EC2, VPC, EBS, S3
- IAM setup
- Database configurations MySQL, PostgreSQL
- Linux-flavoured OS
- Instance/Disaster management
Job Description
- Implement IAM policies and configure VPCs to create a scalable and secure network for the application workloads
- Act as the client point of contact for high-priority technical issues and new requirements
- Act as tech lead, guiding and mentoring the junior members of the team
- Work with client application developers to build, deploy and run both monolithic and microservices based applications on AWS Cloud
- Analyze workload requirements and work with IT stakeholders to define proper sizing for cloud workloads on AWS
- Build, Deploy and Manage production workloads including applications on EC2 instance, APIs on Lambda Functions and more
- Work with IT stakeholders to monitor system performance and proactively improve the environment for scale and security
Qualifications
- At least 5 years of IT experience implementing enterprise applications preferred
- Should be AWS Solution Architect Associate Certified
- Must have at least 3 years of experience working as a Cloud Engineer focused on AWS services such as EC2, CloudFront, VPC, CloudWatch, RDS, DynamoDB, Systems Manager, Route 53, WAF, API Gateway, Elastic Beanstalk, ECS, ECR, Lambda, SQS, SNS, S3, Elasticsearch, DocumentDB, IAM, etc.
- Must have a strong understanding of EC2 instances, types and deploying applications to the cloud
- Must have a strong understanding of IAM policies, VPC creation, and other security/networking principles
- Must have thorough experience in on-prem to AWS cloud workload migration
- Should be comfortable using AWS and other migration tools
- Should have experience working on AWS performance, cost, and security optimisation
- Should be experienced in implementing automated patching and hardening of systems
- Should be involved in P1 tickets and also guide team wherever needed
- Creating Backups and Managing Disaster Recovery
- Experience with infrastructure-as-code automation using scripts and tools like CloudFormation and Terraform
- Any exposure to creating CI/CD pipelines on AWS using CodeBuild, CodeDeploy, etc. is an advantage
- Experience with Docker, Bitbucket, ELK and deploying applications on AWS
- Good understanding of Containerisation technologies like Docker, Kubernetes etc.
- Should be experienced in using and configuring cloud monitoring tools and ITSM ticketing tools
- Good exposure to logging and monitoring tools like Dynatrace, Prometheus, Grafana, ELK/EFK
- Experience using AWS (that’s just common sense)
- Experience designing and building web environments on AWS, which includes working with services like EC2, ELB, RDS, and S3
- Experience building and maintaining cloud-native applications
- A solid background in Linux/Unix and Windows server system administration
- Experience using DevOps tools in a cloud environment, such as Ansible, Artifactory, Docker, GitHub, Jenkins, Kubernetes, Maven, and SonarQube
- Experience installing and configuring different application servers such as JBoss, Tomcat, and WebLogic
- Experience using monitoring solutions like CloudWatch, ELK Stack, and Prometheus
- An understanding of writing Infrastructure-as-Code (IaC), using tools like CloudFormation or Terraform
- Knowledge of one or more of the most-used programming languages available for today’s cloud computing (e.g., SQL data, XML data, R math, Clojure math, Haskell functional, Erlang functional, Python procedural, and Go procedural languages)
- Experience in troubleshooting distributed systems
- Proficiency in script development and scripting languages
- The ability to be a team player
- The ability and skill to train other people in procedural and technical topics
- Strong communication and collaboration skills
As a special aside, an AWS engineer who works in DevOps should also have experience with:
- The theory, concepts, and real-world application of Continuous Delivery (CD), which requires familiarity with tools like AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline
- An understanding of automation
Implementing various development, testing, automation tools, and IT infrastructure
Selecting and deploying appropriate CI/CD tools
Required Candidate profile
Linux
Working knowledge of any web server, e.g. NGINX or Apache
BlueOptima’s vision is to become the global reference for the optimisation of the performance of Software Engineers across all industries. We provide industry-leading objective metrics in software development. We enable large organisations to deliver better software, faster and at lower cost, with technology that pushes the limits of what has been done before.
We are a global company which has consistently doubled in headcount and revenue YoY, with no external investment. We are currently located in 4 countries: the UK (London, our HQ), Mexico, India and the US. We have 250+ employees (and increasing every day) of 34 different nationalities, speaking over 25 languages.
We promote an open-minded environment and encourage our employees to create their own success story in this high-performance environment.
Location: Bangalore
Department: DevOps
Job Summary:
We are looking for skilled and talented engineers to join our Platform team and directly contribute to Continuous Delivery, and improve the state of art in CI/CD and Observability within BlueOptima.
As a Senior DevOps Engineer, you will define and outline CI/CD-related aspects and collaborate with application teams to train them on, and enforce, CI/CD best practices. You will also directly implement, maintain, and consult on the observability and monitoring framework that supports the needs of multiple internal stakeholders.
Your team: The Platform team in BlueOptima works across Product lines and is responsible for providing a scalable technology platform which the Product team uses to build its applications, improve their performance, or even improve the SDLC by improving the application delivery pipeline.
The Platform team is also responsible for driving technology adoption across the product development team. The team works on components that are common across product lines, like IAM (Identity & Access Management), Auto Scaling, APM (Application Performance Monitoring), and CI/CD.
Responsibilities and tasks:
- Define and outline CI/CD and related aspects
- Own and improve the build process to reduce manual intervention
- Own and improve deployment to make it 100% automated
- Define guidelines and standards for the automated testing required for a good CI/CD pipeline, and ensure alignment on an ongoing basis (includes artifact generation, promotions, etc.)
- Automate deployment and rollback in the production environment.
- Collaborate with engineering teams, application developers, management and infrastructure teams to assess near- and long-term monitoring needs and provide them with Tooling to improve observability of application in production.
- Keep an eye on the emerging observability tools, trends and methodologies, and continuously enhance our existing systems and processes.
- Ability to choose the right set of tools for a given problem and apply that to all the applications which are available
- Collaborate with application teams on the following:
  - Define and enforce logging standards
  - Define the metrics applications should track, and support application teams in visualising them on Grafana (or similar tools)
  - Define alerts for application health monitoring in production
  - Tooling such as APM, E2E, etc.
  - Continuously improve the state of the art on the above
- Assist in scheduling and hosting regular tool training sessions to better enable tool adoption and best practices, also making sure training materials are maintained.
Qualifications
What You Need to Succeed at BlueOptima:
- Minimum bachelor's degree in Computer Science or equivalent
- Demonstrable years of experience with implementation, operations, maintenance of IT systems and/or administration of software functions in multi-platform and multi-system environments.
- At least 1 year of experience leading or mentoring a small team.
- Demonstrable experience having developed containerized application components, using docker or similar solutions in previous roles
- Have extensive experience with metrics and logging libraries and aggregators, data analysis and visualization tools.
- Experience in defining, creating, and supporting monitoring dashboards
- 2+ years of experience with CI tools and building pipelines using Jenkins.
- 2+ years of experience with monitoring and observability tools and methodologies, with products such as Grafana, Prometheus, Elasticsearch, Splunk, AppDynamics, Dynatrace, Nagios, Graphite, Datadog, etc.
- Ability to write and read simple scripts using Python / Shell Scripts.
- Familiarity with configuration management tools such as Ansible.
- Ability to work autonomously with minimum supervision
- Demonstrate strong oral and written communication skills
Additional information
Why join our team?
Culture and Growth:
- Global team with a creative, innovative and welcoming mindset.
- Rapid career growth and opportunity to be an outstanding and visible contributor to the company's success.
- Freedom to create your own success story in a high-performance environment.
- Training programs and Personal Development Plans for each employee
Benefits:
- 32 days of holidays - this includes public and religious holidays
- Contributions to your Provident Fund which can be matched by the company above the statutory minimum as agreed
- Private Medical Insurance provided by the company
- Gratuity payments
- Claim Mobile/Internet expenses and Professional Development costs
- Leave Travel Allowance
- Flexible Work from Home policy - 2 days from home per week
- International travel opportunities
- Global annual meet-up (most recent meetups have been held in Cancun and India Thailand, Oct 2022).
- High quality equipment (ergonomic chairs and 32" screens)
- Pet friendly offices
- Creche Policy for working parents.
- Paternity and Maternity leave.
Stay connected with us on LinkedIn (https://www.linkedin.com/company/blueoptima) or keep an eye on our career page (https://www.blueoptima.com/careers) for future opportunities!