
About Skarpsinne Infotech

Location: Bangalore
Experience: 2–5 years
Type: Full-time | On-site
Start: Immediate
Why this role exists
Most systems don’t fail because of one big outage.
They fail because reliability is treated as an afterthought.
Right now, uptime depends too much on individual heroics.
That doesn’t scale.
This role exists to build a reliability system where:
- Uptime is predictable
- Failures are contained
- Escalations don’t depend on leadership
What you’ll do
You will not just monitor systems.
You will own reliability as a product.
1. Drive uptime to production-grade reliability
- Improve system uptime to 99.9% customer-facing SLA within 4 months
- Define and track:
  - SLAs / SLOs / error budgets
- Ensure reliability is measured from the customer’s perspective, not internal metrics
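To make the 99.9% target concrete, here is a minimal sketch of the error-budget arithmetic behind it. The 30-day measurement window and the function names are assumptions for illustration; the posting does not specify how the SLA window is defined.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability target over a window.

    `slo` is a fraction (0.999 for 99.9%). The 30-day default window is an
    assumption, not something stated in the role description.
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Minutes of error budget left after the downtime observed so far."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


if __name__ == "__main__":
    # A 99.9% target over 30 days leaves roughly 43 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After a hypothetical 10-minute incident, about 33 minutes remain.
    print(round(budget_remaining(0.999, 10), 1))
```

Under these assumptions, a single long incident can consume most of a month's budget, which is why the detection and containment goals below matter as much as the uptime number itself.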
2. Build incident response as a system
- Set up a 24/7 incident response rotation across 3 engineers
- Eliminate dependency on leadership (no single escalation point)
- Define:
  - Incident severity levels
  - Response playbooks
  - Escalation protocols
- Ensure fast detection → containment → resolution
3. Contain and fix erratic system behavior
- Identify and resolve:
  - Latency spikes
  - Downtime incidents
  - Integration failures
- Build guardrails to prevent recurrence
- Focus on root cause elimination, not temporary fixes
4. Create continuous reliability feedback loops
- Work closely with engineering teams to:
  - Surface recurring failure patterns
  - Improve build quality
  - Reduce production bugs
- Ensure learnings from incidents directly improve future releases
5. Improve observability and monitoring
- Build dashboards and alerts for:
  - System health
  - Performance metrics
  - Failure signals
- Ensure issues are detected before customers report them
6. Reduce operational fragility
- Remove single points of failure (people, systems, workflows)
- Improve system resilience across:
  - Deployments
  - Integrations
  - Runtime environments
What success looks like
- Uptime reaches 99.9%+ reliably
- Incidents are:
  - Detected early
  - Contained quickly
  - Resolved permanently
- No dependency on a single individual for escalation
- System behavior becomes predictable and stable
- Engineering teams ship with higher reliability confidence
Who you are
- You have 2-5 years of experience in SRE / DevOps / backend systems
- You have worked on production systems with real uptime expectations
- You think in:
  - Systems
  - Failure modes
  - Trade-offs
- You are comfortable debugging live, high-pressure environments
What will make you stand out
- Experience with:
  - Distributed systems
  - Cloud infrastructure (AWS / Azure / GCP)
  - Monitoring & alerting tools
- Have built or improved:
  - Incident response systems
  - Reliability frameworks
- Strong debugging skills across:
  - Infra
  - Application
  - Integrations
Compensation
₹60,000/month (fixed)
(Aligned with role scope and impact expectations)
Why join
- You will define reliability standards for a production AI platform
- Your work directly impacts:
  - Customer trust
  - Product performance
  - Enterprise readiness
- You will move the system from reactive → predictable
What this role is not
- Not just monitoring dashboards
- Not limited to handling tickets
- Not dependent on escalation to leadership
What this role is
- A builder of reliability systems
- A guardian of uptime and performance
- A multiplier of engineering quality
One question to self-evaluate
Can you build a system where downtime is rare, predictable, and never dependent on a single person?
The DevOps Engineer will play a critical role in operationalizing artificial intelligence across Bell Techlogix client environments. This role focuses on building and supporting cloud infrastructure, CI/CD pipelines, and automation frameworks that power AI and machine learning workloads. The ideal candidate has experience supporting AI platforms such as Azure AI, Azure Machine Learning, Azure OpenAI, and ServiceNow or conversational AI platforms, and understands the operational requirements of production AI systems, including reliability, scalability, and security.
Key Responsibilities
• Design, build, and operate cloud infrastructure and platform services that support AI and machine learning workloads in production, SLA-driven managed services environments
• Implement CI/CD and MLOps pipelines to enable automated training, testing, deployment, and rollback of AI and ML models
• Develop and maintain Infrastructure as Code to provision AI-ready environments consistently across dev/test/prod
• Support AI platform operations including monitoring model health, pipeline execution, compute utilization, and data dependencies
• Partner with Machine Learning Engineers and Data Engineers to standardize deployment patterns for AI services and LLM-based solutions
• Enable secure and scalable AI integrations using APIs, messaging, and event-driven architectures
• Implement observability solutions for AI platforms, including logging, metrics, alerting, and drift detection integrations
• Troubleshoot AI platform incidents, perform root cause analysis, and implement remediation to improve reliability and automation coverage
• Apply security best practices for AI environments including secrets management, identity and access controls, network isolation, and policy enforcement
• Support AI-driven automation use cases across platforms such as Microsoft Copilot, ServiceNow, and conversational AI tools
• Collaborate with service desk, security, and architecture teams to continuously improve AI service delivery and operational maturity
Required Qualifications
• Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
• 5+ years of experience in DevOps, cloud engineering, or platform operations, with exposure to AI or data workloads
• Hands-on experience with Microsoft Azure, including compute, networking, storage, and monitoring services
• Experience building CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools
• Working knowledge of Infrastructure as Code (Terraform and/or Bicep/ARM)
• Scripting experience using PowerShell and/or Python
• Experience supporting production platforms with incident management, change control, and root cause analysis
• Understanding of cloud security fundamentals and enterprise governance requirements
Preferred Qualifications
• Experience with Azure Machine Learning, Azure AI Services, Azure OpenAI, or MLOps frameworks
• Exposure to containerization and orchestration technologies (Docker, Kubernetes, AKS)
• Experience supporting data pipelines or feature stores used by machine learning systems
• Familiarity with ServiceNow, AI-driven ITSM workflows, or automation platforms
• Experience with observability tools
• Knowledge of Responsible AI, data governance, and compliance considerations for AI systems
• Relevant certifications (Microsoft Azure Administrator, Azure DevOps Engineer, Azure AI Engineer)
What does a successful Senior DevOps Engineer do at Fiserv?
This role focuses on contributing to and enhancing our DevOps environment within the Issuer Solution group, where our cross-functional Scrum teams deliver solutions built on cutting-edge mobile technology and products. You will be expected to provide support across the wider business unit, leading DevOps practices and initiatives.
What will you do:
• Build, manage, and deploy CI/CD pipelines.
• Work with Helm charts, Rundeck, and OpenShift.
• Strive for continuous improvement and build continuous integration, continuous delivery, and continuous deployment pipelines.
• Implement various development, testing, and automation tools, and IT infrastructure.
• Optimize and automate release/development cycles and processes.
• Be part of and help promote our DevOps culture.
• Identify and implement continuous improvements to the development practice.
What you must have:
• 3+ years of experience in DevOps with hands-on experience in the following:
- Writing automation scripts for deployments and housekeeping using shell scripts (Bash) and Ansible playbooks
- Building Docker images and running/managing Docker instances
- Building Jenkins pipelines using Groovy scripts
- Working knowledge of Kubernetes, including application deployments, managing application configurations, and persistent volumes
• Good understanding of infrastructure as code
• Ability to write and update documentation
• A logical, process-oriented approach to problems and troubleshooting
• Ability to collaborate with multiple development teams
What you are preferred to have:
• 8+ years of development experience
• Jenkins administration experience
• Hands-on experience in building and deploying Helm charts
Process Skills:
• Should have worked on Agile projects
Overview
adesso India specialises in the optimization of core business processes for organizations. Our focus is on providing state-of-the-art solutions that streamline operations and raise productivity.
Comprised of a team of industry experts and experienced technology professionals, we ensure that our software development and implementations are reliable, robust, and seamlessly integrated with the latest technologies. By leveraging our extensive knowledge and skills, we empower businesses to achieve their objectives efficiently and effectively.
Job Description
We are looking for an experienced Technical Team Lead to guide a local IT Services Management Team while also acting as a software developer. In this role, you will be responsible for the application management of a B2C application, meeting the agreed Service Level Agreements (SLAs) and fulfilling customer expectations.
Your team will act as an on-call duty team between 6 pm and 8 am, 365 days a year. You will work together with the responsible Senior Project Manager in Germany.
We are seeking a hands-on leader who thrives in both team management and operational development. Whether your background is in DevOps and backend or in frontend, your combination of leadership and technical skills will be key to success in this position.
Responsibilities:
Problem Management & Incident Management activities: Identifying and resolving technical issues and errors that arise during application usage.
Release and Update Coordination: Planning and executing software updates, new versions, or system upgrades to keep applications up to date.
Change Management: Responsible for implementing and coordinating changes to the application, considering the impact on ongoing operations.
Requirements:
Education and Experience: A Bachelor’s or Master’s degree in a relevant field, with a minimum of 5 years of professional experience or equivalent work experience.
Skills & Expertise:
Proficient in ITIL service management frameworks.
Strong analytical and problem-solving abilities.
Experienced in project management methodologies (Agile, Kanban).
Leadership: Very good leadership skills with a customer-oriented, proactive, and results-driven approach.
Communication: Excellent communication, presentation, and interpersonal skills, with the ability to engage and collaborate with stakeholders.
Language: English at C2 level.
Skills & Requirements
- kubeAPI: high
- Kustomize: high
- Docker/container: high
- Debug tools: OpenSSL (high), curl (high)
- Azure DevOps, Pipeline, Repository, Deployment
- ArgoCD
- Certificates: certificate management / SSL, Let's Encrypt
- Linux shell
- Keycloak
Role & Responsibilities
- The DevOps Engineer will work on the implementation and management of DevOps tools and technologies.
- Create and support advanced pipelines using GitLab.
- Create and support advanced container and serverless environments.
- Deploy cloud infrastructure using Terraform and CloudFormation templates.
- Implement deployments to OpenShift Container Platform, Amazon ECS, and EKS.
- Troubleshoot containerized builds and deployments.
- Implement processes and automations for migrating between OpenShift, AKS, and EKS.
- Implement CI/CD automations.
Required Skillsets
- 3-5 years of cloud-based architecture software engineering experience.
- Deep understanding of Kubernetes and its architecture.
- Mastery of cloud security engineering tools, techniques, and procedures.
- Experience with AWS services such as Amazon S3, EKS, ECS, DynamoDB, AWS Lambda, API Gateway, etc.
- Experience designing and supporting infrastructure via Infrastructure-as-Code in AWS, using CDK, CloudFormation templates, Terraform, or another toolset.
- Experience with tools like Jenkins, GitHub, Puppet, or a similar toolset.
- Experience with monitoring tools like CloudWatch, New Relic, Grafana, Splunk, etc.
- Excellence in verbal and written communication, and in working collaboratively with a variety of colleagues and clients in a remote development environment.
- Proven track record in cloud computing systems and enterprise architecture and security.
Main tasks
- Supervision of the CI/CD process for automated builds and deployments of web services, web applications, and desktop tools in cloud and container environments
- Responsibility for the operations side of a DevOps organization, especially for development at LS telcom in the area of container technology and orchestration, e.g. with Kubernetes
- Installation, operation, and monitoring of web applications in cloud data centers, both for development and testing and for operating LS telcom's own production cloud as an LS service
- Installation of the LS system solution, especially in containerized environments
- Introduction, maintenance, and improvement of installation solutions for LS development in desktop and server environments, in the cloud, and on on-premise Kubernetes
- Maintenance of the system installation documentation and delivery of training
- Execution of internal software tests and support of involved teams and stakeholders
- Hands on Experience with Azure DevOps.
Qualification profile
- Bachelor’s or master’s degree in communications engineering, electrical engineering, physics or comparable qualification
- Experience in software
- Installation and administration of Linux and Windows systems including network and firewalling aspects
- Experience with build and deployment automation using tools like Jenkins, Gradle, Argo, or similar, as well as system scripting (Bash, PowerShell, etc.)
- Interest in operation and monitoring of applications in virtualized and containerized environments in cloud and on-premise
- Server environments, especially application, web, and database servers
- Knowledge of VMware/k3d/Rancher is an advantage
- Good spoken and written knowledge of English

Roles and Responsibilities:
• Gather and analyse cloud infrastructure requirements
• Automate system tasks and infrastructure using a scripting language (Shell/Python/Ruby preferred), with configuration management tools (Ansible/Puppet/Chef), service registry and discovery tools (Consul and Vault, etc.), infrastructure orchestration tools (Terraform, CloudFormation), and automated imaging tools (Packer)
• Support existing infrastructure, analyse problem areas and come up with solutions
• An eye for monitoring: the candidate should be able to look at complex infrastructure and figure out what to monitor and how.
• Work along with the Engineering team to help out with Infrastructure / Network automation needs.
• Deploy infrastructure as code and automate as much as possible
• Manage a team of DevOps engineers
Desired Profile:
• Understanding of provisioning of Bare Metal and Virtual Machines
• Working knowledge of Configuration management tools like Ansible/ Chef/ Puppet, Redfish.
• Experience in scripting languages like Ruby/ Python/ Shell Scripting
• Working knowledge of IP networking, VPNs, DNS, load balancing, firewalling & IPS concepts
• Strong Linux/Unix administration skills.
• Self-starter who can implement with minimal guidance
• Hands-on experience setting up CI/CD from scratch in Jenkins
• Experience with managing K8s infrastructure
• Design cloud infrastructure that is secure, scalable, and highly available on AWS
• Define infrastructure and deployment requirements
• Provision, configure and maintain AWS cloud infrastructure defined as code
• Ensure configuration and compliance with configuration management tools
• Troubleshoot problems across a wide array of services and functional areas
• Build and maintain operational tools for deployment, monitoring, and analysis of AWS infrastructure and systems
• Perform infrastructure cost analysis and optimization
Qualifications:
• At least 3-5 years of experience building and maintaining AWS infrastructure (VPC, EC2, Security Groups, IAM, ECS, CodeDeploy, CloudFront, S3)
• Strong understanding of how to secure AWS environments and meet compliance requirements
• Expertise in configuration management
• Hands-on experience deploying and managing infrastructure with Terraform
• Solid foundation in networking and Linux administration
• Experience with Docker, GitHub, Jenkins, ELK and deploying applications on AWS
• Ability to learn/use a wide variety of open source technologies and tools
• Strong bias for action and ownership
We are looking for an experienced DevOps engineer who will help our team establish a DevOps practice. You will work closely with the technical lead (and/or CTO) to identify and establish DevOps practices in the company.
You will help us build scalable, efficient cloud infrastructure. You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices.
Responsibilities
- Implement and own the CI.
- Manage CD tooling.
- Implement and maintain monitoring and alerting.
- Build and maintain highly available production systems.
Qualification
- B.Tech in IT
- Experience: 3-5 years
- Scripting: PowerShell and either JavaScript or Python
- Hands-on experience with Kubernetes and Docker
- Good to have: Azure or AWS
- Hands-on experience with any of these DB technologies: NoSQL administration, Cosmos DB, MongoDB, MariaDB
- Good to have: analytics knowledge









