
Key responsibilities include, but are not limited to:
Help identify and drive speed, performance, scalability, and reliability optimizations based on experience and lessons learned from production incidents.
Work in an agile DevSecOps environment to create, maintain, monitor, and automate the overall solution deployment.
Understand and explain the effect of product architecture decisions on systems.
Identify issues and/or opportunities for improvements that are common across multiple services/teams.
This role will require weekend deployments.
Skills and Qualifications:
1. 3+ years of experience in a DevOps end-to-end development process with heavy focus on service monitoring and site reliability engineering work.
2. Advanced knowledge of programming/scripting languages (Bash, Perl, Python, Node.js).
3. Experience in Agile/Scrum enterprise-scale software development, including working with Git, JIRA, Confluence, etc.
4. Advanced experience with core microservice technology (RESTful development).
5. Working knowledge of advanced AI/ML tools is a plus.
6. Working knowledge of one or more cloud services: Amazon AWS, Microsoft Azure.
7. Bachelor's or Master's degree in Computer Science, or equivalent related field experience.
Key Behaviours / Attitudes:
Professional curiosity and a desire to develop a deep understanding of services and technologies.
Experience building and running systems to drive high availability, performance, and operational improvements.
Excellent written and oral communication skills, including the ability to ask pertinent questions and to assess, aggregate, and report the responses.
Ability to quickly grasp and analyze complex and rapidly changing systems.
Soft skills:
1. Self-motivated and self-managing.
2. Excellent communication / follow-up / time management skills.
3. Ability to fulfill role/duties independently within defined policies and procedures.
4. Ability to multitask and balance multiple priorities while maintaining a high level of customer satisfaction.
5. Be able to work in an interrupt-driven environment.
Work with Dori AI's world-class technology to develop, implement, and support Dori's global infrastructure.
As a member of the IT organization, assist with the analysis of existing complex programs and formulate logic for new complex internal systems. Prepare flowcharts, write code, and test and debug programs. Develop conversion and system implementation plans. Recommend changes to development, maintenance, and system standards.
Leading contributor individually and as a team member, providing direction and mentoring to others. Work is non-routine and very complex, involving the application of advanced technical/business skills in a specialized area. BS or equivalent experience in programming on enterprise or department servers or systems.

About Dori AI
At Dori, we develop platforms and services that enable artificial intelligence centered application development for mobile edge devices, embedded IoT devices, on-premise servers, and cloud platforms. The company provides a turnkey solution to add intelligence in applications by simplifying model development and deployment.
We have developed an AI-as-a-service platform that provides prebuilt and custom engines to evaluate, deploy, and monitor artificial intelligence systems for consumer and enterprise applications. Application developers can rapidly develop and deploy AI-enabled applications for multiple operating systems, hardware architectures, and cloud infrastructures.
Similar jobs
Job Title : DevOps Engineer
Experience : 3+ Years
Location : Mumbai
Employment Type : Full-time
Job Overview :
We’re looking for an experienced DevOps Engineer to design, build, and manage Kubernetes-based deployments for a microservices data discovery platform.
The ideal candidate has strong hands-on expertise with Helm, Docker, CI/CD pipelines, and cloud networking — and can handle complex deployments across on-prem, cloud, and air-gapped environments.
Mandatory Skills :
✅ Helm, Kubernetes, Docker
✅ Jenkins, ArgoCD, GitOps
✅ Cloud Networking (VPCs, bare metal vs. VMs)
✅ Storage (MinIO, Ceph, NFS, S3/EBS)
✅ Air-gapped & multi-tenant deployments
Key Responsibilities :
- Build and customize Helm charts from scratch.
- Implement CI/CD pipelines using Jenkins, ArgoCD, GitOps.
- Manage containerized deployments across on-prem/cloud setups.
- Work on air-gapped and restricted environments.
- Optimize for scalability, monitoring, and security (Prometheus, Grafana, RBAC, HPA).

Location: Bangalore
Experience: 2–5 years
Type: Full-time | On-site
Start: Immediate
Why this role exists
Most systems don’t fail because of one big outage.
They fail because reliability is treated as an afterthought.
Right now, uptime depends too much on individual heroics.
That doesn’t scale.
This role exists to build a reliability system where:
- Uptime is predictable
- Failures are contained
- Escalations don’t depend on leadership
What you’ll do
You will not just monitor systems.
You will own reliability as a product.
1. Drive uptime to production-grade reliability
- Raise system uptime to a 99.9% customer-facing SLA within 4 months
- Define and track SLAs, SLOs, and error budgets
- Ensure reliability is measured from the customer’s perspective, not internal metrics
2. Build incident response as a system
- Set up a 24/7 incident response rotation across 3 engineers
- Eliminate dependency on leadership (no single escalation point)
- Define:
  - Incident severity levels
  - Response playbooks
  - Escalation protocols
- Ensure fast detection → containment → resolution
3. Contain and fix erratic system behavior
- Identify and resolve:
  - Latency spikes
  - Downtime incidents
  - Integration failures
- Build guardrails to prevent recurrence
- Focus on root cause elimination, not temporary fixes
4. Create continuous reliability feedback loops
- Work closely with engineering teams to:
  - Surface recurring failure patterns
  - Improve build quality
  - Reduce production bugs
- Ensure learnings from incidents directly improve future releases
5. Improve observability and monitoring
- Build dashboards and alerts for:
  - System health
  - Performance metrics
  - Failure signals
- Ensure issues are detected before customers report them
6. Reduce operational fragility
- Remove single points of failure (people, systems, workflows)
- Improve system resilience across:
  - Deployments
  - Integrations
  - Runtime environments
What success looks like
- Uptime reaches 99.9%+ reliably
- Incidents are:
  - Detected early
  - Contained quickly
  - Resolved permanently
- No dependency on a single individual for escalation
- System behavior becomes predictable and stable
- Engineering teams ship with higher reliability confidence
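To make the 99.9% uptime target concrete, here is a minimal sketch of the error-budget arithmetic behind it. The function names (error_budget, budget_remaining) and the 30-day window are assumptions chosen for illustration, not part of this role's actual tooling. At 99.9% availability, a 30-day window allows roughly 43 minutes of downtime.

```python
from datetime import timedelta


def error_budget(slo: float, window: timedelta) -> timedelta:
    """Return the downtime allowed in `window` for an availability target `slo`.

    `slo` is a fraction, e.g. 0.999 for "three nines" (hypothetical helper).
    """
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in the range (0, 1]")
    return window * (1.0 - slo)


def budget_remaining(slo: float, window: timedelta, downtime: timedelta) -> timedelta:
    """Return the unspent error budget; a negative value means the SLO is breached."""
    return error_budget(slo, window) - downtime


if __name__ == "__main__":
    window = timedelta(days=30)  # assumed measurement window
    budget = error_budget(0.999, window)
    print(f"Allowed downtime: {budget.total_seconds() / 60:.1f} minutes")

    # Example: 20 minutes of customer-facing downtime so far in this window.
    left = budget_remaining(0.999, window, timedelta(minutes=20))
    print(f"Budget remaining: {left.total_seconds() / 60:.1f} minutes")
```

Measuring downtime from the customer's perspective (failed requests or failed synthetic checks rather than host-level metrics) is what would feed the downtime argument in practice.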
Who you are
- You have 2-5 years of experience in SRE / DevOps / backend systems
- You have worked on production systems with real uptime expectations
- You think in:
  - Systems
  - Failure modes
  - Trade-offs
- You are comfortable debugging live, high-pressure environments
What will make you stand out
- Experience with:
  - Distributed systems
  - Cloud infrastructure (AWS / Azure / GCP)
  - Monitoring & alerting tools
- Have built or improved:
  - Incident response systems
  - Reliability frameworks
- Strong debugging skills across:
  - Infra
  - Application
  - Integrations
Compensation
₹60,000/month (fixed)
(Aligned with role scope and impact expectations)
Why join
- You will define reliability standards for a production AI platform
- Your work directly impacts:
  - Customer trust
  - Product performance
  - Enterprise readiness
- You will move the system from reactive → predictable
What this role is not
- Not just monitoring dashboards
- Not limited to handling tickets
- Not dependent on escalation to leadership
What this role is
- A builder of reliability systems
- A guardian of uptime and performance
- A multiplier of engineering quality
One question to self-evaluate
Can you build a system where downtime is rare, predictable, and never dependent on a single person?
Job role: Systems Engineer (L2)
Location: Remote/Bengaluru
Experience: 3-6 years
About the Role:
We are looking for a Systems Engineer (L2) to join our growing infrastructure team. You will be responsible for managing, optimizing, and scaling our cloud communication platform that handles billions of messages and voice calls annually.
Key Responsibilities:
— Design, deploy, and maintain scalable cloud infrastructure — AWS/GCP/Azure.
— Manage and optimize networking components — routers, switches, firewalls, load balancers.
— Handle incident response — monitor systems, identify issues, resolve production problems.
— Implement DevOps best practices — CI/CD pipelines, automation, containerization.
— Collaborate with backend and product teams on system architecture.
— Performance tuning — ensure high availability and reliability of platform.
— Security management — implement security protocols and compliance standards.
Required Skills:
Technical:
- Linux/Unix administration — strong fundamentals
- Networking — TCP/IP, DNS, BGP, VoIP protocols
- Cloud platforms — AWS/GCP/Azure — minimum 2 years
- DevOps tools — Docker, Kubernetes, Jenkins, CI/CD
- Monitoring tools — Grafana, Prometheus, Kibana, Datadog
- Scripting — Python, Bash, Shell
- Databases — MySQL, PostgreSQL, Redis
Soft skills:
- Strong problem-solving under pressure
- Good communication — English written and verbal
- Team player — collaborative mindset
Good to Have:
- Experience in telecom/CPaaS/cloud communications industry
- Knowledge of VoIP, SIP, RTP protocols
- AI/ML operations experience
- CCNA/AWS certifications
Come Dive In
The DevOps Engineer will implement the tools and processes that enable DevOps.
Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement, to efficiently deliver high-quality solutions. The candidate should bridge the gap between Development and Operations teams, working with the development teams to meet acceptance criteria and to gather and document requirements. Candidates should be able to work in fast-paced, multi-disciplinary environments.
As a DevOps Engineer, You Will
● Work in a dynamic, agile team environment developing excellent new applications.
● Participate in design decisions, including new technology research and prototyping
● Collaborate closely with other AWS engineers and architects, cloud engineers, support teams, and other stakeholders
● Promote great Kubernetes and AWS platform design and quality
● Innovate new ideas to evolve our applications and processes
● Continuously analyze and evaluate our systems, products, and methods for potential improvements.
Mandatory Skills:
● Experience on Linux based infrastructure
● Experience with Amazon ECS
● Should have hands-on experience with containerized services
● Must know AWS CI/CD pipelines.
● Must know DevOps concepts and Agile principles
● Knowledge of Git, Docker, and Jenkins
● Knowledge of Infrastructure as Code.
● Experience in using Automation Tools
● Must have experience setting up a test-driven development environment.
● Working knowledge of Docker and Kubernetes
We recognize that asking you to give 100% of yourself daily requires us to show you the love.
PERKS: what can we offer you?
● Bi-Yearly performance audits and appraisals
● The flexibility of working days/hours
● 5 working days/week (Mon to Fri) and added payout for working Saturday
● Recognition and Appreciation
● A plethora of industry exposure and self-growth opportunities
Visit our site: www.cedcoss.com

Roles and Responsibilities:
• Gather and analyse cloud infrastructure requirements
• Automate system tasks and infrastructure using a scripting language (Shell/Python/Ruby preferred), with configuration management tools (Ansible/Puppet/Chef), service registry and discovery tools (Consul, Vault, etc.), infrastructure orchestration tools (Terraform, CloudFormation), and automated imaging tools (Packer)
• Support existing infrastructure, analyse problem areas and come up with solutions
• An eye for monitoring: the candidate should be able to look at complex infrastructure and figure out what to monitor and how.
• Work along with the Engineering team to help out with Infrastructure / Network automation needs.
• Deploy infrastructure as code and automate as much as possible
• Manage a team of DevOps engineers
Desired Profile:
• Understanding of provisioning of Bare Metal and Virtual Machines
• Working knowledge of configuration management tools like Ansible/Chef/Puppet, Redfish.
• Experience in scripting languages like Ruby/Python/Shell
• Working knowledge of IP networking, VPNs, DNS, load balancing, firewalling & IPS concepts
• Strong Linux/Unix administration skills.
• Self-starter who can implement with minimal guidance
• Hands-on experience setting up CI/CD from scratch in Jenkins
• Experience with managing K8s infrastructure
Client: Sony Corporation
Position: 3
Exp: 5-8 Years
DevOps Engineer
Location: Bangalore
Budget: 16.5 LPA Max
Gerrit, Jenkins, RabbitMQ, AWS, Linux, Python, Ansible, Tomcat, PostgreSQL, Grafana, Groovy, HTML, Shell, Apache, Git, ELK
Implement DevOps capabilities in cloud offerings using CI/CD toolsets and automation
Define and set development, test, release, update, and support processes for DevOps operation
Troubleshoot issues and fix code bugs
Coordinate and communicate within the team and with the client team
Select and deploy appropriate CI/CD tools
Strive for continuous improvement and build a continuous integration, continuous delivery, and continuous deployment pipeline (CI/CD pipeline)
Pre-requisite skills required:
Experience working on Linux based infrastructure
Experience scripting in at least 2 languages (Bash + Python/Ruby)
Working knowledge of various tools, open-source technologies, and cloud services
Experience with Docker, AWS (EC2, S3, IAM, EKS, Route 53), Ansible, Helm, Terraform
Experience with building, maintaining, and deploying Kubernetes environments and applications
Experience with build and release automation and dependency management; implementing CI/CD
Clear fundamentals in DNS, HTTP, HTTPS, microservices, monoliths, etc.
Job Description: (8-12 years)
○ Develop best practices for the team and take responsibility for architecture solutions and documentation operations, in order to meet the engineering department's quality standards
○ Participate in production outages, handle complex issues, and work towards resolution
○ Develop custom tools and integrations with existing tools to increase engineering productivity
Required Experience and Expertise
○ Deep understanding of Kernel, Networking and OS fundamentals
○ Strong experience in writing Helm charts.
○ Deep understanding of K8s.
○ Good knowledge of service meshes.
○ Good understanding of databases
Notice Period: 30 days max
- You have a Bachelor's degree in computer science or equivalent
- You have at least 7 years of DevOps experience.
- You have a deep understanding of AWS and cloud architectures/services.
- You have expertise within the container and container orchestration space (Docker, Kubernetes, etc.).
- You have experience working with infrastructure provisioning tools like CloudFormation, Terraform, Chef, Puppet, or others.
- You have experience enabling CI/CD pipelines using tools such as Jenkins, AWS CodePipeline, GitLab, or others.
- You bring a deep understanding and application of computer science fundamentals: data structures, algorithms, and design patterns.
- You have a track record of delivering successful solutions and collaborating with others.
- You take security into account when building new systems.











