About the job
Our goal
We are reinventing the future of MLOps. The Censius Observability platform gives businesses greater visibility into how their AI makes decisions. We enable explanations of predictions, continuous monitoring of drift, and assessment of fairness in the real world. (TL;DR: build the best ML monitoring tool)
The culture
We believe in constantly iterating and improving our team culture, just like our product. We have found a good balance between async and sync work: the default is still Notion docs over meetings, but at the same time we recognize that, as an early-stage startup, brainstorming together over calls gets us to results faster. If you enjoy taking ownership, moving quickly, and writing docs, you will fit right in.
The role:
Our engineering team is growing, and we are looking to bring on board a senior software engineer who can help us transition to the next phase of the company. As we roll out our platform to customers, you will be pivotal in refining our system architecture, ensuring the various tech stacks play well with each other, and smoothing the DevOps process.
On the platform, we use Python (ML-related jobs), Golang (core infrastructure), and NodeJS (user-facing). The platform is 100% cloud-native, and we use Envoy as a proxy (which will eventually lead to a service-mesh architecture).
By joining our team, you will get exposure to a broad swath of modern technologies while building an enterprise-grade ML platform in one of the most promising areas.
Responsibilities
- Be the bridge between engineering and product teams. Understand long-term product roadmap and architect a system design that will scale with our plans.
- Take ownership of converting product insights into detailed engineering requirements. Break these down into smaller tasks and work with the team to plan and execute sprints.
- Author high-quality, high-performance, unit-tested code running in a distributed, containerized environment.
- Continually evaluate and improve DevOps processes for a cloud-native codebase.
- Review PRs, mentor others, and proactively take initiative to improve our team's shipping velocity.
- Leverage your industry experience to champion engineering best practices within the organization.
Qualifications
Work Experience
- 3+ years of industry experience (2+ years in a senior engineering role), preferably with some exposure to leading remote development teams.
- Proven track record of building large-scale, high-throughput, low-latency production systems, with 3+ years working with customers, architecting solutions, and delivering end-to-end products.
- 3+ years of fluency writing production-grade Go or Python in a microservice architecture with containers/VMs.
- 3+ years of DevOps experience (Kubernetes, Docker, Helm and public cloud APIs)
- Worked with relational (SQL) as well as non-relational databases (MongoDB or CouchDB) in a production environment.
- (Bonus: worked with big data in data lakes/warehouses).
- (Bonus: built an end-to-end ML pipeline)
Skills
- Strong documentation skills. As a remote team, we heavily rely on elaborate documentation for everything we are working on.
- Ability to motivate, mentor, and lead others (we have a flat team structure, but the team would rely upon you to make important decisions)
- Strong independent contributor as well as a team player.
- Working knowledge of ML and familiarity with MLOps concepts
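For candidates newer to MLOps, "drift monitoring" here means comparing live feature distributions against a training-time baseline. Below is a minimal illustrative sketch in Python; the data, feature, and threshold are hypothetical and not Censius internals.

```python
# Hypothetical drift check: compare a live feature sample against its
# training baseline with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(baseline: np.ndarray, live: np.ndarray, alpha: float = 0.05) -> bool:
    """Return True if the live distribution differs significantly from the baseline."""
    statistic, p_value = ks_2samp(baseline, live)
    return p_value < alpha

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    baseline = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training-time feature values
    live = rng.normal(loc=0.4, scale=1.0, size=1_000)       # shifted production values
    print("drift detected:", feature_drifted(baseline, live))
```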
Benefits
- Competitive Salary
- Work Remotely
- Health insurance
- Unlimited Time Off
- Support for continual learning (free books and online courses)
- Reimbursement for streaming services (think Netflix)
- Reimbursement for gym or physical activity of your choice
- Flex hours
- Leveling Up Opportunities
You will excel in this role if
- You have a product mindset. You understand, care about, and can relate to our customers.
- You take ownership, collaborate, and follow through to the very end.
- You love solving difficult problems, can stand your ground, and know how to get what you need from engineers.
- You resonate with our core values of innovation, curiosity, accountability, trust, fun, and social good.

Scripting and Automation:
• A strong proficiency in at least one scripting language (e.g., Python, Bash, PowerShell) is required.
• Candidates must possess an in-depth ability to design, write, and implement complex automation logic, not just basic scripts.
• Proven experience in automating DevOps processes, environment provisioning, and configuration management is essential.
Cloud Platform (AWS Preferred):
• Extensive hands-on experience with Amazon Web Services (AWS) is highly preferred.
• Candidates must be able to demonstrate expert-level knowledge of core AWS services and articulate their use cases.
• Excellent debugging and problem-solving skills within the AWS ecosystem are mandatory. The ability to diagnose and resolve issues efficiently is a key requirement.
Infrastructure as Code (IaC – Terraform Preferred):
• Expert-level knowledge and practical experience with Terraform are required.
• Candidates must have a deep understanding of how to write scalable, modular, and reusable Terraform code.
Containerization and Orchestration (Kubernetes Preferred):
• Advanced, hands-on experience with Kubernetes is mandatory.
• Candidates must be proficient in solving complex, production-level issues related to deployments, networking, and cluster management.
• A solid foundational knowledge of Docker is required.
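To give a flavor of the production-level Kubernetes troubleshooting called for above, here is a small, hedged sketch using the official Kubernetes Python client to surface pods that are not healthy; the kubeconfig and cluster are assumed to exist.

```python
# Illustrative sketch: list pods that are not in the Running/Succeeded phase,
# a common first step when debugging deployments. Requires a valid kubeconfig.
from kubernetes import client, config

def unhealthy_pods():
    config.load_kube_config()  # or config.load_incluster_config() inside a cluster
    v1 = client.CoreV1Api()
    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        if pod.status.phase not in ("Running", "Succeeded"):
            yield pod.metadata.namespace, pod.metadata.name, pod.status.phase

if __name__ == "__main__":
    for namespace, name, phase in unhealthy_pods():
        print(f"{namespace}/{name}: {phase}")
```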
Job Summary:
We are seeking a highly skilled and proactive DevOps Engineer with 4+ years of experience to join our dynamic team. This role requires strong technical expertise across cloud infrastructure, CI/CD pipelines, container orchestration, and infrastructure as code (IaC). The ideal candidate should also have direct client-facing experience and a proactive approach to managing both internal and external stakeholders.
Key Responsibilities:
- Collaborate with cross-functional teams and external clients to understand infrastructure requirements and implement DevOps best practices.
- Design, build, and maintain scalable cloud infrastructure on AWS (EC2, S3, RDS, ECS, etc.).
- Develop and manage infrastructure using Terraform or CloudFormation.
- Manage and orchestrate containers using Docker and Kubernetes (EKS).
- Implement and maintain CI/CD pipelines using Jenkins or GitHub Actions.
- Write robust automation scripts using Python and Shell scripting (see the illustrative sketch after this list).
- Monitor system performance and availability, and ensure high uptime and reliability.
- Execute and optimize SQL queries for MSSQL and PostgreSQL databases.
- Maintain clear documentation and provide technical support to stakeholders and clients.
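As a hedged example of the Python automation against AWS referenced above, the sketch below reports EC2 instances missing a required tag. The tag name, region, and policy are assumptions for illustration only.

```python
# Report EC2 instances that are missing a required "Owner" tag.
import boto3

def untagged_instances(region: str = "us-east-1", required_tag: str = "Owner"):
    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate():
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                tags = {t["Key"] for t in instance.get("Tags", [])}
                if required_tag not in tags:
                    yield instance["InstanceId"], instance["State"]["Name"]

if __name__ == "__main__":
    for instance_id, state in untagged_instances():
        print(f"{instance_id} ({state}) is missing the Owner tag")
```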
Required Skills:
- Minimum 4+ years of experience in a DevOps or related role.
- Proven experience in client-facing engagements and communication.
- Strong knowledge of AWS services – EC2, S3, RDS, ECS, etc.
- Proficiency in Infrastructure as Code using Terraform or CloudFormation.
- Hands-on experience with Docker and Kubernetes (EKS).
- Strong experience in setting up and maintaining CI/CD pipelines with Jenkins or GitHub Actions.
- Solid understanding of SQL and working experience with MSSQL and PostgreSQL.
- Proficient in Python and Shell scripting.
Preferred Qualifications:
- AWS Certifications (e.g., AWS Certified DevOps Engineer) are a plus.
- Experience working in Agile/Scrum environments.
- Strong problem-solving and analytical skills.
What does a successful Senior DevOps Engineer do at Fiserv?
This role’s focus will be on contributing to and enhancing our DevOps environment within the Issuer Solution group, where our cross-functional Scrum teams deliver solutions built on cutting-edge mobile technology and products. You will be expected to provide support across the wider business unit, leading DevOps practices and initiatives.
What will you do:
• Build, manage, and deploy CI/CD pipelines.
• Work with Helm charts, Rundeck, and OpenShift.
• Strive for continuous improvement and build continuous integration, continuous delivery, and continuous deployment pipelines.
• Implement various development, testing, and automation tools and IT infrastructure.
• Optimize and automate release/development cycles and processes.
• Be part of and help promote our DevOps culture.
• Identify and implement continuous improvements to the development practice
What you must have:
• 3+ years of experience in DevOps with hands-on experience in the following:
- Writing automation scripts for deployments and housekeeping using shell scripts (Bash) and Ansible playbooks
- Building Docker images and running/managing Docker instances (a rough housekeeping sketch follows after this list)
- Building Jenkins pipelines using Groovy scripts
- Working knowledge of Kubernetes, including application deployments, managing application configurations, and persistent volumes
• Has a good understanding of infrastructure as code
• Ability to write and update documentation
• Demonstrates a logical, process-oriented approach to problems and troubleshooting
• Ability to collaborate with multiple development teams
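A small, hedged example of the housekeeping scripting and Docker image management mentioned above, using the docker SDK for Python; what gets pruned in practice is a policy decision, so treat this as a sketch.

```python
# Illustrative housekeeping script: remove stopped containers and dangling images.
# Assumes the docker SDK for Python is installed and the Docker daemon is reachable.
import docker

def clean_up():
    client = docker.from_env()
    removed = client.containers.prune()                       # delete stopped containers
    pruned = client.images.prune(filters={"dangling": True})  # delete untagged images
    print("containers removed:", removed.get("ContainersDeleted") or [])
    print("space reclaimed (bytes):", pruned.get("SpaceReclaimed", 0))

if __name__ == "__main__":
    clean_up()
```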
What you are preferred to have:
• 8+ years of development experience
• Jenkins administration experience
• Hands-on experience in building and deploying helm charts
Process Skills:
• Should have worked on Agile projects
Your challenge
As a DevOps Engineer, you’re responsible for automating the deployment of our software solutions. You interact with software engineers, functional product managers, and ICT professionals daily. Using your technical skills, you provide internal tooling for development and QA teams around the globe.
We believe in an integrated approach, where every team member is involved in all steps of the software development life cycle: analysis, architectural design, programming, and maintenance. We expect you to be the proud owner of your work and take responsibility for it.
Together with a tight-knit group of 5-6 team players, you develop, maintain and support key elements of our infrastructure:
- Continuous integration and production systems (see the smoke-test sketch after this list)
- Release and build management
- Package management
- Containerization and orchestration
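As one concrete flavor of the continuous-integration work above, a pipeline stage often ends with a post-deployment smoke test. A hedged sketch follows; the URL and timing are placeholders, not OMP infrastructure.

```python
# Minimal post-deployment smoke test: poll a health endpoint until it reports
# healthy or a timeout is reached. Intended to run as a late CI/CD pipeline step.
import sys
import time
import requests

def wait_for_healthy(url: str, timeout_s: int = 120, interval_s: int = 5) -> bool:
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=3).status_code == 200:
                return True
        except requests.RequestException:
            pass                      # service not up yet; keep polling
        time.sleep(interval_s)
    return False

if __name__ == "__main__":
    healthy = wait_for_healthy("https://example.internal/healthz")  # placeholder URL
    sys.exit(0 if healthy else 1)     # non-zero exit fails the pipeline stage
```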
Your team
As our new DevOps Engineer, you’ll be part of a large, fast-growing, international team located in Belgium (Antwerp, Ghent, Wavre), Spain (Barcelona), Ukraine (Lviv), and the US (Atlanta). Software Development creates leading software solutions that make a difference to our customers. We make smart, robust, and scalable software to solve complex supply chain planning challenges.
Your profile
We are looking for someone who meets the following qualifications:
- A bachelor’s or master’s degree in a field related to Computer Science.
- Pride in developing high-quality solutions and taking responsibility for their maintenance.
- Minimum 6 years' experience in a similar role
- Good knowledge of the following technologies: Kubernetes, PowerShell or bash scripting, Jenkins, Azure Pipelines or similar automation systems, Git.
- Familiarity with the Cloud Native Landscape. Terraform, Ansible, and Helm are tools we use daily.
- Supportive towards users.
Bonus points if you have:
- A background in DevOps, ICT, or technical support.
- Customer support experience or other relevant work experience, including internships.
- Understanding of Windows networks and Active Directory.
- Experience with transferring applications into the cloud.
- Programming skills.
Soft skills
Teamwork
Pragmatic attitude
Passionate
Analytical thinker
Tech Savvy
Fast Learner
Hard skills
Kubernetes
CI/CD
Git
PowerShell
Your future
At OMP, we’re eager to find your best career fit. Our talent management program supports your personal development and empowers you to build a career in line with your ambitions.
Many of our team members who start as DevOps Engineers grow into roles in DevOps/Cloud architecture, project management, or people management.
Company Overview
Adia Health revolutionizes clinical decision support by enhancing diagnostic accuracy and personalizing care. It modernizes the diagnostic process by automating optimal lab test selection and interpretation, utilizing a combination of expert medical insights, real-world data, and artificial intelligence. This approach not only streamlines the diagnostic journey but also ensures precise, individualized patient care by integrating comprehensive medical histories and collective platform knowledge.
Position Overview
We are seeking a talented and experienced Site Reliability Engineer/DevOps Engineer to join our dynamic team. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our infrastructure and applications. You will collaborate closely with development, operations, and product teams to automate processes, implement best practices, and improve system reliability.
Key Responsibilities
- Design, implement, and maintain highly available and scalable infrastructure solutions using modern DevOps practices.
- Automate deployment, monitoring, and maintenance processes to streamline operations and increase efficiency.
- Monitor system performance and troubleshoot issues, ensuring timely resolution to minimize downtime and impact on users.
- Implement and manage CI/CD pipelines to automate software delivery and ensure code quality.
- Manage and configure cloud-based infrastructure services to optimize performance and cost.
- Collaborate with development teams to design and implement scalable, reliable, and secure applications.
- Implement and maintain monitoring, logging, and alerting solutions to proactively identify and address potential issues.
- Conduct periodic security assessments and implement appropriate measures to ensure the integrity and security of systems and data (see the illustrative audit sketch after this list).
- Continuously evaluate and implement new tools and technologies to improve efficiency, reliability, and scalability.
- Participate in on-call rotation and respond to incidents promptly to ensure system uptime and availability.
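To illustrate the security-assessment item above, here is a hedged boto3 sketch that flags S3 buckets with no default server-side encryption configured; the check and permissions are simplified for illustration.

```python
# Illustrative security check: list S3 buckets without a default
# server-side encryption configuration. Requires AWS credentials with
# s3:GetEncryptionConfiguration permission.
import boto3
from botocore.exceptions import ClientError

def unencrypted_buckets():
    s3 = boto3.client("s3")
    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]
        try:
            s3.get_bucket_encryption(Bucket=name)
        except ClientError as err:
            if err.response["Error"]["Code"] == "ServerSideEncryptionConfigurationNotFoundError":
                yield name  # no default encryption configured

if __name__ == "__main__":
    for name in unencrypted_buckets():
        print("bucket without default encryption:", name)
```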
Qualifications
- Bachelor's degree in Computer Science, Engineering, or related field
- Proven experience (5+ years) as a Site Reliability Engineer, DevOps Engineer, or similar role
- Strong understanding of cloud computing principles and experience with AWS
- Experience building and supporting complex CI/CD pipelines using GitHub
- Experience building and supporting infrastructure as code using Terraform
- Proficiency in scripting and automation tools
- Solid understanding of networking concepts and protocols
- Understanding of security best practices and experience implementing security controls in cloud environments
- Knowledge of modern security and compliance frameworks such as SOC 2, HIPAA, and HITRUST is a solid advantage.
Please Apply - https://zrec.in/7EYKe?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adopt a DevOps culture by focusing on a long-term DevOps roadmap: we identify the technical and cultural issues in the journey of implementing DevOps practices and work with the respective teams to fix them and increase overall productivity. We also run training sessions for developers to help them understand the importance of DevOps.

Our services include DevOps, DevSecOps, FinOps, cost optimization, CI/CD, observability, cloud security, containerization, cloud migration, site reliability, performance optimization, SIEM and SecOps, serverless automation, Well-Architected Reviews, MLOps, and Governance, Risk & Compliance. We assess a company's technology architecture, security, governance, compliance, and DevOps maturity model, and help optimize cloud cost, streamline the technology architecture, and set up processes to improve the availability and reliability of websites and applications. We set up tools for monitoring, logging, and observability, and focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer / SRE
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediately
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems. The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability. This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP, Azure), along with scripting skills in Python or Go. The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (Datadog, New Relic, OpenTelemetry).
Strong communication, incident management skills, and a collaborative approach are essential. Experience in team leadership and multi-client engagement is a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as an SRE/DevOps engineer with a proven track record of handling large-scale production environments
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong hands-on experience managing large-scale production systems
- Strong production troubleshooting skills and composure in high-pressure situations
- Strong experience with databases (PostgreSQL, MongoDB, Elasticsearch, Kafka)
- Has worked on making production systems more scalable, highly available, and fault-tolerant
- Hands-on experience with ELK or other logging and observability tools
- Hands-on experience with Prometheus, Grafana, and Alertmanager, and with on-call processes like PagerDuty (see the query sketch after this list)
- Problem-solving mindset
- Strong skills across K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure, etc.
- Good with Python/Go scripting and automation
- Strong fundamentals in DNS, networking, and Linux
- Experience with APM tools like New Relic, Datadog, and OpenTelemetry
- Good experience with incident response, incident management, and writing detailed RCAs
- Experience with application best practices for making apps more reliable and fault-tolerant
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
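As a hedged illustration of the Prometheus/Alertmanager item above, a quick instant query against the Prometheus HTTP API; the server URL and PromQL expression are placeholders.

```python
# Illustrative Prometheus check: query the HTTP API for targets reporting up == 0.
import requests

PROMETHEUS_URL = "http://prometheus.example.internal:9090"  # placeholder

def down_targets():
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": "up == 0"},
        timeout=5,
    )
    resp.raise_for_status()
    for series in resp.json()["data"]["result"]:
        yield series["metric"].get("job", "unknown"), series["metric"].get("instance", "unknown")

if __name__ == "__main__":
    for job, instance in down_targets():
        print(f"target down: job={job} instance={instance}")
```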
Good to have
- Team-leading Experience
- Multiple Client Handling
- Requirements gathering from clients
- Good Communication
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Will serve as a direct point of contact for multiple clients.
- Able to handle the unique technical needs and challenges of two or more clients concurrently.
- Involve both direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.
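For the production debugging responsibilities above, one routine task is pulling recent logs from pods that keep restarting, often the first artifact attached to an RCA. A hedged sketch with the official Kubernetes Python client; the namespace is a placeholder and single-container pods are assumed.

```python
# Illustrative debugging helper: print the last log lines for pods that have restarted.
from kubernetes import client, config

def logs_of_restarting_pods(namespace: str = "production", tail: int = 50):
    config.load_kube_config()
    v1 = client.CoreV1Api()
    for pod in v1.list_namespaced_pod(namespace).items:
        restarts = sum(cs.restart_count for cs in (pod.status.container_statuses or []))
        if restarts > 0:
            # Assumes single-container pods; pass container= for multi-container pods.
            log = v1.read_namespaced_pod_log(pod.metadata.name, namespace, tail_lines=tail)
            print(f"--- {pod.metadata.name} ({restarts} restarts) ---\n{log}")

if __name__ == "__main__":
    logs_of_restarting_pods()
```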
Experience with Linux
Experience using Python or Shell scripting (for automation)
Hands-on experience with implementing CI/CD processes
Experience working with at least one cloud platform (AWS, Azure, or Google Cloud)
Experience working with configuration management tools such as Ansible and Chef
Experience working with the containerization tool Docker
Experience working with the container orchestration tool Kubernetes
Experience in source control management, including SVN and/or Bitbucket and GitHub
Experience with the setup and management of monitoring tools like Nagios, Sensu, and Prometheus, or other popular tools
Hands-on experience in Linux, a scripting language, and AWS is mandatory
Troubleshoot and triage development and production issues
About the client:
Asia’s largest global sports media property in history with a global broadcast to 150+ countries. As the world’s largest martial arts organization, they are a celebration of Asia’s greatest cultural treasure, and its deep-rooted Asian values of integrity, humility, honor, respect, courage, discipline, and compassion. Has achieved some of the highest TV ratings and social media engagement metrics across Asia with its unique brand of Asian values, world-class athletes, and world-class production. Broadcast partners include Turner Sports, Star India, TV Tokyo, Fox Sports, ABS-CBN, Astro, ClaroSports, Bandsports, Startimes, Premier Sports, Thairath TV, Skynet, Mediacorp, OSN, and more. Institutional investors include Sequoia Capital, Temasek Holdings, GIC, Iconiq Capital, Greenoaks Capital, and Mission Holdings. Currently has offices in Singapore, Tokyo, Los Angeles, Shanghai, Milan, Beijing, Bangkok, Manila, Jakarta, and Bangalore.
Position: DevOps Engineer – SDE3
As part of the engineering team, you would be expected to have deep technology expertise with a passion for building highly scalable products. This is a unique opportunity where you can impact the lives of people across 150+ countries!
Responsibilities
• Collaborate in large-scale systems design discussions.
• Deploy and maintain in-house/customer systems, ensuring high availability, performance, and optimal cost.
• Automate build pipelines, ensuring the right architecture for CI/CD.
• Work with engineering leaders to ensure cloud security.
• Develop standard operating procedures for various facets of infrastructure services (CI/CD, Git branching, SAST, quality gates, auto-scaling).
• Perform and automate regular backups of servers and databases; ensure rollback and restore capabilities are real-time and zero-downtime (a rough backup sketch follows after this list).
• Lead the entire DevOps charter for ONE Championship. Mentor other DevOps engineers. Ensure industry standards are followed.
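A hedged sketch of the backup automation item above: dump a PostgreSQL database and upload it to Azure Blob Storage. The connection strings, names, and paths are placeholders, not ONE Championship's setup.

```python
# Illustrative backup job: pg_dump to a timestamped file, then upload to Azure Blob Storage.
# Assumes pg_dump is on PATH and AZURE_STORAGE_CONNECTION_STRING is set.
import os
import subprocess
from datetime import datetime, timezone
from azure.storage.blob import BlobServiceClient

def backup_database(db_url: str, container: str = "db-backups") -> str:
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dump_path = f"/tmp/backup-{stamp}.sql"
    subprocess.run(["pg_dump", "--file", dump_path, db_url], check=True)

    service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
    blob = service.get_blob_client(container=container, blob=os.path.basename(dump_path))
    with open(dump_path, "rb") as handle:
        blob.upload_blob(handle, overwrite=True)
    return blob.url

if __name__ == "__main__":
    print("uploaded:", backup_database("postgresql://user:pass@db.example.internal/app"))  # placeholder DSN
```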
Requirements
• Overall 5+ years of experience as a DevOps Engineer/Site Reliability Engineer
• B.E/B.Tech in CS or equivalent streams from an institute of repute
• Experience in Azure is a must. AWS experience is a plus
• Experience in Kubernetes, Docker, and containers
• Proficiency in developing and deploying fully automated environments using Puppet/Ansible and Terraform
• Experience with monitoring tools like Nagios/Icinga, Prometheus, Alertmanager, and New Relic
• Good knowledge of source code control (git)
• Expertise in Continuous Integration and Continuous Deployment setup using Azure Pipeline or Jenkins
• Strong experience in programming languages. Python is preferred
• Experience in scripting and unit testing
• Basic knowledge of SQL & NoSQL databases
• Strong Linux fundamentals
• Experience with SonarQube, Locust, and BrowserStack is a plus
Job Brief:
We are looking for candidates who have development experience and have delivered CI/CD-based projects. You should have good hands-on experience with Jenkins master/agent (master-slave) architecture and have used AWS-native services such as CodeCommit, CodeBuild, CodeDeploy, and CodePipeline. You should also have experience setting up cross-platform CI/CD pipelines spanning different cloud platforms, or a mix of on-premises and cloud environments.
Job Location:
Pune.
Job Description:
- Hands-on with AWS (Amazon Web Services) cloud, DevOps services, and CloudFormation.
- Experience interacting with customers.
- Excellent communication skills.
- Hands-on experience creating and managing Jenkins jobs and Groovy scripting (see the sketch after this list).
- Experience in setting up cloud-agnostic and cloud-native CI/CD pipelines.
- Experience in Maven.
- Experience in scripting languages like Bash, PowerShell, and Python.
- Experience in automation tools like Terraform, Ansible, Chef, and Puppet.
- Excellent troubleshooting skills.
- Experience with Docker and Kubernetes, including creating Dockerfiles.
- Hands-on with version control systems like GitHub, GitLab, TFS, Bitbucket, etc.
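As a hedged illustration of managing Jenkins jobs programmatically (complementing the Groovy pipeline work above), a small sketch using the python-jenkins library; the server URL, credentials, and job name are placeholders.

```python
# Illustrative script: trigger a parameterized Jenkins job and report its last completed build.
# Assumes the python-jenkins package and an API token with build permissions.
import jenkins

server = jenkins.Jenkins(
    "https://jenkins.example.internal",   # placeholder URL
    username="ci-bot",                    # placeholder credentials
    password="api-token",
)

JOB = "deploy-service"                               # placeholder job name
server.build_job(JOB, {"ENVIRONMENT": "staging"})    # queue a build with parameters
info = server.get_job_info(JOB)
print("last completed build:", info["lastCompletedBuild"])
```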
DevOps Engineer
Skills
- Building a scalable and highly available infrastructure for data science
- Knows data science project workflows
- Hands-on with deployment patterns for online/offline predictions (server/serverless)
- Experience with either Terraform or Kubernetes
- Experience with ML deployment frameworks like Kubeflow, MLflow, and SageMaker
- Working knowledge of Jenkins or a similar tool
Responsibilities
- Owns all the ML cloud infrastructure (AWS)
- Help build out an entire CI/CD ecosystem with auto-scaling
- Work with a testing engineer to design testing methodologies for ML APIs
- Ability to research and implement new technologies
- Help with cost optimization of infrastructure
- Knowledge sharing
Nice to Have
- Develop APIs for machine learning (a minimal server sketch follows below)
- Can write Python servers for ML systems with API frameworks
- Understanding of task queue frameworks like Celery
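For the "Python servers for ML systems with API frameworks" item, a minimal hedged sketch using FastAPI; the model is a stub, whereas a real service would load a trained artifact from a registry or object storage.

```python
# Minimal ML prediction server sketch with FastAPI. The "model" here is a stub.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="prediction-service")

class PredictRequest(BaseModel):
    features: list[float]

class PredictResponse(BaseModel):
    score: float

def stub_model(features: list[float]) -> float:
    # Placeholder inference: a weighted sum, not a real model.
    return sum(0.1 * value for value in features)

@app.post("/predict", response_model=PredictResponse)
def predict(request: PredictRequest) -> PredictResponse:
    return PredictResponse(score=stub_model(request.features))

# Run locally with: uvicorn app:app --reload   (assuming this file is saved as app.py)
```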