About the job
Our goal
We are reinventing the future of MLOps. The Censius Observability platform gives businesses greater visibility into how their AI makes decisions so they can understand it better. We enable explanations of predictions, continuous monitoring of drift, and fairness assessment in the real world. (TL;DR: build the best ML monitoring tool.)
The culture
We believe in constantly iterating and improving our team culture, just like our product. We have found a good balance between async and sync work: the default is still Notion docs over meetings, but we recognize that, as an early-stage startup, brainstorming together over calls leads to results faster. If you enjoy taking ownership, moving quickly, and writing docs, you will fit right in.
The role:
Our engineering team is growing and we are looking to bring on board a senior software engineer who can help us transition to the next phase of the company. As we roll out our platform to customers, you will be pivotal in refining our system architecture, ensuring the various tech stacks play well with each other, and streamlining the DevOps process.
On the platform, we use Python (ML-related jobs), Golang (core infrastructure), and NodeJS (user-facing). The platform is 100% cloud-native and we use Envoy as a proxy (which will eventually lead to a service-mesh architecture).
By joining our team, you will gain exposure to a swath of modern technologies while building an enterprise-grade ML platform in a highly promising area.
Responsibilities
- Be the bridge between engineering and product teams. Understand long-term product roadmap and architect a system design that will scale with our plans.
- Take ownership of converting product insights into detailed engineering requirements. Break these down into smaller tasks and work with the team to plan and execute sprints.
- Author high-quality, high-performance, unit-tested code running in a distributed environment using containers.
- Continually evaluate and improve DevOps processes for a cloud-native codebase.
- Review PRs, mentor others and proactively take initiatives to improve our team's shipping velocity.
- Leverage your industry experience to champion engineering best practices within the organization.
Qualifications
Work Experience
- 3+ years of industry experience (2+ years in a senior engineering role), preferably with some experience leading remote development teams.
- Proven track record of building large-scale, high-throughput, low-latency production systems, with at least 3 years of working with customers, architecting solutions, and delivering end-to-end products.
- 3+ years of fluency in writing production-grade Go or Python in a microservice architecture with containers/VMs.
- 3+ years of DevOps experience (Kubernetes, Docker, Helm and public cloud APIs)
- Worked with relational (SQL) as well as non-relational databases (Mongo or Couch) in a production environment.
- (Bonus: worked with big data in data lakes/warehouses).
- (Bonus: built an end-to-end ML pipeline)
Skills
- Strong documentation skills. As a remote team, we heavily rely on elaborate documentation for everything we are working on.
- Ability to motivate, mentor, and lead others (we have a flat team structure, but the team would rely upon you to make important decisions)
- Strong independent contributor as well as a team player.
- Working knowledge of ML and familiarity with concepts of MLOps
Benefits
- Competitive Salary
- Work Remotely
- Health insurance
- Unlimited Time Off
- Support for continual learning (free books and online courses)
- Reimbursement for streaming services (think Netflix)
- Reimbursement for gym or physical activity of your choice
- Flex hours
- Leveling Up Opportunities
You will excel in this role if
- You have a product mindset. You understand, care about, and can relate to our customers.
- You take ownership, collaborate, and follow through to the very end.
- You love solving difficult problems, stand your ground, and get what you want from engineers.
- You resonate with our core values of innovation, curiosity, accountability, trust, fun, and social good.

Job Overview:
We are seeking a skilled and proactive Cloud Administrator with strong hands-on experience in Azure and AWS environments. The ideal candidate will have a solid background in Kubernetes administration, ArgoCD, MySQL, and GitLab, with a keen eye for cloud optimization and security best practices.
This role demands a self-motivated, quick learner who can confidently manage cloud infrastructure in production environments, communicate effectively with stakeholders, and escalate issues promptly when needed.
Key Skills & Qualifications:
- Strong experience with Azure and AWS cloud platforms.
- Proven expertise in Kubernetes administration (certification a plus).
- Experience with ArgoCD, GitLab CI/CD, and Helm.
- Proficiency in MySQL administration, including performance tuning and backups.
- Working knowledge of cloud security principles and cost optimization strategies.
- Ability to troubleshoot issues in high-pressure production environments.
- Strong communication and customer-facing skills.
- Quick learner with a positive attitude toward problem-solving.
About the Role:
We are seeking a talented and passionate DevOps Engineer to join our dynamic team. You will be responsible for designing, implementing, and managing scalable and secure infrastructure across multiple cloud platforms. The ideal candidate will have a deep understanding of DevOps best practices and a proven track record in automating and optimizing complex workflows.
Key Responsibilities:
Cloud Management:
- Design, implement, and manage cloud infrastructure on AWS, Azure, and GCP.
- Ensure high availability, scalability, and security of cloud resources.
Containerization & Orchestration:
- Develop and manage containerized applications using Docker.
- Deploy, scale, and manage Kubernetes clusters.
CI/CD Pipelines:
- Build and maintain robust CI/CD pipelines to automate the software delivery process.
- Implement monitoring and alerting to ensure pipeline efficiency.
Version Control & Collaboration:
- Manage code repositories and workflows using Git.
- Collaborate with development teams to optimize branching strategies and code reviews.
Automation & Scripting:
- Automate infrastructure provisioning and configuration using tools like Terraform, Ansible, or similar.
- Write scripts to optimize and maintain workflows.
Monitoring & Logging:
- Implement and maintain monitoring solutions to ensure system health and performance.
- Analyze logs and metrics to troubleshoot and resolve issues.
Required Skills & Qualifications:
- 3-5 years of experience with AWS, Azure, and Google Cloud Platform (GCP).
- Proficiency in containerization tools like Docker and orchestration tools like Kubernetes.
- Hands-on experience building and managing CI/CD pipelines.
- Proficient in using Git for version control.
- Experience with scripting languages such as Bash, Python, or PowerShell.
- Familiarity with infrastructure-as-code tools like Terraform or CloudFormation.
- Solid understanding of networking, security, and system administration.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and teamwork skills.
Preferred Qualifications:
- Certifications such as AWS Certified DevOps Engineer, Azure DevOps Engineer, or Google Professional DevOps Engineer.
- Experience with monitoring tools like Prometheus, Grafana, or ELK Stack.
- Familiarity with serverless architectures and microservices.

We are now seeking a talented and motivated individual to contribute to our product in the cloud data protection space. The ability to clearly comprehend customer needs in a cloud environment, excellent troubleshooting skills, and the ability to focus on problem resolution until completion are required.
Responsibilities Include:
Review proposed feature requirements
Create test plan and test cases
Analyze performance, diagnose, and troubleshoot issues
Enter and track defects
Interact with customers, partners, and development teams
Research customer issues and product initiatives
Provide input for service documentation
Required Skills:
Bachelor's degree in Computer Science, Information Systems or related discipline
3+ years' experience inclusive of Software as a Service and/or DevOps engineering experience
Experience with AWS services like VPC, EC2, RDS, SES, ECS, Lambda, S3, ELB
Experience with technologies such as REST, Angular, Messaging, Databases, etc.
Strong troubleshooting skills and issue isolation skills
Possess excellent communication skills (written and verbal English)
Must be able to work as an individual contributor within a team
Ability to think outside the box
Experience in configuring infrastructure
Knowledge of CI / CD
Desirable skills:
Programming skills in scripting languages (e.g., python, bash)
Knowledge of Linux administration
Knowledge of testing tools/frameworks: TestNG, Selenium, etc
Knowledge of Identity and Security
- Seeking an individual with around 5+ years of experience.
- Must-have skills: Jenkins, Groovy, Ansible, Shell Scripting, Python, Linux Admin
- Terraform and deep AWS knowledge to automate and provision EC2, EBS, and SQL Server; cost optimization; CI/CD pipelines using Jenkins. Serverless automation is a plus.
- Excellent writing and communication skills in English. Enjoy writing crisp and understandable documentation
- Comfortable programming in one or more scripting languages
- Enjoys tinkering with tooling and researching easier ways to handle systems. Strong awareness of build vs. buy trade-offs.

Platform Services Engineer
DevSecOps Engineer
- Strong Systems Experience- Linux, networking, cloud, APIs
- Scripting language Programming - Shell, Python
- Strong Debugging Capability
- AWS Platform - IAM, Network, EC2, Lambda, S3, CloudWatch
- Knowledge on Terraform, Packer, Ansible, Jenkins
- Observability - Prometheus, InfluxDB, Dynatrace, Grafana, Splunk
- DevSecOps-CI/CD - Jenkins
- Microservices
- Security & Access Management
- Container Orchestration a plus - Kubernetes, Docker etc.
- Big Data Platforms knowledge a plus - EMR, Databricks, Cloudera
- Collaborate with Dev, QA and Data Science teams on environment maintenance, monitoring (ELK, Prometheus or equivalent), deployments and diagnostics
- Administer a hybrid datacenter, including AWS and EC2 cloud assets
- Administer, automate and troubleshoot container based solutions deployed on AWS ECS
- Be able to troubleshoot problems and provide feedback to engineering on issues
- Automate deployment (Ansible, Python), build (Git, Maven, Make, or equivalent) and integration (Jenkins, Nexus) processes
- Learn and administer technologies such as ELK, Hadoop etc.
- A self-starter with the enthusiasm to learn and pick up new technologies in a fast-paced environment.
Need to have
- Hands-on Experience in Cloud based DevOps
- Experience working in AWS (EC2, S3, CloudFront, ECR, ECS etc)
- Experience with any programming language.
- Experience using Ansible, Docker, Jenkins, Kubernetes
- Experience in Python.
- Should be very comfortable working in Linux/Unix environment.
- Exposure to Shell Scripting.
- Solid troubleshooting skills
This person MUST have:
- B.E Computer Science or equivalent
- 2+ Years of hands-on experience troubleshooting and setting up Linux environments, with the ability to write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ Years of hands-on experience setting up CICD from SCRATCH in Jenkins & Gitlab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Min 3 years of experience as SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2.
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We don't believe in locking people in with long notice periods. You will stay here because you love the company. We have only a 15-day notice period.
Karkinos Healthcare Pvt. Ltd.
The fundamental principle of Karkinos healthcare is democratization of cancer care in a participatory fashion with existing health providers, researchers and technologists. Our vision is to provide millions of cancer patients with affordable and effective treatments and have India become a leader in oncology research. Karkinos will be with the patient every step of the way, to advise them, connect them to the best specialists, and to coordinate their care.
Karkinos has an eclectic founding team with strong technology, healthcare and finance experience, and a panel of eminent clinical advisors in India and abroad.
Roles and Responsibilities:
- A critical role that involves setting up and owning the dev, staging, and production infrastructure for the platform, which uses microservices, data warehouses, and a data lake.
- Demonstrate technical leadership with incident handling and troubleshooting.
- Provide software delivery operations and application release management support, including scripting, automated build and deployment processing and process reengineering.
- Build automated deployments for consistent software releases with zero downtime
- Deploy new modules, upgrades and fixes to the production environment.
- Participate in the development of contingency plans including reliable backup and restore procedures.
- Participate in the development of the end to end CI / CD process and follow through with other team members to ensure high quality and predictable delivery
- Participate in development of CI / CD processes
- Work on implementing DevSecOps and GitOps practices
- Work with the Engineering team to integrate more complex testing into a containerized pipeline to ensure minimal regressions
- Build platform tools that the rest of the engineering teams can use.
Apply only if you have:
- 2+ years of software development/technical support experience.
- 1+ years of software development, operations experience deploying and maintaining multi-tiered infrastructure and applications at scale.
- 2+ years of experience in public cloud services: AWS (VPC, EC2, ECS, Lambda, Redshift, S3, API Gateway) or GCP (Kubernetes Engine, Cloud SQL, Cloud Storage, BigQuery, API Gateway, Container Registry) - preferably in GCP.
- Experience managing infra for distributed NoSQL systems (Kafka/MongoDB), containers, microservices, and deployment and service orchestration using Kubernetes.
- Experience with and a good understanding of Kubernetes, Service Mesh (Istio preferred), API Gateways, Network proxies, etc.
- Experience in setting up infra for central monitoring of infrastructure, ability to debug, trace
- Experience and deep understanding of Cloud Networking and Security
- Experience in Continuous Integration and Delivery (Jenkins / Maven / GitHub / GitLab).
- Strong scripting language knowledge, such as Python, Shell.
- Experience in Agile development methodologies and release management techniques.
- Excellent analytical and troubleshooting skills.
- Ability to continuously learn and make decisions with minimal supervision. You understand that making mistakes means that you are learning.
Interested Applicants can share their resume at sajal.somani[AT]karkinos[DOT]in with subject as "DevOps Engineer".
● Develop and deliver automation software required for building & improving the functionality, reliability, availability, and manageability of applications and cloud platforms
● Champion and drive the adoption of Infrastructure as Code (IaC) practices and mindset
● Design, architect, and build self-service, self-healing, synthetic monitoring and alerting platform and tools
● Automate the development and test automation processes through CI/CD pipeline (Git, Jenkins, SonarQube, Artifactory, Docker containers)
● Build container hosting-platform using Kubernetes
● Introduce new cloud technologies, tools & processes to keep innovating in the commerce area and drive greater business value.
Skills Required:
● Excellent written and verbal communication skills and a good listener.
● Proficiency in deploying and maintaining Cloud based infrastructure services (AWS, GCP, Azure – good hands-on experience in at least one of them)
● Well versed with service-oriented architecture, cloud-based web services architecture, design patterns and frameworks.
● Good knowledge of cloud-related services like compute, storage, network, messaging (e.g., SNS, SQS) and automation (e.g., CFT/Terraform).
● Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
● Experience in systems management/automation tools (Puppet/Chef/Ansible, Terraform)
● Strong Linux System Admin Experience with excellent troubleshooting and problem solving skills
● Hands-on experience with languages (Bash/Python/Core Java/Scala)
● Experience with CI/CD pipeline (Jenkins, Git, Maven etc)
● Experience integrating solutions in a multi-region environment
● Self-motivated; able to learn quickly and deliver results with minimal supervision
● Experience with Agile/Scrum/DevOps software development methodologies.
Nice to Have:
● Experience in setting up the Elasticsearch, Logstash, Kibana (ELK) stack.
● Having worked with large scale data.
● Experience with Monitoring tools such as Splunk, Nagios, Grafana, DataDog etc.
● Previous experience working with distributed architectures like Hadoop, MapReduce, etc.

