About Hive
Hive is the leading provider of cloud-based AI solutions for content understanding,
trusted by the world’s largest, fastest growing, and most innovative organizations. The
company empowers developers with a portfolio of best-in-class, pre-trained AI models, serving billions of customer API requests every month. Hive also offers turnkey software applications powered by proprietary AI models and datasets, enabling breakthrough use cases across industries. Together, Hive’s solutions are transforming content moderation, brand protection, sponsorship measurement, context-based ad targeting, and more.
Hive has raised over $120M in capital from leading investors, including General Catalyst, 8VC, Glynn Capital, Bain & Company, Visa Ventures, and others. We have over 250 employees globally in our San Francisco, Seattle, and Delhi offices. Please reach out if you are interested in joining the future of AI!
About Role
Our unique machine learning needs led us to open our own data centers, with an
emphasis on distributed high performance computing integrating GPUs. Even with these data centers, we maintain a hybrid infrastructure with public clouds when the right fit. As we continue to commercialize our machine learning models, we also need to grow our DevOps and Site Reliability team to maintain the reliability of our enterprise SaaS offering for our customers. Our ideal candidate is someone who is
able to thrive in an unstructured environment and takes automation seriously. You believe there is no task that can’t be automated and no server scale too large. You take pride in optimizing performance at scale in every part of the stack and never manually performing the same task twice.
Responsibilities
● Create tools and processes for deploying and managing hardware for Private Cloud Infrastructure.
● Improve workflows of developer, data, and machine learning teams
● Manage integration and deployment tooling
● Create and maintain monitoring and alerting tools and dashboards for various services, and audit infrastructure
● Manage a diverse array of technology platforms, following best practices and
procedures
● Participate in on-call rotation and root cause analysis
Requirements
● Minimum 5 - 10 years of previous experience working directly with Software
Engineering teams as a developer, DevOps Engineer, or Site Reliability
Engineer.
● Experience with infrastructure as a service, distributed systems, and software design at a high-level.
● Comfortable working on Linux infrastructures (Debian) via the CLIAble to learn quickly in a fast-paced environment.
● Able to debug, optimize, and automate routine tasks
● Able to multitask, prioritize, and manage time efficiently independently
● Can communicate effectively across teams and management levels
● Degree in computer science, or similar, is an added plus!
Technology Stack
● Operating Systems - Linux/Debian Family/Ubuntu
● Configuration Management - Chef
● Containerization - Docker
● Container Orchestrators - Mesosphere/Kubernetes
● Scripting Languages - Python/Ruby/Node/Bash
● CI/CD Tools - Jenkins
● Network hardware - Arista/Cisco/Fortinet
● Hardware - HP/SuperMicro
● Storage - Ceph, S3
● Database - Scylla, Postgres, Pivotal GreenPlum
● Message Brokers: RabbitMQ
● Logging/Search - ELK Stack
● AWS: VPC/EC2/IAM/S3
● Networking: TCP / IP, ICMP, SSH, DNS, HTTP, SSL / TLS, Storage systems,
RAID, distributed file systems, NFS / iSCSI / CIFS
Who we are
We are a group of ambitious individuals who are passionate about creating a revolutionary AI company. At Hive, you will have a steep learning curve and an opportunity to contribute to one of the fastest growing AI start-ups in San Francisco. The work you do here will have a noticeable and direct impact on the
development of the company.
Thank you for your interest in Hive and we hope to meet you soon

Similar jobs
Job Title : DevOps Engineer
Experience : 3+ Years
Location : Mumbai
Employment Type : Full-time
Job Overview :
We’re looking for an experienced DevOps Engineer to design, build, and manage Kubernetes-based deployments for a microservices data discovery platform.
The ideal candidate has strong hands-on expertise with Helm, Docker, CI/CD pipelines, and cloud networking — and can handle complex deployments across on-prem, cloud, and air-gapped environments.
Mandatory Skills :
✅ Helm, Kubernetes, Docker
✅ Jenkins, ArgoCD, GitOps
✅ Cloud Networking (VPCs, bare metal vs. VMs)
✅ Storage (MinIO, Ceph, NFS, S3/EBS)
✅ Air-gapped & multi-tenant deployments
Key Responsibilities :
- Build and customize Helm charts from scratch.
- Implement CI/CD pipelines using Jenkins, ArgoCD, GitOps.
- Manage containerized deployments across on-prem/cloud setups.
- Work on air-gapped and restricted environments.
- Optimize for scalability, monitoring, and security (Prometheus, Grafana, RBAC, HPA).
Roles and Responsibilities:
- AWS Cloud Management: Design, deploy, and manage AWS cloud infrastructure. Optimize and maintain cloud resources for performance and cost efficiency. Monitor and ensure the security of cloud-based systems.
- Automated Provisioning: Develop and implement automated provisioning processes for infrastructure deployment. Utilize tools like Terraform and Packer to automate and streamline the provisioning of resources.
- Infrastructure as Code (IaC): Champion the use of Infrastructure as Code principles. Collaborate with development and operations teams to define and maintain IaC scripts for infrastructure deployment and configuration.
- Collaboration and Communication: Work closely with cross-functional teams to understand project requirements and provide DevOps expertise. Communicate effectively with team members and stakeholders regarding infrastructure changes, updates, and improvements.
- Continuous Integration/Continuous Deployment (CI/CD): Implement and maintain CI/CD pipelines to automate software delivery processes. Ensure reliable and efficient deployment of applications through the development lifecycle.
- Performance Monitoring and Optimization: Implement monitoring solutions to track system performance, troubleshoot issues, and optimize resource utilization. Proactively identify opportunities for system and process improvements.
Mandatory Skills:
- Proven experience as a DevOps Engineer or similar role, with a focus on AWS.
- Strong proficiency in automated provisioning and cloud management.
- Experience with Infrastructure as Code tools, particularly Terraform and Packer.
- Solid understanding of CI/CD pipelines and version control systems.
- Strong scripting skills (e.g., Python, Bash) for automation tasks.
- Excellent problem-solving and troubleshooting skills.
- Good interpersonal and communication skills for effective collaboration.
Secondary Skills:
- AWS certifications (e.g., AWS Certified DevOps Engineer, AWS Certified Solutions Architect).
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Knowledge of microservices architecture and serverless computing.
- Familiarity with monitoring and logging tools (e.g., CloudWatch, ELK stack).
● Building and managing multiple application environments on AWS using automation tools like Terraform or
Cloudformation etc.
● Deploy applications with zero downtime via automation with configuration management tools such as Ansible.
● Setting up Infrastructure monitoring tools such as Prometheus, Grafana
● Setting up centralised logging using tools such as ELK.
● Containerisation of applications/microservices.
● Ensure application availability to 99.9% with highly available infrastructure.
● Monitoring performance of applications and databases.
● Ensuring that systems are safe and secure against cyber security threats.
● Working with software developers to ensure that release cycle and deployment processes are followed.
● Evaluating existing applications and platforms, give recommendations for enhancing performance via gap analysis,
identifying the most practical alternative solutions and assisting with modifications.
Skills -
● Strong knowledge of AWS Managed Services such as EC2, RDS, ECS, ECR, S3, Cloudfront, SES, Redshift, Elastic Cache,
AMQP etc.
● Experience in handling production workloads.
● Experience with Nginx web server.
● Experience with NoSql and Sql Databases such as MongoDB, Postgresql etc.
● Experience with Containerisation of applications/micro services using Docker.
● Understanding of system administration in Linux environments.
● Strong Knowledge of Infrastructure as a Code such as Terraform, Cloudformation etc.
● Strong knowledge of configuration management tools such as Ansible, Chef etc.
● Familiarity with tools such as GitLab, Jenkins, Vercel, JIRA etc.
● Proficiency in scripting languages including Bash, Python etc.
● Full understanding of software development lifecycle best practices and agile methodology
● Strong communication and documentation skills.
● An ability to drive to goals and milestones while valuing and maintaining a strong attention to detail
● Excellent judgment, analytical thinking, and problem-solving skills
● Self-motivated individual that possesses excellent time management and organizational skills
As DevOps Engineer, you are responsible to setup and maintain GIT repository, DevOps tools like Jenkins, UCD, Docker, Kubernetes, Jfrog Artifactory, Cloud monitoring tools, Cloud security.
- Setup, configure, and maintain GIT repos, Jenkins, UCD, etc. for multi hosting cloud environments.
- Architect and maintain the server infrastructure in AWS. Build highly resilient infrastructure following industry best practices.
- Working on Docker images and maintaining Kubernetes clusters.
- Develop and maintain the automation scripts using Ansible or other available tools.
- Maintain and monitor cloud Kubernetes Clusters and patching when necessary.
- Working on Cloud security tools to keep applications secured.
- Participate in software development lifecycle, specifically infra design, execution, and debugging required to achieve successful implementation of integrated solutions within the portfolio.
- Required Technical and Professional Expertise.
- Minimum 4-6 years of experience in IT industry.
- Expertise in implementing and managing Devops CI/CD pipeline.
- Experience in DevOps automation tools. And Very well versed with DevOps Frameworks, Agile.
- Working knowledge of scripting using shell, Python, Terraform, Ansible or puppet or chef.
- Experience and good understanding in any of Cloud like AWS, Azure, Google cloud.
- Knowledge of Docker and Kubernetes is required.
- Proficient in troubleshooting skills with proven abilities in resolving complex technical issues.
- Experience with working with ticketing tools.
- Middleware technologies knowledge or database knowledge is desirable.
- Experience and well versed with Jira tool is a plus.
We look forward to connecting with you. As you may take time to review this opportunity, we will wait for a reasonable time of around 3-5 days before we screen the collected applications and start lining up job discussions with the hiring manager. However, we assure you that we will attempt to maintain a reasonable time window for successfully closing this requirement. The candidates will be kept informed and updated on the feedback and application status.
Job Dsecription:
○ Develop best practices for team and also responsible for the architecture
○ solutions and documentation operations in order to meet the engineering departments quality and standards
○ Participate in production outage and handle complex issues and works towards Resolution
○ Develop custom tools and integration with existing tools to increase engineering Productivity
Required Experience and Expertise
○ Having a good knowledge of Terraform + someone who has worked on large TF code bases.
○ Deep understanding of Terraform with best practices & writing TF modules.
○ Hands-on experience of GCP and AWS and knowledge on AWS Services like VPC and VPC related services like (route tables, vpc endpoints, privatelinks) EKS, S3, IAM. Cost aware mindset towards Cloud services.
○ Deep understanding of Kernel, Networking and OS fundamentals
NOTICE PERIOD - Max - 30 days
BENEFITS: -
Competitive salary and stock options -
Premium medical benefits
LOCATION: - Remote (India)
EDUCATION AND EXPERIENCE -
Degree in computer science, software engineering, or a related field -
At least 3-5 years of professional work experience as a
DevOps / test automation / deployment engineer -
Experience in agile software development methodologies
JOB RESPONSIBILITIES: -
Design and development of scalable software test framework to automate test procedures - Perform health checks of existing sites periodically (manually and also developing a automation pipeline) - Manage per site software releases -
Generate release notes, user guides, technical documentation -
Perform upgrades/ downgrades (patch management) -
Generate and Manage configurations -
Suggest process improvements by interfacing with product owner -
File process/installation/deployment related issues & tracking them -
Verify customer-reported issues and translate them into technical tasks for development
REQUIREMENTS:
Following are must to have requirements, -
Automation frameworks like Selenium etc. -
Scripting languages like Perl, Python etc. -
Linux systems -
Version control system like git -
Issue tracking system like Jira -
Networking protocols -
Cloud infrastructure -
Ability to work individually and as part of a team with a sense of urgency -
Excellent communication skills in written and verbal English -
Great attention to detail
PREFERRED:
Following are good to have requirements, -
Knowledge in any of the programming languages like C, C++ etc.
Experience managing cloud-based (e.g., AWS, Google Cloud, etc.) and in-house server infrastructure -
Familiarity with machine learning / artificial intelligence infrastructure -
Experience in data visualization and statistics -
Basic knowledge of hardware infrastructure containing routers, switches and others -
Familiarity with web & data securityDesign and development of scalable software test framework to automate test
We are looking for a full-time remote DevOps Engineer who has worked with CI/CD automation, big data pipelines and Cloud Infrastructure, to solve complex technical challenges at scale that will reshape the healthcare industry for generations. You will get the opportunity to be involved in the latest tech in big data engineering, novel machine learning pipelines and highly scalable backend development. The successful candidates will be working in a team of highly skilled and experienced developers, data scientists and CTO.
Job Requirements
- Experience deploying, automating, maintaining, and improving complex services and pipelines • Strong understanding of DevOps tools/process/methodologies
- Experience with AWS Cloud Formation and AWS CLI is essential
- The ability to work to project deadlines efficiently and with minimum guidance
- A positive attitude and enjoys working within a global distributed team
Skills
- Highly proficient working with CI/CD and automating infrastructure provisioning
- Deep understanding of AWS Cloud platform and hands on experience setting up and maintaining with large scale implementations
- Experience with JavaScript/TypeScript, Node, Python and Bash/Shell Scripting
- Hands on experience with Docker and container orchestration
- Experience setting up and maintaining big data pipelines, Serverless stacks and containers infrastructure
- An interest in healthcare and medical sectors
- Technical degree with 4 plus years’ infrastructure and automation experience
• Bachelor or Master Degree in Computer Science, Software Engineering from a reputed
University.
• 5 - 8 Years of experience in building scalable, secure and compliant systems.
• More than 2 years of experience in working with GCP deployment for millions of daily visitors
• 5+ years hosting experience in a large heavy-traffic environment
• 5+ years production application support experience in a high uptime environment
• Software development and monitoring knowledge with Automated builds
• Technology:
o Cloud: AWS or Google Cloud
o Source Control: Gitlab or Bitbucket or Github
o Container Concepts: Docker, Microservices
o Continuous Integration: Jenkins, Bamboos
o Infrastructure Automation: Puppet, Chef or Ansible
o Deployment Automation: Jenkins, VSTS or Octopus Deploy
o Orchestration: Kubernets, Mesos, Swarm
o Automation: Node JS or Python
o Linux environment network administration, DNS, firewall and security management
• Ability to be adapt to the startup culture, handle multiple competing priorities, meet
deadlines and troubleshoot problems.








