
Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
- Improvement of monitoring systems
- Creation and support of development automation processes (CI / CD)
- Participation in infrastructure development
- Detection of architectural problems and proposal of solutions
- Creation of tasks to improve system scalability, performance and monitoring
- Analysis of product requirements from a DevOps perspective
- Management of a DevOps team and oversight of task delivery
- Incident analysis and fixing
Technology stack
Linux, Bash, Salt/Ansible, LXC, libvirt, IPsec, VXLAN, Open vSwitch, OpenVPN, OSPF, BIRD, Cisco NX-OS, Multicast, PIM, LVM, software RAID, LUKS, PostgreSQL, nginx, haproxy, Prometheus, Grafana, Zabbix, GitLab, Capistrano
Skills and Experience
- Understanding of distributed systems principles
- Understanding of the principles for building a resilient network infrastructure
- Experience with Ubuntu Linux administration (other Debian-based distributions are a plus)
- Strong knowledge of Bash
- Experience working with LXC containers
- Understanding of and experience with the infrastructure-as-code approach
- Experience developing idempotent Ansible roles
- Experience with relational databases (PostgreSQL), ability to write simple SQL queries
- Experience with git
- Experience with monitoring and metrics collection systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
Preferred experience
- Experience working with high-load, zero-downtime environments
- Experience coding in Python
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge of and experience working with Cisco network equipment
- Experience working with Cisco NX-OS
- Knowledge of the principles of multicast protocols (IGMP, PIM)
- Experience configuring multicast on Cisco equipment
- Experience of working with Solarflare Onload
- Experience administering Atlassian products

Infrastructure as Code (IaC):
- Design, implement, and maintain infrastructure as code using tools like Terraform, CloudFormation, or similar.
- Automate infrastructure provisioning and configuration across multiple environments (development, staging, production).
CI/CD Pipelines:
- Design, build, and maintain robust CI/CD pipelines using tools like Jenkins, GitLab CI/CD, CircleCI, or GitHub Actions.
- Implement automated testing, build, and deployment processes.
- Optimize pipelines for speed, reliability, and security.
Cloud Infrastructure:
- Manage and optimize cloud infrastructure on platforms like AWS, Azure, or GCP.
- Monitor and troubleshoot cloud infrastructure performance and availability.
- Implement security best practices for cloud environments.
- Implement cost optimization strategies for cloud resources.
DESIRED SKILLS AND EXPERIENCE
Strong analytical and problem-solving skills
Ability to work independently, learn quickly and be proactive
3-5 years overall and at least 1-2 years of hands-on experience in designing and managing DevOps Cloud infrastructure
Experience must include a combination of:
o Experience working with configuration management tools – Ansible, Chef, Puppet, SaltStack (expertise in at least one tool is a must)
o Ability to write and maintain code in at least one scripting language (Python preferred)
o Practical knowledge of shell scripting
o Cloud knowledge – AWS, VMware vSphere
o Good understanding and familiarity with Linux
o Networking knowledge – Firewalls, VPNs, Load Balancers
o Web/Application servers, Nginx, JVM environments
o Virtualization and containers - Xen, KVM, Qemu, Docker, Kubernetes, etc.
o Familiarity with logging systems - Logstash, Elasticsearch, Kibana
o Git, Jenkins, Jira


We are now seeking a talented and motivated individual to contribute to our product in the Cloud data
protection space. Ability to clearly comprehend customer needs in a cloud environment, excellent
troubleshooting skills, and the ability to focus on problem resolution until completion are a requirement.
Responsibilities Include:
Review proposed feature requirements
Create test plan and test cases
Analyze performance, diagnose, and troubleshoot issues
Enter and track defects
Interact with customers, partners, and development teams
Research customer issues and product initiatives
Provide input for service documentation
Required Skills:
Bachelor's degree in Computer Science, Information Systems or related discipline
3+ years' experience inclusive of Software as a Service and/or DevOps engineering experience
Experience with AWS services like VPC, EC2, RDS, SES, ECS, Lambda, S3, ELB
Experience with technologies such as REST, Angular, Messaging, Databases, etc.
Strong troubleshooting skills and issue isolation skills
Possess excellent communication skills (written and verbal English)
Must be able to work as an individual contributor within a team
Ability to think outside the box
Experience in configuring infrastructure
Knowledge of CI / CD
Desirable skills:
Programming skills in scripting languages (e.g., Python, Bash)
Knowledge of Linux administration
Knowledge of testing tools/frameworks: TestNG, Selenium, etc.
Knowledge of Identity and Security
About Hive
Hive is the leading provider of cloud-based AI solutions for content understanding, trusted by the world’s largest, fastest growing, and most innovative organizations. The company empowers developers with a portfolio of best-in-class, pre-trained AI models, serving billions of customer API requests every month. Hive also offers turnkey software applications powered by proprietary AI models and datasets, enabling breakthrough use cases across industries. Together, Hive’s solutions are transforming content moderation, brand protection, sponsorship measurement, context-based ad targeting, and more.
Hive has raised over $120M in capital from leading investors, including General Catalyst, 8VC, Glynn Capital, Bain & Company, Visa Ventures, and others. We have over 250 employees globally in our San Francisco, Seattle, and Delhi offices. Please reach out if you are interested in joining the future of AI!
About Role
Our unique machine learning needs led us to open our own data centers, with an emphasis on distributed high performance computing integrating GPUs. Even with these data centers, we maintain a hybrid infrastructure, using public clouds when they are the right fit. As we continue to commercialize our machine learning models, we also need to grow our DevOps and Site Reliability team to maintain the reliability of our enterprise SaaS offering for our customers. Our ideal candidate is someone who is able to thrive in an unstructured environment and takes automation seriously. You believe there is no task that can’t be automated and no server scale too large. You take pride in optimizing performance at scale in every part of the stack and never manually performing the same task twice.
Responsibilities
● Create tools and processes for deploying and managing hardware for Private Cloud Infrastructure.
● Improve workflows of developer, data, and machine learning teams
● Manage integration and deployment tooling
● Create and maintain monitoring and alerting tools and dashboards for various services, and audit infrastructure
● Manage a diverse array of technology platforms, following best practices and procedures
● Participate in on-call rotation and root cause analysis
Requirements
● Minimum 5-10 years of previous experience working directly with Software Engineering teams as a developer, DevOps Engineer, or Site Reliability Engineer.
● Experience with infrastructure as a service, distributed systems, and software design at a high-level.
● Comfortable working on Linux infrastructures (Debian) via the CLI
● Able to learn quickly in a fast-paced environment.
● Able to debug, optimize, and automate routine tasks
● Able to multitask, prioritize, and manage time efficiently independently
● Can communicate effectively across teams and management levels
● Degree in computer science, or similar, is an added plus!
Technology Stack
● Operating Systems - Linux/Debian Family/Ubuntu
● Configuration Management - Chef
● Containerization - Docker
● Container Orchestrators - Mesosphere/Kubernetes
● Scripting Languages - Python/Ruby/Node/Bash
● CI/CD Tools - Jenkins
● Network hardware - Arista/Cisco/Fortinet
● Hardware - HP/SuperMicro
● Storage - Ceph, S3
● Database - Scylla, Postgres, Pivotal GreenPlum
● Message Brokers: RabbitMQ
● Logging/Search - ELK Stack
● AWS: VPC/EC2/IAM/S3
● Networking: TCP/IP, ICMP, SSH, DNS, HTTP, SSL/TLS, storage systems, RAID, distributed file systems, NFS/iSCSI/CIFS
Who we are
We are a group of ambitious individuals who are passionate about creating a revolutionary AI company. At Hive, you will have a steep learning curve and an opportunity to contribute to one of the fastest growing AI start-ups in San Francisco. The work you do here will have a noticeable and direct impact on the development of the company.
Thank you for your interest in Hive, and we hope to meet you soon.
Experience: 8-10 years
Notice Period: max 15 days
Must-haves*
1. Knowledge of database and NoSQL database hosting fundamentals (RDS multi-AZ, DynamoDB, MongoDB, and such)
2. Knowledge of different storage platforms on AWS (EBS, EFS, FSx) - mounting persistent volumes with Docker Containers
3. In-depth knowledge of security principles on AWS (WAF, DDoS, Security Groups, NACLs, IAM groups, and SSO)
4. Knowledge of CI/CD platforms (Jenkins, GitHub Actions, etc.) is required, including migration of AWS CodePipeline pipelines to GitHub Actions
5. Knowledge of a wide variety of AWS services (SNS, SES, SQS, Athena, Kinesis, S3, ECS, EKS, etc.) is required
6. Knowledge of an Infrastructure as Code tool is required. We use CloudFormation (Terraform is a plus); ideally, we would like to migrate from CloudFormation to Terraform.
7. Setting up CloudWatch alarms and SMS/email/Slack alerts.
8. Some knowledge of configuring a monitoring tool such as Prometheus, Dynatrace, etc. (We currently use Datadog and CloudWatch)
9. Experience with any CDN provider configurations (Cloudflare, Fastly, or CloudFront)
10. Experience scripting in either Python or Go.
11. Experience with Git branching strategies
12. Container hosting knowledge on both Windows and Linux
The list below is *Nice to Have*
1. Integration experience with Code Quality tools (SonarQube, NetSparker, etc) with CI/CD
2. Kubernetes
3. CDNs other than CloudFront (Cloudflare, Fastly, etc.)
4. Collaboration with multiple teams
5. GitOps
- Develop and Deploy Software:
- Architect and create an effective build and release process using industry best practices and tools
- Create and manage build scripts to deploy software in a multi-cloud environment
- Look for opportunities to automate as much of the deployment process as possible to provide repeatability, auditability, scalability and built-in process enforcement
- Manage Release Schedule:
- Act as a “gatekeeper” for all releases into production
- Work closely with business stakeholders, development managers and developers to prepare a release schedule
- Help prioritize deployment requests for version upgrades, patches and hot-fixes
- Continuous Delivery of Software:
- Implement Continuous Integration (CI) practices to drive development teams to implement smaller changes and commit code to the version control repo frequently
- Implement Continuous Deployment (CD) practices that automate deployment of the application to several environments – Dev, Test and Production
- Implement Continuous Testing (functional and non-functional) to execute tests in the CI/CD pipeline
- Manage Version Control:
- Define and implement branching policies to efficiently manage source-code
- Implement business rules as a part of source control standards
- Resolve Software Issues:
- Assist technical support and development teams to troubleshoot issues and identify areas that need improvement
- Address deployment related issues
- Maintain Release Documentation:
- Maintain release notes (features available in stable versions and known issues) and other documents for both internal and external end users
Summary
We are building the fastest, most reliable & intelligent trading platform. That requires highly available, scalable & performant systems. And you will be playing one of the most crucial roles in making this happen.
You will be leading our efforts in designing, automating, deploying, scaling and monitoring all our core products.
Tech Facts so Far
1. 8+ services deployed on 50+ servers
2. 35K+ concurrent users on average
3. 1M+ algorithms run every min
4. 100M+ messages/min
We are a 4-member backend team with 1 DevOps Engineer. Yes! This is all done by this incredibly lean team.
Big Challenges for You
1. Manage 25+ services on 200+ servers
2. Achieve 99.999% (5 Nines) availability
3. Make 1-minute automated deployments possible
If you like to work on extreme scale, complexity & availability, then you will love it here.
Who are we
We are on a mission to help retail traders prosper in the stock market. In just 3 years, we have built the 3rd most popular app for the stock markets in India. And we are aiming to be the de facto trading app in the next 2 years.
We are a young, lean team of ordinary people building exceptional products that solve real problems. We love to innovate, thrill customers and work with brilliant & humble humans.
Key Objectives for You
• Spearhead system & network architecture
• CI, CD & Automated Deployments
• Achieve 99.999% availability
• Ensure in-depth & real-time monitoring, alerting & analytics
• Enable faster root cause analysis with improved visibility
• Ensure a high level of security
Possible Growth Paths for You
• Be our Lead DevOps Engineer
• Be a Performance & Security Expert
Perks
• Challenges that will push you beyond your limits
• A democratic place where everyone is heard & aware
Responsibilities
- Designing and building infrastructure to support AWS, Azure, and GCP-based Cloud services and infrastructure.
- Creating and utilizing tools to monitor our applications and services in the cloud including system health indicators, trend identification, and anomaly detection.
- Working with development teams to help engineer scalable, reliable, and resilient software running in the cloud.
- Participating in on-call escalation to troubleshoot customer-facing issues
- Analyzing and monitoring performance bottlenecks and key metrics to optimize software and system performance.
- Providing analytics and forecasts for cloud capacity, troubleshooting analysis, and uptime.
Skills
- Should have a couple of years of strong experience leading a DevOps team, including planning and defining the DevOps roadmap and executing it with the team
- Familiarity with AWS cloud and JSON templates, Python, AWS CloudFormation templates
- Designing solutions using one or more AWS features, tools, and technologies such as EC2, EBS, Glacier, S3, ELB, CloudFormation, Lambda, CloudWatch, VPC, RDS, Direct Connect, AWS CLI, REST API
- Design and implement system architecture with AWS cloud
- Develop automation scripts, ARM templates, Ansible, Chef, Python, PowerShell
- Knowledge of AWS services and cloud design patterns
- Knowledge of cloud fundamentals like autoscaling, serverless
- Have experience with DevOps and Infrastructure as Code: AWS environment and application automation utilizing CloudFormation and third-party tools. CI/CD pipeline setup utilizing
- CI experience with the following is a must: Jenkins, Bitbucket/Git, Nexus or Artifactory, SonarQube, WireMock or other mocking solution
- Expert knowledge of Windows/Linux/Mac operating systems with at least 5-6 years of system administration experience
- Should have strong skills in using Jira
- Should have knowledge in managing the CI/CD pipeline on public cloud deployments using AWS
- Should have strong skills in using tools like Jenkins, Docker, Kubernetes (AWS EKS, Azure AKS), and CloudFormation.
- Experience in monitoring tools like Pingdom, Nagios, etc.
- Experience in reverse proxy services like Nginx and Apache
- Experience with Bitbucket and version control tools like Git/SVN is desirable
- Experience with manual/automated testing of application deployments is desired
- Experience in database technologies such as PostgreSQL, MySQL
- Knowledge of Helm and Terraform
At Karza Technologies, we take pride in building one of the most comprehensive digital onboarding & due-diligence platforms by profiling millions of entities and trillions of associations amongst them using data collated from more than 700 publicly available government sources. Primarily in the B2B Fintech Enterprise space, we are headquartered in Mumbai in Lower Parel with a 100+ strong workforce. We are truly furthering the cause of Digital India by providing the entire BFSI ecosystem with tech products and services that aid onboarding customers, automating processes and mitigating risks seamlessly, in real-time and at a fraction of the current cost.
A few recognitions:
- Recognized by LinkedIn as one of the Top 25 startups in India to work with in 2019
- Winner of HDFC Bank's Digital Innovation Summit 2020
- Super Winners (Won every category) at Tecnoviti 2020 by Banking Frontiers
- Winner of Amazon AI Award 2019 for Fintech
- Winner of FinTech Spot Pitches at Fintegrate Zone 2018 held at BSE
- Winner of FinShare 2018 challenge held by ShareKhan
- Only startup in Yes Bank Global Fintech Accelerator to win the account during the Cohort
- 2nd place Citi India FinTech Challenge 2018 by Citibank
- Top 3 in Viacom18's Startup Engagement Programme VStEP
What your average day would look like:
- Deploy and maintain mission-critical information extraction, analysis, and management systems
- Manage low cost, scalable streaming data pipelines
- Provide direct and responsive support for urgent production issues
- Contribute ideas towards secure and reliable Cloud architecture
- Use open source technologies and tools to accomplish specific use cases encountered within the project
- Use coding languages or scripting methodologies to solve automation problems
- Collaborate with others on the project to brainstorm about the best way to tackle a complex infrastructure, security, or deployment problem
- Identify processes and practices to streamline development & deployment to minimize downtime and turnaround time
What you need to work with us:
- Proficiency in at least one of the general-purpose programming languages like Python, Java, etc.
- Experience in managing the IaaS and PaaS components on popular public Cloud Service Providers like AWS, Azure, GCP etc.
- Proficiency in Unix Operating systems and comfortable with Networking concepts
- Experience with developing/deploying a scalable system
- Experience with distributed databases and message queues (like Cassandra, Elasticsearch, MongoDB, Kafka, etc.)
- Experience in managing Hadoop clusters
- Understanding of containers and experience managing them in production using container orchestration services.
- Solid understanding of data structures and algorithms.
- Applied exposure to continuous delivery pipelines (CI/CD).
- Keen interest and proven track record in automation and cost optimization.
Experience:
- 1-4 years of relevant experience
- BE in Computer Science / Information Technology
Goodera is looking for an experienced and motivated DevOps professional to be an integral part of its core infrastructure team. As a DevOps Engineer, you must be able to troubleshoot production issues; design, implement, and deploy monitoring tools; collaborate with team members to improve existing engineering tools and develop new ones; optimize the company's computing architecture; and design and conduct security, performance, and availability tests.
Responsibilities:
This is a highly accountable role and the candidate must meet the following professional expectations:
• Owning and improving the scalability and reliability of our products.
• Working directly with product engineering and infrastructure teams.
• Designing and developing various monitoring system tools.
• Accountable for developing deployment strategies and build configuration management.
• Deploying and updating system and application software.
• Ensure regular, effective communication with team members and cross-functional resources.
• Maintaining a positive and supportive work culture.
• First point of contact for handling issues from customers (who may be internal stakeholders), providing guidance and recommendations to increase efficiency and reduce customer incidents.
• Develop tooling and processes to drive and improve customer experience, create playbooks.
• Eliminate manual tasks via configuration management.
• Intelligently migrate services from one AWS region to other AWS regions.
• Create, implement and maintain security policies to ensure ISO / GDPR / SOC / PCI compliance.
• Verify infrastructure automation meets compliance goals and is current with the disaster recovery plan.
• Evangelize configuration management and automation to other product developers.
• Stay up to date with emerging technologies to maintain state-of-the-art infrastructure.
Required Candidate Profile:
• 3+ years of proven experience working in a DevOps environment.
• 3+ years of proven experience working in AWS Cloud environments.
• Solid understanding of networking and security best practices.
• Experience with infrastructure-as-code frameworks such as Ansible, Terraform, Chef, Puppet, CFEngine, etc.
• Experience in scripting or programming languages (Bash, Python, PHP, Node.js, Perl, etc.)
• Experience designing and building web application environments on AWS, including services such as ECS, ECR, Fargate, Lambda, SNS/SQS, CloudFront, CodeBuild, CodePipeline, CloudWatch, WAF, Active Directory, Kubernetes (EKS), EC2, S3, ELB, RDS, Redshift, etc.
• Hands-on experience with Docker is a big plus.
• Experience working in an Agile, fast paced, DevOps environment.
• Strong knowledge of databases such as MongoDB / MySQL / DynamoDB / Redis / Cassandra.
• Experience with open source tools such as HAProxy, Apache, Nginx, Nagios, etc.
• Fluency with version control systems, with a preference for Git
• Strong experience with Linux-based infrastructure and Linux administration
• Experience with installing and configuring application servers such as WebLogic, JBoss and Tomcat.
• Hands-on experience with logging, monitoring and alerting tools like ELK, Grafana, Metabase, Monit, Zabbix, etc.
• A team player capable of high performance, flexibility in a dynamic working environment and the ability to lead.
• Ability to train others on technical and procedural topics.

