• Expertise in any one hyperscaler (AWS/Azure/GCP), including core services for networking, data, and workload management.
o AWS
Networking: VPC, VPC Peering, Transit Gateway, Route Tables, Security Groups, etc.
Data: RDS, DynamoDB, Elasticsearch
Workload: EC2, EKS, Lambda, etc.
o Azure
Networking: VNet, VNet Peering
Data: Azure MySQL, Azure MSSQL, etc.
Workload: AKS, Virtual Machines, Azure Functions
o GCP
Networking: VPC, VPC Peering, Firewall, Flow Logs, Routes, Static and External IP Addresses
Data: Cloud Storage, Dataflow, Cloud SQL, Firestore, Bigtable, BigQuery
Workload: GKE, Instances, App Engine, Batch, etc.
• Experience with any one of the CI/CD tools (GitLab/GitHub/Jenkins), including runner setup, templating, and configuration.
• Kubernetes (EKS/AKS/GKE) or Ansible experience, covering basics such as pods, deployments, networking, and service mesh, plus use of a package manager such as Helm.
• Scripting experience (Bash/Python) for pipeline automation and system services where required.
• Infrastructure automation (Terraform/Pulumi/CloudFormation): writing modules, setting up pipelines, and versioning the code.
Optional
• Experience in any programming language is not required but is appreciated.
• Good experience in Git, SVN, or any other code management tool is required.
• DevSecOps tools such as Qualys, SonarQube, or Black Duck for security scanning of artifacts, infrastructure, and code.
• Observability tools (open source: Prometheus, Elasticsearch, OpenTelemetry; paid: Datadog, 24/7, etc.)
About Bootlabs Technologies Private Limited
Similar jobs
Staff DevOps Engineer with Azure
EGNYTE YOUR CAREER. SPARK YOUR PASSION.
Egnyte is a place where we spark opportunities for amazing people. We believe that every role has meaning, and every Egnyter should be respected. With 22,000+ customers worldwide and growing, you can make an impact by protecting their valuable data. When joining Egnyte, you’re not just landing a new career, you become part of a team of Egnyters who are doers, thinkers, and collaborators and who embrace and live by our values:
Invested Relationships
Fiscal Prudence
Candid Conversations
ABOUT EGNYTE
Egnyte is the secure multi-cloud platform for content security and governance that enables organizations to better protect and collaborate on their most valuable content. Established in 2008, Egnyte has democratized cloud content security for more than 22,000 organizations, helping customers improve data security, maintain compliance, prevent and detect ransomware threats, and boost employee productivity on any app, any cloud, anywhere. For more information, visit www.egnyte.com.
Our Production Engineering team enables Egnyte to provide customers access to their data 24/7 by providing best-in-class infrastructure.
ABOUT THE ROLE
We store billions of files and multiple petabytes of data. We observe more than 11K API requests per second on average. To make that possible and to provide the best possible experience, we rely on great engineers. For us, people who own their work, from start to finish, are integral. Our engineers are part of the process from design to code, to test, to deployment and back again for further iterations. You can, and will, touch every level of the infrastructure depending on the day and what project you are working on. The ideal candidate should be able to take a complex problem and execute end to end, and to mentor and set higher standards for the rest of the team and for new hires.
WHAT YOU’LL DO:
• Design, build and maintain self-hosted and cloud environments to serve our own applications and services.
• Collaborate with software developers to build stable, scalable and high-performance solutions.
• Take part in big projects like migrating solutions from self-hosted environments to the cloud, from virtual machines to Kubernetes, from monolith to microservices.
• Proactively make our organization and technology better!
• Advise others as to how DevOps can make a positive impact on their work.
• Share knowledge and mentor more junior team members while continuing to learn and gain new skills.
• Maintain consistently high standards of communication, productivity, and teamwork across all teams.
YOUR QUALIFICATIONS:
• 5+ years of proven experience as a DevOps Engineer, System Administrator or Developer, working on infrastructure or build processes.
• Expert knowledge of Microsoft Azure.
• Programming prowess (Python, Golang).
• Knowledge of and experience with deploying and maintaining Java and Python apps using application and web servers (Tomcat, Nginx, etc.).
• Ability to solve complex problems with simple, elegant and clean code.
• Practical knowledge of CI/CD solutions, GitLab CI or similar.
• Practical knowledge of Docker as a tool for testing and building an environment.
• Knowledge of Kubernetes and related technologies.
• Experience with metric-based monitoring solutions.
• Solid English skills to effectively communicate with other team members.
• Good understanding of the Linux Operating System on the administration level.
• Drive to grow as a DevOps Engineer (we value open-mindedness and a can-do attitude).
• Strong sense of ownership and ability to drive big projects.
BONUS SKILLS:
• Work experience as a Microsoft Azure architect.
• Experience in cloud migration projects.
• Leadership skills and experience.
COMMITMENT TO DIVERSITY, EQUITY, AND INCLUSION:
At Egnyte, we celebrate our differences and thrive on our diversity for our employees, our products, our customers, our investors, and our communities. Egnyters are encouraged to bring their whole selves to work and to appreciate the many differences that collectively make Egnyte a higher-performing company and a great place to be.
Role: DevOps engineer
We are looking for an experienced DevOps engineer who will work closely with our client’s team to establish a DevOps practice.
Your role will include establishing configuration management, automating the infrastructure, implementing continuous integration, and training the team in DevOps best practices to achieve a continuously deployable system.
You will be part of a continually growing team. You will have the chance to be creative and think of new ideas. You might also get the opportunity to work on some open-source projects, ranging from small to large.
What you’ll be doing – your role
Duties and tasks are varied and complex, and may require independent judgement; you should be fully competent in your own area of expertise. Some of the responsibilities associated with the role include:
- Improve CI/CD tooling
- Implement and improve monitoring and alerting
- Help support daily operations through the use of automation and assist in building a DevOps culture with our engineers for a better all-around software development and deployment experience
- Develop and maintain solutions for highly resilient services and infrastructure
- Implement automation to help deploy our services and maintain their operational health
- Contribute to the understanding of how our services are being used and help plan the capacity needs for future growth
What do we expect – experience and skills:
- Bachelor’s degree in Computer Science or related technical field, involving coding
- 3+ years of experience running large-scale customer-facing services
- 2-3 years of DevOps experience
- A strong desire and aptitude for system automation defines success in this role
- Linux experience, including expertise in system installation, configuration, administration, troubleshooting
- Experience with cloud-based providers such as AWS
- Experience with Kubernetes
- Experience with web-based APIs and RESTful services
- Experience with Configuration Management and Infrastructure as Code platforms (Terraform)
- Experience with at least one scripting language (Python, Bash, JavaScript)
- Methodical approach to troubleshooting and documenting issues
- Experience in Docker orchestration and management
- Experience with implementing and maintaining CI/CD pipelines (mainly GitHub Actions and ArgoCD)
- Experience in implementing comprehensive monitoring and logging solutions using Grafana, Prometheus, and Loki.
DESIRED SKILLS AND EXPERIENCE
Strong analytical and problem-solving skills
Ability to work independently, learn quickly and be proactive
3-5 years overall and at least 1-2 years of hands-on experience in designing and managing DevOps Cloud infrastructure
Experience must include a combination of:
o Experience working with configuration management tools – Ansible, Chef, Puppet, SaltStack (expertise in at least one tool is a must)
o Ability to write and maintain code in at least one scripting language (Python preferred)
o Practical knowledge of shell scripting
o Cloud knowledge – AWS, VMware vSphere
o Good understanding and familiarity with Linux
o Networking knowledge – Firewalls, VPNs, Load Balancers
o Web/Application servers, Nginx, JVM environments
o Virtualization and containers - Xen, KVM, Qemu, Docker, Kubernetes, etc.
o Familiarity with logging systems - Logstash, Elasticsearch, Kibana
o Git, Jenkins, Jira
Key Responsibilities:
- Work with the development team to plan, execute and monitor deployments
- Capacity planning for product deployments
- Adopt best practices for deployment and monitoring systems
- Ensure that SLAs for performance and uptime are met
- Constantly monitor systems, suggest changes to improve performance and decrease costs.
- Ensure the highest standards of security
Key Competencies (Functional):
- Proficiency in coding in at least one scripting language - Bash, Python, etc.
- Has personally managed a fleet of servers (> 15)
- Understands the different environments: production, deployment and staging
- Has worked with microservice / service-oriented architecture systems
- Has worked with automated deployment systems – Ansible / Chef / Puppet.
- Can write MySQL queries
- Provision Dev, Test and Prod infrastructure using Infrastructure as Code (IaC)
- Good knowledge of Terraform
- In-depth knowledge of security and IAM / Role Based Access Controls in Azure, management of Azure Application/Network Security Groups, Azure Policy, and Azure Management Groups and Subscriptions.
- Experience with Azure compute, storage and networking (GCP experience can also be considered)
- Experience in working with ADLS Gen2, Databricks and Synapse Workspace
- Experience supporting cloud development pipelines using Git, CI/CD tooling, Terraform and other Infrastructure as Code tooling as appropriate
- Configuration Management (e.g. Jenkins, Ansible, Git, etc.)
- General automation including Azure CLI, or Python, PowerShell and Bash scripting
- Experience with Continuous Integration/Continuous Delivery models
- Knowledge of and experience in resolving configuration issues
- Understanding of software and infrastructure architecture
- Experience in PaaS, Terraform and AKS
- Monitoring, alerting and logging tools, and build/release processes
- Understanding of computing technologies across Windows and Linux
This company is a network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their cloud offering and capitalize on the cloud. We have our own IoT/AI platform and we provide professional services on that platform to build custom clouds for their IoT devices. We also build mobile apps and run 24x7 DevOps/site reliability engineering for our clients.
We are looking for very hands-on SRE (Site Reliability Engineering) engineers with 3 to 6 years of experience. The person will be part of a team that is responsible for designing and implementing automation from scratch for medium to large scale cloud infrastructure and providing 24x7 services to our North American / European customers. This also includes ensuring ~100% uptime for 50+ internal sites. The person is expected to deliver with both high speed and high quality and to work 40 hours per week (~6.5 hours per day, 6 days per week) in shifts that rotate every month.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ years of hands-on experience setting up and troubleshooting Linux environments, with the ability to write shell scripts for any given requirement.
- 1+ years of hands-on experience setting up and configuring AWS or GCP services from scratch and maintaining them.
- 1+ years of hands-on experience setting up and configuring Kubernetes and EKS and ensuring high availability of container orchestration.
- 1+ years of hands-on experience setting up CI/CD from scratch in Jenkins and GitLab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Min 3 years of experience as an SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2 or Build & Deploy.
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We don't believe in locking people in with long notice periods. You will stay here because you love the company. We have only a 15-day notice period.
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for applications health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CI/CD
- Observability (Prometheus/New Relic/ELK)
- Cloud Platforms (AWS/Azure)
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services and infrastructure, and working with stakeholders to fix them.
● Managing Outages and doing detailed RCA with developers and identifying ways to avoid that situation.
● Managing/Automating upgrades of the infrastructure services.
● Automate toil work.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and a good understanding of the various Linux kernel subsystems (memory, storage, network, etc.).
● Extensive experience in DNS, TCP/IP, UDP, gRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as Code tools such as Terraform, and Configuration Management tools such as Chef, Puppet, SaltStack, Ansible.
● Expertise in Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo etc.
● Experience in managing and deploying containerized environments using Docker, Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike, Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platform tech stacks like Hadoop, Hive, Presto, etc. is a plus.
About Us:
100ms is building a Platform-as-a-Service for developers integrating video-conferencing experiences into their apps. Our SDKs enable developers to add gold standard audio-video quality conferencing with much faster shipping times.
We are a team uniquely placed to work on this problem. We have built world-record scale live video infrastructure powering billions of live video minutes in a day. We are a remote-first global team with engineers who've built video teams at Facebook and Hotstar.
As part of the infrastructure team, you will be mainly responsible for looking after the cloud infrastructure.
You Will Be:
- Building and setting up new development tools and infrastructure
- Understanding the needs of stakeholders and conveying this to developers
- Driving centralized solutions like logging, rate limiting, service discovery
- Working on ways to automate and improve development and release processes
- Ensuring that systems are safe and secure against cybersecurity threats
You Have:
- Bachelor's degree or equivalent practical experience
- 4 years of professional software development experience, or 2 years with an advanced degree
- Expertise in managing large-scale cloud infrastructure, preferably AWS and Kubernetes
- Experience in developing applications using programming languages like Python, Golang and Ruby
- Hands-on experience with Prometheus, Grafana, Fluentd, Splunk, etc.
Good To Have:
- Knowledge of Terraform, Chef, Helm, etc.
- Ability to take on complex and ambiguous problems
- Strong inclination to keep up to date with the latest trends, learn new concepts, contribute to open-source projects, and discuss ideas in internal or external forums
You Will Gain:
- You'll be part of a small team at a fast-growing engineering-first startup
- You'll work with engineers across the globe with experience at Facebook and Hotstar
- You can grow as an individual contributor or as a team leader - freedom to set your own goals
- You'll work on problems at the cutting-edge of real-time video communication technology at massive scale
- 5+ years hands-on experience with designing, deploying and managing core AWS services and infrastructure
- Proficiency in scripting using Bash, Python, Ruby, Groovy, or similar languages
- Experience in source control management, specifically with Git
- Hands-on experience in Unix/Linux and bash scripting
- Experience building and managing Helm-based build and release CI/CD pipelines for Kubernetes platforms (EKS, OpenShift, GKE)
- Strong experience with orchestration and config management tools such as Terraform, Ansible or CloudFormation
- Ability to debug and analyze issues leveraging tools like AppDynamics, New Relic and Sumo Logic
- Knowledge of Agile Methodologies and principles
- Good writing and documentation skills
- Strong collaborator with the ability to work well with core teammates and our colleagues across STS
Objectives of this Role
- Improve reliability, quality, and time-to-market of our suite of software solutions
- Run the production environment by monitoring availability and taking a holistic view of system health
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
- Provide primary operational support and engineering for multiple large distributed software applications
- Participate in system design consulting, platform management, and capacity planning
- Languages: Python, Java, Ruby DSL, Bash
- Databases: MySQL, Cassandra, Elasticsearch
- Deployment: AWS CloudFormation
Essential Criteria:
- 8 or more years administering production Linux systems in a 24x7 environment
- 3 or more years’ experience in a DevOps/ SRE role as an engineer or technical lead
- At least 1 year of team leadership experience
- Significant knowledge of Amazon Web Services (CLI/APIs, EC2, EBS, S3, VPCs, IAM, AWS Lambda)
- Experience deploying services into containerized orchestration environments such as Kubernetes
- Experience with infrastructure automation tools like CloudFormation, Terraform, etc.
- Experience with at least one of Python, Bash, Ruby, or equivalent
- Experience creating and managing CI/CD pipelines with tools like Jenkins or Spinnaker
- Familiar with version control using Git
- Solid understanding of common security principles
Nice to Have:
- Preference for hands on experience with Serverless Architecture, Kubernetes and Docker
- Strong experience with open-source configuration management tools
- Managing distributed systems spanning multiple AWS regions / data-centers
- Experience with bootstrapping solutions
- Open source contributor
- We’re committed to client success: There are over 6,200 brand and retail websites in the Bazaarvoice network. Our clients represent some of the world’s leading companies across a wide range of industries including retail, apparel, automotive, consumer electronics and travel.
- We’re leaders in consumer-generated content: Each month, more than one billion consumers view and share authentic consumer-generated content, such as ratings and reviews, curated photos, social posts and videos, about products in our network. Thousands upon thousands of reviews are added to the Bazaarvoice network every day.
- Our network delivers: Network analytics provide insights that help marketers and advertisers provide more engaging experiences that drive brand awareness, consideration, sales, and loyalty.
- We’re a great place to work: We pride ourselves on our unique culture. Join a company that values passion, innovation, authenticity, generosity, respect, teamwork, and performance.