SRE - DevOps Technical Lead

at Srijan Technologies

Posted by Adyasha Satpathy
Remote only
5 - 12 yrs
₹20L - ₹32L / yr
Full time
Skills
Kubernetes
Docker
Ansible
Terraform
Amazon Web Services (AWS)
Jenkins
CI/CD
Monitoring
Linux/Unix
DevOps
Azure

SRE - Tech Lead (DevOps):

Location: Permanent Work From Home Option
Notice: Candidates with a notice period of 30 days or less are preferred

SRE-DevOps- Tech Lead - JD:

 

Srijan is hiring for Site Reliability Engineering (SRE). We are looking for an SRE/DevOps Tech Lead or Sr. Tech Lead with strong automation skills and a good understanding of how to build and run secure, reliable platforms for cloud-native applications. The detailed job description follows for reference.



Minimum Experience: 6+ years in DevOps/SRE

Permanent WFH option

Job Description:-

The focus of this role is to build scalable, resilient, secure infrastructure for cloud-native applications while automating every mundane task you can think of, and to build observability dashboards, set up alerts, and so on to give relevant stakeholders visibility. In a nutshell: “You are the keepers of Production environments”. You must be a problem solver who can multitask and who brings strong collaboration and communication skills.



Key Responsibilities:-

  • Proactively monitor and review application performance

  • Handle on-call and emergency support

  • Ensure software has good logging and diagnostics

  • Create and maintain operational runbooks

  • Contribute to solution design and evaluate technical debt

  • Set the right practices for a well-defined architecture and to minimize toil

  • Own SLI and SLO configuration in line with the error budget

  • Maintain production services through measuring and monitoring availability, latency, and overall system health.

  • Practice sustainable incident response and blameless postmortems.

  • Do not be afraid to contribute changes back to the software engineering team to improve the systems.

  • Manage the delivery pipeline into production.

  • Mentor junior members on a regular basis

  • Troubleshoot issues with web applications

  • Understanding of security principles and best practices

  • Ensuring that critical data is backed up

  • Configure monitoring systems, including infrastructure monitoring and Application Performance Monitoring systems such as New Relic.

  • Ensuring that web application infrastructure is built

  • Act as Customer Technical Advocate and negotiate well with peers on technical fronts.

  • Be flexible enough to work in different shifts to meet urgent business requirements

  • Handle multiple global clients on the technical front and generate the reports needed to represent the health of SRE delivery.



Skills/Experience:-

  • A key skill of an SRE Tech Lead is deep knowledge of the application, the code, and how it runs, is configured, and scales. That knowledge is what makes them so valuable at monitoring and supporting it as site reliability engineers.

  • System administration, security, and networking

  • The SRE Tech Lead is expected to have a good understanding of system administration (Linux or Windows) and networking.

  • Essential commands

  • User and Group Management

  • Knowledge of networking concepts (DNS, TCP/IP, and Firewalls)

  • Service Configuration

  • Storage Management

  • Good grasp of fundamental security concepts

  • Good understanding of infrastructure as code principles.

  • Knowledge of a scripting language such as Bash

  • Ability to configure infrastructure using a Configuration Management technology such as Puppet, Chef, or Ansible.

  • Familiarity with Jenkins or any other CI/CD tool

  • Proficiency in a high-level programming language such as Python or Go.

  • Understanding of container technologies such as Docker and Kubernetes

  • 2+ years of hands-on experience with container orchestration technologies such as ECS, EKS, AKS, or Kubernetes would be beneficial.

  • Experience using Terraform and other IaC tools to deploy cloud infrastructure.







Cloud technologies:-

  • Experience designing available, cost-efficient, fault-tolerant, and scalable distributed systems on AWS/Azure

  • Hands-on experience using compute, networking, storage, and database AWS/Azure services

  • 4+ years of hands-on experience with AWS/Azure deployment and management services

  • Ability to identify and define technical requirements for an AWS/Azure-based application

  • Ability to identify which AWS/Azure services meet a given technical requirement

  • Knowledge of recommended best practices for building secure and reliable applications on the AWS/Azure platform

  • An understanding of the AWS/Azure global infrastructure

  • An understanding of network technologies as they relate to AWS/Azure

  • An understanding of the security features and tools that AWS/Azure provides and how they relate to traditional services







 

About Srijan Technologies


Srijan is today the largest pure-play Drupal agency in Asia. Srijan specializes in building high-traffic websites and complex web applications in Drupal and has been serving clients across the USA, Asia, Europe, Australia, and the Middle East.

Srijan Technologies is a 17-year-old technology services firm.

For a large part of its life, Srijan has specialised in building content management systems, with expertise in PHP-based open-source CMSs, specifically Drupal. In recent years Srijan has diversified into
i) Data Engineering using NodeJS and Python,
ii) Data Science -- Analytics and Machine Learning and
iii) API Management using APIGEE.

Services offered by Srijan:-  

 

Digital experience brings content management systems that mirror the way your organization should work. 

Product Engineering bridges the gap between concept and market, and

Platform Modernisation will create modular, flexible infrastructures for your business that anticipate change.



 
Founded: 2002
Type: Products & Services
Size: 100-1000 employees
Stage: Profitable

Similar jobs

Site Reliability Engineer

at Vonage (An Ericsson Company)

Agency job
via AVI Consulting LLP
Terraform
Chef
Ansible
Docker
Kubernetes
CI/CD
KMS
HashiCorp Vault
Grafana
ELK
Datadog
Remote only
4 - 12 yrs
₹15L - ₹25L / yr
www.vonage.com

Site Reliability Engineer (SRE)
Vonage Engineering Mission: Vonage is the emerging leader in the $100B+ cloud communications platform (CPaaS) market.

Customers like Airbnb, Viber, WhatsApp, Snapchat, and many others depend on our APIs and SDKs to connect with their customers all over the world. As businesses continue to shift to a real-time, customer-centric communications model, we are experiencing a time of impressive growth.

Why this role matters:
Vonage, a leader in cloud communications, is looking to build a new SRE team in Bangalore.

We believe that there shouldn’t be walls between operations and development and we have embraced the DevOps movement.

As a Site Reliability Engineer, you will work as part of the development team to build automation and tools to deploy, monitor, and maintain the platform's health and its target SLOs and SLAs.

What you'll do
● Lead the effort in ensuring reliability of the platform.
● Create software and tooling that improves performance, stability, and reliability of the platform.
● Ability to work as part of a Development Team.
● Monitor Application Metrics to help with improving software performance.
● Build solutions that are highly resilient, scalable, and secure.
● Have a wide breadth of knowledge from software, infrastructure, and security.
● Adopt best practices and champion an engineering culture emphasizing Agile.
What's required for application
● Proven experience building, supporting, and architecting high-availability cloud infrastructure.
● Experience working on monitoring, logging, and alerting solutions and related tools.
● Experience with tooling such as Terraform, Ansible, Docker, Kubernetes, and Chef.
● Fluent and comfortable working with Cloud Infrastructure.
● Ability to read, write, and troubleshoot software code.
● Good understanding of CI/CD tools.
● Champion of DevSecOps using tools such as HashiCorp Vault, KMS, and Secrets Manager.
● Experience with software development, algorithms, data structures, and systems design.
● Understand monitoring tools such as DataDog, ELK, and Grafana.
● Bachelor's degree (or higher) in Computer Science and/or related work experience.

Nice to have, but not required
● Working knowledge of other AWS services like Glacier, Elastic Container Service (ECS), Elastic MapReduce (EMR), DynamoDB, etc.
● Automation and Orchestration tools such as Jenkins
● Ruby or Java development skills
● Data Pipeline knowledge, especially with tools like MapReduce, Kafka and ELK stack
Job posted by
Ashesh Shah

Platform and Infra Engineer SDE3

at Lummo

Founded 2019  •  Product  •  100-500 employees  •  Raised funding
Kubernetes
Cloud Native
DevOps
Infrastructure
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Python
Go Programming (Golang)
Remote only
5 - 7 yrs
₹20L - ₹40L / yr

Role: Platform and Infrastructure Engineer SDE3

Title: Platform and Infrastructure Engineer SDE3

Location: We are open to candidates working from anywhere in India/across the globe. We are fully remote.

About Us:

Lummo (formerly Bukukas) is a SaaS startup seeking to empower entrepreneurs and brands in SEA to accelerate their growth and to serve their customers by giving them the best technology and partner solutions. Lummo offers localized solutions made for SEA, thereby shining the spotlight on entrepreneurs and brands, enabling them to discover all possibilities to grow their business. Lummo was founded as BukuKas in 2019 by serial entrepreneurs Krishnan Menon and Lorenzo Peracchione.


Our Products

The journey started with BukuKas, an app to digitize the physical record-keeping books by enabling micro and small enterprises to record their sales, expenses, and cash transactions at ease using their smartphone.

Lummo's flagship product, LummoSHOP (formerly Tokko), helps growth-oriented entrepreneurs and brands unlock their full potential by helping them build a strong relationship with their consumers by selling to them directly (D2C), maximize operational efficiency across multiple channels & build their own brand online.


Funding:

Backed by top venture capital firms including Sequoia Capital, Tiger Global, CapitalG (Google’s venture fund), Credit Saison, Speedinvest, and other prominent investors and entrepreneurs like Gokul Rajaram (DoorDash), Taavet Hinrikus (Founder, TransferWise), Sandeep Tandon (FreeCharge), Santiago Sosa (Founder, Nuvemshop), Nipun Mehra (Ula, Sequoia), and Amrish Rao (Pinelabs, Citrus pay). 

Having raised more than $150 Million in funding with the backing of marquee global investors, Lummo has built a world-class team with top talent from across the world and is well poised to become a legendary SaaS company that will last beyond our lifetimes.

We received Series C funding in January 2022.


Requirements / Responsibilities

  • You have experience of 7-8 years in building high-performance consumer-facing mobile applications at Product companies of a decent scale.
  • You have experience developing products on Kubernetes and cloud providers like GCP and AWS.
  • You know and have worked on service meshes like Istio, Linkerd.
  • You can write code and have experience writing platform-level components (e.g., in Golang or Python).
  • You have experience with debugging production issues and writing RCAs.
  • You have demonstrable stories of being on-call and how outages have been handled.
  • You understand change management in-depth and are opinionated on the steps to push the change to production.
  • You have worked with Cloud Native (CNCF) technologies.
  • You have worked on Distributed Systems.
  • You are an excellent collaborator & communicator. You know that start-ups are a team sport. You listen to others, aren’t afraid to speak your mind and always try to ask the right questions.
  • You are excited by the prospect of working in a distributed team and company.


What do we offer?

  • The ability for you to make an impact and lay a foundation for the upcoming fin-tech innovations
  • A multicultural and diverse team of colleagues from all over the globe
  • Mission-driven and fast-paced, entrepreneurial environment
  • Competitive salary and flexible leave policy
  • A collaborative and flat company culture


What’s in it for you?

Do you truly want to make a difference and revolutionize the lives of millions of business owners? Do you thrive in an environment where moving at light speed and embracing new challenges every day is essential? If yes, Lummo is the perfect place for you!

Job posted by
Swetha Venugopal

Site Reliability Engineer

at A startup company providing AI based software platforms

Agency job
via zyoin
Site Reliability
DevOps
Docker
Kubernetes
Python
Amazon Web Services (AWS)
Reliability engineering
Remote, Bengaluru (Bangalore)
3 - 7 yrs
₹10L - ₹30L / yr

Who You Are

  • Creative thinker and strong problem solver with meticulous attention to detail
  • Highly organized, creative, motivated, and passionate about achieving results
  • Able to balance multiple tasks and projects effectively and quickly adapt to new situations and technologies
  • Able to work both independently and as part of a team
  • Systematic problem-solver, coupled with a strong sense of ownership and drive

 

What you need

  • 3-7 years of experience as a Site Reliability Engineer or a mix of a software engineer and DevOps.
  • Strong hands-on knowledge of Linux fundamentals, System administration scripting, performance tuning/scalability, troubleshooting.
  • Write great quality code using SOLID principles including unit and integration tests.
  • Hands-on development experience in an object-oriented programming language like Python.
  • Hands-on experience developing task automations
  • Experience using tools to create and manage CI (continuous integration) and CD (continuous delivery) pipelines.
  • Familiarity with software development tools: source code management (SCM systems), code review systems, issue tracking tools, build tools, test frameworks, code quality tools.
  • Experience implementing open-source observability and alerting tools, like Prometheus, Grafana, Cortex, Thanos, Alertmanager, etc.
  • Decent knowledge of networking (VPC, VNet, DNS, etc.) and of the TCP/IP stack, internet routing, and load balancing.
  • Experience working with log and configuration management tools
  • Prior experience of working with AWS, Azure, or GCP is a plus
  • Prior experience of working with Kubernetes, Docker, and containers is a plus
  • Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, Engineers, Product Managers, etc.
  • Documenting your work should be in your DNA

 

What you get

  • A chance to develop and build something (probably from scratch) which you can be proud of
  • Build and Implement modern systems observability solutions including monitoring, alerting, metrics, logging, and APM & distributed tracing.
  • Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity.
  • Maintain business continuity by identifying and driving opportunities to make systems highly resilient and human-free.
  • Closely work with the software engineering team to ensure accurate monitoring and metrics are being built into applications before going to production.
  • Develop and maintain software modules for use and re-use in cloud and on-premise systems automation.
  • Identify process gaps and implement process improvements to increase operational reliability
  • Drive standardization efforts across the services, infrastructure, systems, and practices
  • Develop systems and tools to help the development team uphold reliability principles
Job posted by
RAKESH RANJAN

DATA LEAD

at Innovapptive Inc

Founded 2012  •  Product  •  100-500 employees  •  Profitable
Amazon S3
Amazon EBS
Amazon Web Services (AWS)
Hyderabad
7 - 12 yrs
₹5L - ₹15L / yr

The Role

The Data Lead is responsible for handling the data journey in a product, including data security, data acquisition/retrieval, data massaging, etc.

How You Will Make an Impact:

Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.  

Ensure that Innovapptive products are data-rich and data-efficient.

What You Bring to the Team:

A seasoned data engineer with a solid understanding of how data-rich SAAS products retrieve and consume data. 

To be successful in this role, we believe that you need to possess the following attributes.

  • Bachelor's Degree in IT or Computer Engineering, or an equivalent degree in Computer Science
  • 7-12 years of relevant experience
  • This position addresses cloud data operations and classical database developer needs.
  • Cloud Data Operations: Hands-on experience with cloud data services on AWS (AWS RDS for MySQL and SQL Server) and knowledge of the latest cloud database services such as Aurora Serverless.
  • Hands-on experience in: designing stable, reliable, and effective databases
  • Provisioning cloud (AWS) DB services.
  • Installing DB servers on AWS (IAAS model).
  • Blob storage (S3, EBS, EFS, etc.)
  • Optimizing DB services.
  • Performance tuning, DB service optimization.
  • Building fault-tolerant cloud data services.
  • Experience with NoSQL technologies (DocumentDB, NoSQL), including creating, maintaining, and consuming them on the cloud (AWS)
  • Cloud Data security
  • Hands-on experience with handling large data sets/transactions and operations.
  • Exposure to data analytics and associated tools (Athena)
  • Experience in handling Data Strategies, data life cycles in SAAS products.
  • Exposure to cloud (AWS) networking.
  • Query planning and optimization.
  • SQL
  • Knowledge of GDPR, physical/logical/conceptual data segregation in multi-tenant applications.
  • Data Modeling
  • Enforcing the appropriate security compliance in Customer environments as agreed with the client’s Information Security Council
  • Excellent verbal and written communication skills
Job posted by
Abhilash Pulupula

Observability Systems Engineer

at Top Global Hedge Fund

Agency job
via Bullhorn Consultants
Kubernetes
Apache Kafka
Prometheus
ELK
ELK Stack
Amazon Web Services (AWS)
Linux/Unix
Ansible
Systems analysis and design
Gurugram, Delhi, Noida, Ghaziabad, Faridabad
3 - 8 yrs
₹4L - ₹15L / yr
Experience in Kubernetes as a systems engineer (deployment, troubleshooting, maintenance, Helm charts) and deployment and administration of one or more of: ELK stack, Kafka, Prometheus, or Grafana, with working knowledge of at least one cloud platform (GCP, AWS, or Azure) and a configuration management system (such as Salt or Ansible). A good understanding of networking concepts (architecture, components, protocols) and a solid understanding of OS concepts and Linux internals is a must.
Job posted by
Hemant Singh

Senior Engineer - Cloud Reliability

at Searce Inc

Founded 2004  •  Products & Services  •  100-1000 employees  •  Profitable
DevOps
Terraform
Ansible
Puppet
Reliability engineering
Docker
Software deployment
Application server
IT infrastructure
Technical support
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Pune
5 - 8 yrs
₹10L - ₹17L / yr
Experience:
● 4-8 years experience in Cloud Infrastructure and Operations domains
● Experience with Linux systems and/or Windows servers
● Specialize in one or two cloud deployment platforms: AWS, GCP, Azure
● Hands-on experience with AWS and GCP services (EKS, ECS, EC2, VPC, RDS, Lambda, GKE, Compute Engine)
● Experience with one or more programming languages (Python, JavaScript, Ruby, Java, .NET)
● Good understanding of Apache Web Server, Nginx, MySQL, MongoDB, Nagios
● Logging and Monitoring tools (ELK, Stackdriver, CloudWatch)
● DevOps Technologies
● Knowledge of configuration management tools such as Ansible, Terraform, Puppet, Chef
● Experience working with deployment and orchestration technologies (such as Docker, Kubernetes, Mesos)
Job posted by
Reena Bandekar

Site Reliability Engineer

at Uniphore Software Systems

Founded 2008  •  Product  •  500-1000 employees  •  Raised funding
SRE
Site Reliability Engineer
Reliability engineering
DevOps
Kubernetes
Terraform
Linux/Unix
Amazon Web Services (AWS)
Java
Python
Bengaluru (Bangalore)
5 - 10 yrs
₹25L - ₹40L / yr
Your Responsibilities
  • We are looking for a Senior SRE with a proven track record of success leading complex cloud-hybrid environments. You will have:
  • Strong sense of Being an Owner, Wearing the Customer Shoes, with the ability to Empower Others demonstrated through clear communication and collaboration.
  • Skills to work independently with multiple global teams, developing, configuring, deploying, and operating our global infrastructure on AWS and on-prem.
  • Operational experience in complex distributed and real-time systems, including experience with SLOs/SLAs towards high availability, reliability, and DR goals.
  • DevOps experience in building tools and frameworks, with an understanding of continuous deployment processes.
  • Ability to think at scale, bringing a focus on continuous delivery methodologies from design through deployment and operations.
  • Experience building and managing systems with tools including Kubernetes, Chef/Ansible/Puppet, Kafka, Docker, and Terraform.
Required Skills
  • 5+ years experience in a Software and/or Site Reliability Engineering role
  • Experience writing automation code in GoLang, Python or Java
  • Experience developing and operating large scale distributed systems with Kubernetes and Docker
  • Experience in running real-time, low-latency, highly available applications (Kafka, gRPC, RTP)
  • Experience running public cloud environments on AWS
  • Experience running hybrid clouds and on-prem infrastructures on Red Hat Enterprise Linux / CentOS
  • Bachelor's degree in Engineering, Computer Science, or equivalent experience
  • The ability to lead, partner, and collaborate cross functionally across an engineering organization
Job posted by
Sandesh HS

Site Reliability Engineering

at Coredgeio

Founded 2020  •  Product  •  20-100 employees  •  Raised funding
Reliability engineering
Docker
Kubernetes
DevOps
Site reliability
Cloud Computing
Amazon Web Services (AWS)
VMware vSphere
OpenStack
OpenShift
Google Cloud Platform (GCP)
Remote, Noida, Bengaluru (Bangalore), NCR (Delhi | Gurgaon | Noida)
6 - 11 yrs
₹16L - ₹25L / yr
What are we looking for:
● Research, propose, and evaluate, with a 5-year vision, the architecture, design, technologies, processes, and profiles related to Telco Cloud.
● Participate in the creation of a realistic technical-strategic roadmap of the network to transform it to Telco Cloud and be prepared for 5G.
● Using your deep technical expertise, you will provide detailed feedback to Product Management and Engineering, as well as contribute directly to the platform code base to enhance both the customer experience of the service and the SRE quality of life.
● The individual must be aware of trends in network infrastructure as well as within the network engineering and OSS community. What technologies are being developed or launched?
● The individual should stay current with infrastructure trends in the telco network cloud domain.
● Be responsible for the engineering of Lab and Production Telco Cloud environments, including patches, upgrades, and reliability and performance improvements.
Required Minimum Qualifications: (Education and Technical Skills/Knowledge)
● Software Engineering degree, MS in Computer Science, or equivalent experience
● Years of experience in an SRE, DevOps, development, and/or support-related role
● 0-5 years of professional experience for a junior position
● At least 8 years of professional experience for a senior position
● Unix server administration and tuning: Linux / RedHat / CentOS / Ubuntu
● You have deep knowledge in Networking Layers 1-4
● Cloud / Virtualization (at least two): Helm, Docker, Kubernetes, AWS, Azure, Google Cloud, OpenStack, OpenShift, VMware vSphere / Tanzu
● You have in-depth knowledge of cloud storage solutions on top of AWS, GCP, Azure, and/or on-prem private cloud, such as Ceph, CephFS, GlusterFS
● DevOps: Jenkins, Git, Azure DevOps, Ansible, Terraform
● Backend knowledge: Bash, Python, Go (knowledge of other scripting languages is a plus).
● PaaS-level solutions such as Keycloak for IAM, Prometheus, Grafana, ELK, DBaaS (such as MySQL, Cassandra)
About the Organisation:
The team at Coredge.io is a combination of experienced and young professionals alike, with many years of experience in working with Edge computing, Telecom application development, and Kubernetes. The company has continuously collaborated with the open source community, universities, and major industry players in furthering its goal of providing the industry with an indispensable tool to offer improved services to its customers. Coredge.io has a global market presence with its offices in the US and New Delhi, India.
Job posted by
Abhimanyu Bhatter
site reliability
cloudformation
Terraform
Ansible
Cloud Automation
Software Development
AWS CloudFormation
Algorithms
Data Structures
Python
Powershell
DynamoDB
MySQL
Hyderabad
5 - 11 yrs
₹10L - ₹20L / yr
  • 5+ years of software development or site reliability engineering or equivalent experience
  • Skilled at problem solving, algorithms, and data structures
  • Building tools and scripting frameworks from scratch
  • Working with Cloud Automation tools like CloudFormation, Terraform, CDK, aws-cli
  • Scripting languages like Python, Groovy, PowerShell, Bash, Perl etc.
  • Configuration automation using Ansible or equivalent tools
  • Exposure to Windows, Linux administration skills
  • Project management tools like Jira, Trello
  • Prior experience in dealing with Datastore technologies like Postgres, MySQL, SQL, DynamoDB is desirable
  • Familiarity with basic networking, security and cloud engineering concepts
  • Team player who is eager to help others to succeed through mentoring and leading by example
  • Highly collaborative with effective written and verbal communication skills
Job posted by
Pradeep Kumar Burra

Site Reliability Engineer

at Shuttl

Founded 2015  •  Product  •  100-500 employees  •  Raised funding
Terraform
Kubernetes
Ansible
NCR (Delhi | Gurgaon | Noida)
3 - 6 yrs
₹10L - ₹21L / yr
WHAT WILL I DO?

You will work as a Site Reliability Engineer responsible for the availability, performance, monitoring, and incident response, among other things, of the platforms and services used and owned by Shuttl. The SRE Team works alongside the Engineering team and owns every aspect of service availability as well as disaster recovery and business continuity plans. You will work with other Site Reliability Engineers and report to the Lead of Site Reliability Engineering Team.

HOW DO WE WORK?

Our engineering process is a five step process which consists of phases for planning, developing, testing & profiling, releasing and monitoring. The planning phase consists of documenting of the feature/task to be done followed by various discussions. These discussions cover product, delivery estimates, release plan, monitoring plan, test plans, architecture, code design, technology choices and best practice adoption. The development and testing phase coexist and involve writing code, unit tests, performance tests, profiling, stress testing, code reviews and QA testing. This phase is punctuated with daily scrums and standups. The release phase is largely about managing and communicating the release to customers and internal stakeholders and activating features. The last phase is the monitoring phase where relevant metrics and exceptions are tracked and any critical refinement for the delivered feature is undertaken. This phase culminates with a retrospective.

SREs get involved in this process as early as possible to provide general guidance, recommendations and help with designing the application to be in compliance with community standards such as CNCF and 12 Factor. SRE involvement and influence tends to increase during mid to final stages of development where the application is primed for beta evaluation and all the tooling and instrumentation is finalized.

WHAT SKILLS SHOULD I HAVE?

For this role we expect you to have 3+ years of experience working as a DevOps Engineer or SRE. You should have a good grasp of Unix like systems, access control, networking nuances, process isolation by the means of kernel provided features, distributed applications and algorithms, job schedulers and secret management among other things. At Shuttl we are a big proponent of Immutable infrastructure. All our infrastructure is hosted with Amazon Web Services and we use Hashicorp's Terraform to manage the infrastructure as code. A good handle on AWS and Terraform is therefore a definitive plus. Since SREs are expected to write a lot of code, you are also expected to be skillful in a programming language, preferably Python or Go.
Job posted by
Tanika Monga