SRE - DevOps Technical Lead
Posted by Adyasha Satpathy
5 - 12 yrs
₹20L - ₹32L / yr
Remote only
Skills
Kubernetes
Docker
Ansible
Terraform
Amazon Web Services (AWS)
Jenkins
CI/CD
Monitoring
Linux/Unix
DevOps
Azure

SRE - Tech Lead (DevOps):

Location: Permanent Work From Home Option
Notice: Candidates with a notice period of 30 days or less are preferred

SRE-DevOps- Tech Lead - JD:

 

Srijan is hiring for Site Reliability Engineering (SRE). We are looking for an SRE/DevOps Tech Lead or Sr. Tech Lead with strong automation skills and a good understanding of how to build and run secure, reliable platforms for cloud-native applications. The detailed job description follows.



Minimum Experience: 6+ years in DevOps/SRE

Permanent WFH option

Job Description:-

The focus of this role is to build scalable, resilient, secure infrastructure for cloud-native applications while automating every mundane task you can think of, and to build observability dashboards and set up alerts that give relevant stakeholders visibility. In a nutshell: “You are the keepers of Production environments”. You must be a problem solver who can multitask and who brings strong collaboration and communication skills.



Key Responsibilities:-

  • Proactively monitor and review application performance

  • Handle on-call and emergency support

  • Ensure software has good logging and diagnostics

  • Create and maintain operational runbooks

  • Contribute to solution design and the evaluation of technical debt

  • Set the right practices for a well-defined architecture and to minimize toil.

  • Own SLI and SLO configuration in line with the error budget

  • Maintain production services through measuring and monitoring availability, latency, and overall system health.

  • Practice sustainable incident response and blameless postmortems.

  • Not be afraid to contribute changes back to the Software engineering team to improve the systems.

  • Managing the delivery pipeline into production.

  • Able to mentor junior members on a regular basis

  • Troubleshooting issues with web applications

  • Understanding of security principles and best practices

  • Ensuring that critical data is backed up

  • Configuration of monitoring systems including infrastructure monitoring and Application Performance Monitoring systems such as New Relic.

  • Ensuring that web application infrastructure is built

  • Ability to act as Customer Technical Advocate and negotiate well with peers on technical fronts.

  • Flexible enough to work in different shifts when urgent business requirements demand it

  • Ability to handle multiple global clients on the technical front and generate the reports needed to represent the health of SRE delivery.



Skills/Experience:-

  • A key skill of an SRE Tech Lead is deep knowledge of the application, its code, and how it runs, is configured, and scales. That knowledge is what makes them so valuable in monitoring and supporting it as site reliability engineers.

  • System administration, security, and networking

  • The SRE Tech Lead is expected to have a good understanding of system administration (Linux or Windows) and networking.

  • Essential commands

  • User and Group Management

  • Knowledge of networking concepts (DNS, TCP/IP, and Firewalls)

  • Service Configuration

  • Storage Management

  • Good grasp of fundamental security concepts

  • Good understanding of infrastructure as code principles.

  • Knowledge of a scripting language such as Bash

  • Ability to configure infrastructure using a Configuration Management technology such as Puppet, Chef, or Ansible.

  • Familiarity with Jenkins or any other CI/CD tool

  • Proficiency in a high-level programming language such as Python or Go.

  • Understanding of container technologies such as Docker, Kubernetes

  • 2+ years of hands-on experience with container orchestration technologies such as ECS, EKS, AKS or Kubernetes would be beneficial.

  • Use Terraform and other IaC tools to deploy cloud infrastructure.







Cloud technologies:-

  • Experience designing available, cost-efficient, fault-tolerant, and scalable distributed systems on AWS/Azure

  • Hands-on experience using compute, networking, storage, and database AWS/Azure services

  • 4+ years of hands-on experience with AWS/Azure deployment and management services

  • Ability to identify and define technical requirements for an AWS/Azure-based application

  • Ability to identify which AWS/Azure services meet a given technical requirement

  • Knowledge of recommended best practices for building secure and reliable applications on the AWS/Azure platform

  • An understanding of the AWS/Azure global infrastructure

  • An understanding of network technologies as they relate to AWS/Azure

  • An understanding of the security features and tools that AWS/Azure provides and how they relate to traditional services







 


About Srijan Technologies

Founded: 2002
Size: 100-1000
Stage: Profitable
About
Srijan has been empowering enterprises to modernize their digital systems for the last 15+ years. A global leader in open-source technologies, our industry forte includes, but is not limited to, media, travel, healthcare, telecom, retail, pharmaceutical, eCommerce and the financial sector. Staffed by over 300 engineers, our teams are based in India, USA, Japan, Philippines, United Kingdom, Australia, and Germany. With 75+ Acquia Certified Drupal engineers, we are among the top 3 certified companies globally in the Drupal community. Our diverse clientele includes Estee Lauder, FlightCentre, Hindawi, Vodafone, Crain, Diversey, PTT Global, and names in the Fortune 500. We are an Advanced AWS consulting partner, Acquia preferred partner, and Apigee consulting partner.

Similar jobs

DeepIntent
Posted by Indrajeet Deshmukh
Pune
3 - 6 yrs
Best in industry
Kubernetes
Git
MySQL
Amazon Web Services (AWS)
CI/CD

With a core belief that advertising technology can measurably improve the lives of patients, DeepIntent is leading the healthcare advertising industry into the future. Built purposefully for the healthcare industry, the DeepIntent Healthcare Advertising Platform is proven to drive higher audience quality and script performance with patented technology and the industry’s most comprehensive health data. DeepIntent is trusted by 600+ pharmaceutical brands and all the leading healthcare agencies to reach the most relevant healthcare provider and patient audiences across all channels and devices. For more information, visit DeepIntent.com or find us on LinkedIn.


We are seeking a skilled and experienced Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a minimum of 3 years of hands-on experience in managing and maintaining production systems, with a focus on reliability, scalability, and performance. As an SRE at DeepIntent, you will play a crucial role in ensuring the stability and efficiency of our infrastructure, as well as contributing to the development of automation and monitoring tools.


Responsibilities:

  • Deploy, configure, and maintain Kubernetes clusters for our microservices architecture.
  • Utilize Git and Helm for version control and deployment management.
  • Implement and manage monitoring solutions using Prometheus and Grafana.
  • Work on continuous integration and continuous deployment (CI/CD) pipelines.
  • Containerize applications using Docker and manage orchestration.
  • Manage and optimize AWS services, including but not limited to EC2, S3, RDS, and AWS CDN.
  • Maintain and optimize MySQL databases, Airflow, and Redis instances.
  • Write automation scripts in Bash or Python for system administration tasks.
  • Perform Linux administration tasks and troubleshoot system issues.
  • Utilize Ansible and Terraform for configuration management and infrastructure as code.
  • Demonstrate knowledge of networking and load-balancing principles.
  • Collaborate with development teams to ensure applications meet reliability and performance standards.


Additional Skills (Good to Know):

  • Familiarity with ClickHouse and Druid for data storage and analytics.
  • Experience with Jenkins for continuous integration.
  • Basic understanding of Google Cloud Platform (GCP) and data center operations.


Qualifications:

  • Minimum 3 years of experience in a Site Reliability Engineer role or similar.
  • Proven experience with Kubernetes, Git, Helm, Prometheus, Grafana, CI/CD, Docker, and microservices architecture.
  • Strong knowledge of AWS services, MySQL, Airflow, Redis, AWS CDN.
  • Proficient in scripting languages such as Bash or Python.
  • Hands-on experience with Linux administration.
  • Familiarity with Ansible and Terraform for infrastructure management.
  • Understanding of networking principles and load balancing.


Education:

Bachelor's degree in Computer Science, Information Technology, or a related field.


DeepIntent is committed to bringing together individuals from different backgrounds and perspectives. We strive to create an inclusive environment where everyone can thrive, feel a sense of belonging, and do great work together.

DeepIntent is an Equal Opportunity Employer, providing equal employment and advancement opportunities to all individuals. We recruit, hire and promote into all job levels the most qualified applicants without regard to race, color, creed, national origin, religion, sex (including pregnancy, childbirth and related medical conditions), parental status, age, disability, genetic information, citizenship status, veteran status, gender identity or expression, transgender status, sexual orientation, marital, family or partnership status, political affiliation or activities, military service, immigration status, or any other status protected under applicable federal, state and local laws. If you have a disability or special need that requires accommodation, please let us know in advance.

DeepIntent’s commitment to providing equal employment opportunities extends to all aspects of employment, including job assignment, compensation, discipline and access to benefits and training.

CodeCraft Technologies Private Limited
Posted by Priyanka Praveen
Bengaluru (Bangalore), Mangalore
7 - 12 yrs
Best in industry
CI/CD
GitHub
DevOps

Position: SRE/ DevOps

Experience: 6-10 Years

Location: Bengaluru/Mangalore

 

CodeCraft Technologies is a multi-award-winning creative engineering company offering design and technology solutions on mobile, web and cloud platforms.

 

We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a crucial role in ensuring the reliability, availability, and performance of our systems and applications. You will work closely with the development team to build and maintain scalable infrastructure, implement best practices in CI/CD, and contribute to the overall stability of our technology stack.

 

 

Roles and Responsibilities:

·       CI/CD and DevOps:

o  Implement and maintain robust Continuous Integration/Continuous Deployment (CI/CD) pipelines to ensure efficient and reliable software delivery.

o  Collaborate with development teams to integrate DevOps principles into the software development lifecycle.

o  Experience with pipelines such as GitHub Actions, GitLab, Azure DevOps, and CircleCI is a plus.

·       Test Automation:

o  Develop and maintain automated testing frameworks to validate system functionality, performance, and reliability.

o  Collaborate with QA teams to enhance test coverage and improve overall testing efficiency.

·       Logging/Monitoring:

o  Design, implement, and manage logging and monitoring solutions to proactively identify and address potential issues.

o  Respond to incidents and alerts to ensure system uptime and performance.

·       Infrastructure as Code (IaC):

o  Utilize Terraform (or other tools) to define and manage infrastructure as code, ensuring scalability, security, and consistency across environments.

·       Elastic Stack:

o  Implement and manage Elastic Stack (ELK) for log and data analysis to gain insights into system performance and troubleshoot issues effectively.

·       Cloud Platforms:

o  Work with cloud platforms such as AWS, GCP, and Azure to deploy and manage scalable and resilient infrastructure.

o  Optimize cloud resources for cost efficiency and performance.

·       Vulnerability Management:

o  Conduct regular vulnerability assessments and implement measures to address and remediate identified vulnerabilities.

o  Collaborate with security teams to ensure a robust security posture.

·       Security Assessment:

o  Perform security assessments and audits to identify and address potential security risks.

o  Implement security best practices and stay current with industry trends and emerging threats.

o  Experience with tools such as GCP Security Command Center and AWS Security Hub is a plus.

·       Third-Party Hardware Providers:

o  Collaborate with third-party hardware providers to integrate and support hardware components within the infrastructure.


Desired Profile:

·       The candidate should be willing to work in the EST time zone, i.e. from 6 PM to 2 AM.

·       Excellent communication and interpersonal skills

·       Bachelor’s Degree

·       Certifications related to this field shall be an added advantage.


Nvizion Solutions
Posted by Anshita Abhilasha
Remote only
3 - 6 yrs
₹6L - ₹15L / yr
DevOps
Google Cloud Platform (GCP)
Amazon Web Services (AWS)
Linux/Unix
JIRA

Nvizion Solutions is hiring for the position of Site Reliability Engineer.

 

If interested, kindly share your resume along with contact details.

 

 

Title: Site Reliability Engineer

No. of job openings: 2

Location: Gurgaon/Hyderabad/Bengaluru/Mumbai/Chennai (Remote location)

Remuneration: Best in the Industry

 

 

·      Experience required: 2 to 4 yrs in the industry

·      Ensuring overall system reliability

·      Add automation and alerting in the system

·      Providing Troubleshooting support

·      Cross-team communication: working closely with the Product and Customer Success teams.

·      Proactive support to ensure the system returns to a healthy state

·      R&D for new tools/technologies to support product and support team

·      Good verbal/written communication to connect with the client.

·      Good team player with a zeal to learn new technologies.

·      The candidate will be part of the team responsible for 24x7 monitoring of a distributed global platform.

  • Linux Scripting
  • CI/CD knowledge (Jenkins/Bitbucket Pipelines/GitOps)
  • Version Control
  • Cloud platform knowledge (GCP/AWS/Azure/DigitalOcean)
  • Docker, Kubernetes

 

Smarsh
Posted by Nichell Dsouza
Bengaluru (Bangalore)
9 - 15 yrs
₹40L - ₹50L / yr
Reliability engineering
Kubernetes
IT infrastructure

Company Description

Smarsh is the leader in communications compliance, archiving, and analytics. We provide compliance across the broadest set of communications channels with insights on what’s being captured. Smarsh customers manage over 500 million daily conversations across 80 channels and growing. Customers include the top 10 U.S., top 8 European, top 5 Canadian, and top 3 Asian banks. The Smarsh advantage is customers stay ahead of compliance and uncover patterns and relationships hidden within their data.

At Smarsh, we’ve been helping our customers manage new forms of communication since 1998. We work closely with regulators including the SEC, FINRA, IIROC, and the PRA and FCA, and with our customers, to ensure that they understand the capabilities of today’s technology and that our platform meets their most stringent requirements. Our products include Connected Capture, Connected Archive, Web Archive & Business Solutions.

 

About the team

Are you an SRE with excellent Observability, Containerization and Orchestration skills? As a Site Reliability Engineer (SRE) in the Smarsh SaaS Operations team, you'll be part of a team that measures and improves production performance and reliability through sustainable engineering practices for our suite of applications. Toil will be your number one enemy, observability your closest friend, and your mission will be to drive operational burden as close to zero as you can.

Responsibilities

  • Responsible for technical direction at the platform solutions level. Is able to weigh the pros and cons of various solutions and credibly argue for the best path
  • Work closely with Product Management and the rest of the engineering team to define features and their implementations with careful attention to quality, scalability, and maintainability
  • Can break down complex technical solutions into abstractions that the rest of the team can understand
  • Can investigate and solve complex bugs, performance, and scalability issues
  • Collaborates with multiple agile teams to ensure their solutions integrate effectively
  • Track work in ticketing system (JIRA)
  • Participate in Pull Request reviews. Provide and receive feedback to continuously improve.
  • Other duties as assigned.

Desired skills & experience

  • 10+ years of industry experience
  • Master's in CS or equivalent
  • Must have experience in Azure or AWS, either running some large-scale app there or migrating to Azure/AWS. 
  • Experience operating Cloud Foundry in production environments 
  • Experience managing CI/CD systems (Concourse, Jenkins, TravisCI etc.) 
  • Experience deploying and/or operating ELK stack 
  • Experience with container technologies and orchestration platforms (Docker, Kubernetes, Cloud Foundry) 
  • Experience working with monitoring and observability tools (We use Datadog and New Relic) 
  • Familiarity with working with PostgreSQL and MongoDB 
  • Background working in a multi-platform environment (Linux, Windows) 
  • Experience with running on a cloud platform, AWS preferred (S3, RDS, SQS) 
  • Familiarity with Agile/Scrum/Kanban methodologies 
  • Familiarity with programming/scripting languages (e.g. Python, Bash, PowerShell, Go)

Additional Skills

  • Expert programming skills in relevant languages
  • Exceptional analytical and problem-solving skills
  • Strong communication and collaboration skills
  • Deep understanding of modern software architecture
  • Deep domain knowledge of the industry, platform, and existing processes
  • Fault-tolerant design & maintenance
  • Knowledge and understanding of modern software programming/engineering.
  • Product delivery lifecycle - requirement refinement through ops

 

Why Smarsh?

Ready to join a thriving tech company that’s redefining digital archiving and business intelligence?

Smarsh is the leading comprehensive archiving platform. Recognized as one of today’s fastest growing companies in the U.S., Smarsh delivers innovative cloud-based solutions that help organizations manage and enforce flexible and secure records retention and compliance strategies for electronic communications, including social media and enterprise social networks (Yammer, Chatter, Facebook, LinkedIn and more).

Our motto is ‘People First. Inspire Confidence. Embrace the Impossible.’ We hire lifelong learners who have a passion for their discipline and a track record of excellence. To learn more about us, visit www.smarsh.com/careers

 


Remote only
3 - 10 yrs
₹5L - ₹15L / yr
Python
Amazon Web Services (AWS)
MongoDB
MySQL
Django

A network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their Cloud Offering and Capitalize on Cloud. We have our own IoT/AI platform and we provide professional services on that platform to build custom clouds for their IoT devices. We also build mobile apps and run 24x7 DevOps/site reliability engineering for our clients.

We are looking for a friendly, dependable, and very hands-on technical professional with plenty of experience as a backend & cloud engineer to provide site reliability services to our internal teams and end customers. We expect you to deliver top quality at high speed. You must have experience developing and designing amazing UI screens.

 

This person MUST have:

  • BE Computer Science or equivalent
  • Cloud app development experience.
  • Strong Troubleshooting and debugging skills
  • A strong passion for writing simple, clean, and efficient code.
  • 3 years of experience with the Django framework and other backend technologies.
  • Knowledge of NodeJS
  • Experience with building, modifying, and extending API endpoints (REST or GraphQL) for data retrieval and persistence.
  • Understand how to use a database like Postgres (preferred choice), SQLite, MongoDB, MySQL.
  • Experience creating high-performance applications.
  • Experience with messaging and broker tools - RabbitMQ, MQTT
  • Experience with SQL and NoSQL databases
  • Experience with the full software development life cycle, including requirements collection, design, implementation, testing, and operational support.
  • Knowledge of web services
  • Proficient understanding of code versioning tools such as Git.
  • Hands-on experience deploying and managing infrastructure with CloudFormation/Terraform
  • Experience managing AWS infrastructure.
  • Hands-on experience in Linux environment.
  • Basic understanding of Kubernetes/Docker orchestration.
  • Manage existing infrastructure, pipelines, and engineering tools (on-prem or AWS) for the engineering team (build servers, Jenkins nodes, etc.)
  • Experience with scrum or other agile software development methodology.
  • Excellent verbal and written communication, teamwork, decision making and influencing skills.
  • Handle customer calls/emails regarding technical issues for end-users.
  • Strong communication skills
  • Attention to detail.

 

 

Experience:

  • Minimum 3 years of experience

 

Location:

  • Ahmedabad office, or
  • Work from home



Timings:

  • 40 hours a week with a rotational shift every month.

Position:

  • Full time/Direct
  • We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives, etc.
  • We don't believe in locking in people with large notice periods. You will stay here because you love the company. We have only a 30-day notice period.
Remote, Bengaluru (Bangalore)
3 - 7 yrs
₹10L - ₹30L / yr
Site Reliability
DevOps
Docker
Kubernetes
Python

Who You Are

  • Creative thinker and strong problem solver with meticulous attention to detail
  • Highly organized, creative, motivated, and passionate about achieving results
  • Able to balance multiple tasks and projects effectively and quickly adapt to new situations and technologies
  • Able to work both independently and as part of a team
  • Systematic problem-solver, coupled with a strong sense of ownership and drive

 

What you need

  • 3-7 years of experience as a Site Reliability Engineer or a mix of a software engineer and DevOps.
  • Strong hands-on knowledge of Linux fundamentals, System administration scripting, performance tuning/scalability, troubleshooting.
  • Write great quality code using SOLID principles including unit and integration tests.
  • Hands-on development experience in an object-oriented programming language like Python.
  • Hands-on experience developing task automations
  • Experience using tools to create and manage CI (continuous integration) and CD (continuous delivery) pipelines.
  • Familiarity with software development tools: source code management (SCM systems), code review systems, issue tracking tools, build tools, test frameworks, code quality tools.
  • Experience implementing open-source observability and alerting tools, like Prometheus, Grafana, Cortex, Thanos, Alertmanager, etc.
  • Decent knowledge of networking (VPC, VNet, DNS, etc.) and of the TCP/IP stack, internet routing, and load balancing.
  • Experience working with log and configuration management tools
  • Prior experience of working with AWS, Azure, GCP is a plus
  • Prior experience of working with Kubernetes, Docker and containers is a plus
  • Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, Engineers, Product Managers, etc.
  • Documenting your work should be in your DNA

 

What you get

  • A chance to develop and build something (probably from scratch) which you can be proud of
  • Build and Implement modern systems observability solutions including monitoring, alerting, metrics, logging, and APM & distributed tracing.
  • Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity.
  • Maintain business continuity by identifying and driving opportunities to make systems highly resilient and human-free.
  • Closely work with the software engineering team to ensure accurate monitoring and metrics are being built into applications before going to production.
  • Develop and maintain software modules for use and re-use in cloud and on-premise systems automation.
  • Identify process gaps and implement process improvements to increase operational reliability
  • Drive standardization efforts across the services, infrastructure, systems, and practices
  • Develop systems and tools that help the development team uphold reliability principles
Coredgeio
Posted by Abhimanyu Bhatter
Remote, Noida, Bengaluru (Bangalore), NCR (Delhi | Gurgaon | Noida)
6 - 11 yrs
₹16L - ₹25L / yr
Reliability engineering
Docker
Kubernetes
DevOps
Site reliability
What we are looking for:
● Research, propose and evaluate, with a 5-year vision, the architecture, design, technologies, processes and profiles related to Telco Cloud.
● Participate in the creation of a realistic technical-strategic roadmap of the network to transform it to Telco Cloud and be prepared for 5G.
● Using your deep technical expertise, you will provide detailed feedback to Product Management and Engineering, as well as contribute directly to the platform code base to enhance both the customer experience of the service and the SRE quality of life.
● The individual must be aware of trends in network infrastructure as well as within the network engineering and OSS community. What technologies are being developed or launched?
● The individual should stay current with infrastructure trends in the telco network cloud domain.
● Be responsible for the engineering of Lab and Production Telco Cloud environments, including patches, upgrades, and reliability and performance improvements.
Required Minimum Qualifications: (Education and Technical Skills/Knowledge)
● Software Engineering degree, MS in Computer Science or equivalent experience
● Years of experience in an SRE, DevOps, Development and/or Support-related role
● 0-5 years of professional experience for a junior position
● At least 8 years of professional experience for a senior position
● Unix server administration and tuning: Linux / RedHat / CentOS / Ubuntu
● Deep knowledge of Networking Layers 1-4
● Cloud / Virtualization (at least two): Helm, Docker, Kubernetes, AWS, Azure, Google Cloud, OpenStack, OpenShift, VMware vSphere / Tanzu
● In-depth knowledge of cloud storage solutions on top of AWS, GCP, Azure and/or on-prem private cloud, such as Ceph, CephFS, GlusterFS
● DevOps: Jenkins, Git, Azure DevOps, Ansible, Terraform
● Backend knowledge: Bash, Python, Go (knowledge of other scripting languages is a plus).
● PaaS-level solutions such as Keycloak for IAM, Prometheus, Grafana, ELK, DBaaS (such as MySQL, Cassandra)
About the Organisation:
The team at Coredge.io is a combination of experienced and young professionals with many years of experience in Edge computing, Telecom application development and Kubernetes. The company has continuously collaborated with the open source community, universities and major industry players in furthering its goal of providing the industry with an indispensable tool to offer improved services to its customers. Coredge.io has a global market presence with offices in the US and New Delhi, India.
ScienceLogic
Remote only
5 - 11 yrs
₹10L - ₹17L / yr
AWS CloudFormation
cloud automation
site reliability
cloudformation
Ansible
  • 5+ years of software development or site reliability engineering or equivalent experience
  • Skilled at problem solving, algorithms, and data structures
  • Building tools and scripting frameworks from scratch
  • Working with Cloud Automation tools like CloudFormation, Terraform, CDK, aws-cli
  • Scripting languages like Python, Groovy, PowerShell, Bash, Perl etc.
  • Configuration automation using Ansible or equivalent tools
  • Exposure to Windows, Linux administration skills
  • Project management tools like Jira, Trello
  • Prior experience in dealing with Datastore technologies like Postgres, MySQL, SQL, DynamoDB is desirable
  • Familiarity with basic networking, security and cloud engineering concepts
  • Team player who is eager to help others to succeed through mentoring and leading by example
  • Highly collaborative with effective written and verbal communication skills
Zycus
Posted by Varsha Gupta
Mumbai
6 - 12 yrs
₹15L - ₹25L / yr
Monitoring
Reliability engineering
AppDynamics
Dynatrace
HTTP

Requirements

Technical Skills

  • Ability to design and deliver all Operations/SRE services and processes, including managing L2 environment support
  • 5-12 years of overall environment support experience with 5+ years of experience as support / SRE engineer
  • Experience in implementing monitoring solutions using APM tools (for example AppDynamics, Graylog, Dynatrace, Datadog) and in setting up and testing proactive monitoring alerts
  • Have a broad knowledge profile and really excel in some areas, such as HTTP/TLS, DNS, networking or containerization
  • Comfortable with large scale production systems and technologies, for example load balancing, monitoring, distributed systems, microservices, and configuration management.

Process Skills

  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • Interest in designing, analyzing and troubleshooting large-scale distributed systems.

Behavioral Skills

  • Practice sustainable incident response and blameless postmortems.
  • Proven ability in developing relationships with stakeholders, communicating project/program status, and understanding detailed business requirements across multiple project initiatives
  • This role requires candidates to work in rotational shifts to provide 24x7 support

Benefits

LOCATION: Mumbai

COMPENSATION: Competitive

WHY ZYCUS? :

  • Be a part of one of the fastest-growing product companies in India
  • Come join a young, dynamic & enterprising team
  • Work on the latest technologies
  • Flexible working hours (As per business requirement).

Zycus Global Leader Procurement: https://www.zycus.com/newsroom/press-releases.html

Shuttl
at Shuttl
8 recruiters
Tanika Monga
Posted by Tanika Monga
NCR (Delhi | Gurgaon | Noida)
3 - 6 yrs
₹10L - ₹21L / yr
Terraform
Kubernetes
Ansible
WHAT WILL I DO?

You will work as a Site Reliability Engineer responsible for the availability, performance, monitoring, and incident response, among other things, of the platforms and services used and owned by Shuttl. The SRE Team works alongside the Engineering team and owns every aspect of service availability as well as disaster recovery and business continuity plans. You will work with other Site Reliability Engineers and report to the Lead of the Site Reliability Engineering Team.

HOW DO WE WORK?

Our engineering process is a five-step process with phases for planning, developing, testing & profiling, releasing and monitoring.

The planning phase consists of documenting the feature/task to be done, followed by various discussions. These discussions cover product, delivery estimates, release plan, monitoring plan, test plans, architecture, code design, technology choices and best practice adoption.

The development and testing phases coexist and involve writing code, unit tests, performance tests, profiling, stress testing, code reviews and QA testing. These phases are punctuated with daily scrums and standups.

The release phase is largely about managing and communicating the release to customers and internal stakeholders and activating features.

The last phase is the monitoring phase, where relevant metrics and exceptions are tracked and any critical refinement for the delivered feature is undertaken. This phase culminates with a retrospective.

SREs get involved in this process as early as possible to provide general guidance and recommendations, and to help design the application to comply with community standards such as CNCF and 12 Factor. SRE involvement and influence tends to increase during the mid to final stages of development, where the application is primed for beta evaluation and all the tooling and instrumentation is finalized.

WHAT SKILLS SHOULD I HAVE?

For this role we expect you to have 3+ years of experience working as a DevOps Engineer or SRE. You should have a good grasp of Unix-like systems, access control, networking nuances, process isolation by means of kernel-provided features, distributed applications and algorithms, job schedulers and secret management, among other things.

At Shuttl we are big proponents of immutable infrastructure. All our infrastructure is hosted with Amazon Web Services and we use HashiCorp's Terraform to manage the infrastructure as code. A good handle on AWS and Terraform is therefore a definite plus. Since SREs are expected to write a lot of code, you are also expected to be skillful in a programming language, preferably Python or Go.