What You'll Do:
- Lead the entire DevOps practice of mavQ and lead the team in managing the infrastructure and CI/CD ecosystem.
- Create suitable DevOps channels across the organization.
- Conceptualize and manage tools and services to be used by the organization and by external users of the platform.
- Automate all operational and repetitive tasks to improve the efficiency and productivity of all development teams.
- Research and propose new solutions to improve the mavQ platform in aspects of speed, scalability, and security.
- Automate and manage the cloud infrastructure of the organization distributed across the globe and across multiple cloud providers such as Google Cloud and AWS.
- Manage CI/CD, Source Control and IAM services for the organization.
- Ensure thorough logging, monitoring, and alerting for all services and code running in the organization.
- Work with development teams on communications and protocols for distributed microservices.
- Oversee the SRE practices and define KPIs and SLAs for various services provided by the DevOps team.
- Must possess ample knowledge and experience in system automation, deployment, and implementation.
- Must have thorough, hands-on knowledge of Linux, Kubernetes, Docker, and Jenkins.
What You’ll Bring:
- Experience maintaining and deploying highly-available, fault-tolerant systems at scale
- Practical experience with containerization and clustering (Kubernetes / ECS / Docker / OpenShift)
- Version control system experience (e.g. Git, SVN)
- Experience implementing CI/CD (e.g. Jenkins, TravisCI)
- Experience with configuration management tools (e.g. Ansible, Chef)
- Experience with infrastructure-as-code (e.g. Terraform, CloudFormation)
- Expertise with AWS (e.g. IAM, EC2, VPC, ELB, ALB, Auto Scaling, Lambda)
- Experience with container registry solutions (e.g. Harbor, JFrog, Quay)
- Operational (e.g. HA/Backups) NoSQL experience (e.g. Cassandra, MongoDB, Redis)
- Good understanding of Kubernetes networking and security best practices
- Monitoring Tools like DataDog, or any other open-source tool like Prometheus, Nagios
- Load Balancer Knowledge (AVI Networks, NGINX)
Who You Are:
- 8 to 12 years of experience in application development using Java, Spring, and front-end frameworks such as Angular, with experience deploying to and managing applications on cloud infrastructure, mainly AWS / GCP.
- Experience managing cloud platforms using Terraform, Ansible, and Packer.
- Ability to provide expert strategic advice on cloud application development and deployment, private versus public cloud options, and virtualization.
- Strong understanding of DevOps practices and tools, including CI/CD pipelines, IaC, SSO, monitoring, and orchestration.
- Experience in managing the delivery of a high-availability infrastructure to support mission-critical systems and services.
- Experience architecting, designing, and deploying systems with high scalability, high availability, and cloud security.
- Review and/or analyze and develop architectural requirements at domain level within product portfolio, team, or partnership engagement.
- Experience in handling Complex IT Infrastructure Solution Design & Implementation.
- Identify, document, triage and track issues to ensure resolution.
- Excellent communication & interpersonal skills, effective problem-solving skills and logical thinking ability and a strong commitment to professional and customer service excellence.
- Excellent teamwork skills & the ability to direct efforts of cross-functional teams for collaborative propositions.
Good to have:
- Experience in the software development process and tools and languages like Shell scripting, GoLang, Python, and Git.
- Knowledge of handling distributed services like Elasticsearch, Kafka, MongoDB, etc.
- Experience working in a consulting environment.
- Synthesis and analytical skills.
What we offer:
- Group Medical Insurance (Family Floater Plan - Self + Spouse + 2 Dependent Children)
  - Sum Insured: INR 5,00,000/-
  - Maternity cover for up to two children
  - Inclusive of COVID-19 coverage
  - Cashless and reimbursement facility
  - Access to free online doctor consultation
- Personal Accident Policy (Disability Insurance)
  - Sum Insured: INR 25,00,000/- per employee
  - Accidental Death and Permanent Total Disability are covered up to 100% of Sum Insured
  - Permanent Partial Disability is covered as per the scale of benefits decided by the Insurer
  - Temporary Total Disability is covered
- An option of Paytm Food Wallet (up to Rs. 2500) as a tax saver benefit
- Monthly Internet Reimbursement of up to Rs. 1,000
- Opportunity to pursue Executive Programs / courses at top universities globally
- Professional Development opportunities through various sponsored certifications on multiple technology stacks including Google Cloud, Amazon & others.
● Good understanding of how the web works
● Experience with at least one language such as Java or Python
● Good with Shell scripting
● Experience with *Nix based operating systems
● Experience with k8s, containers
● Fairly good understanding of AWS/GCP/Azure
● Troubleshoot and fix outages and performance issues in infrastructure stack
● Identify gaps and design automation tools for all feasible functions in infrastructure
● Good verbal and written communication skills
● Drive SLA/SLO of team
This is an opportunity to work on a fairly complex set of systems and improve
them. You will get a chance to learn things like “how to think about code
simplicity”, “how to write for maintainability” and several other things.
● Comprehensive health insurance policy.
● Flexible working hours and a very friendly work environment.
● Flexibility to work either in the office (post Covid) or remotely.
- Automate, manage and maintain Shopalyst's global AWS infrastructure
- Monitor infrastructure health, utilization, and performance: implement and maintain automated alerting for infrastructure outages, document and maintain infrastructure recovery procedures, ensure successful backup of critical data, continuously monitor utilization and alert on when to scale instances up or down, and monitor, analyze, and optimize infrastructure cost by service
- Monitor application health and implement/maintain alerting mechanisms to ensure minimum downtime
- Evaluate new cloud services, analyze the feasibility of adoption
- Ensure infrastructure security. Implement and maintain security/process certifications (PCI DSS, ISO 27002, SOC 2, etc.)
Required Skills and Experience:
- AWS: 2+ years of experience using a broad range of AWS technologies (e.g. EC2, S3, ELB, VPC, Route 53, IAM, CloudWatch, Lambda, Glacier) to develop and maintain an AWS-based cloud solution
- DevOps: Solid experience as a DevOps Engineer in a 24x7 uptime AWS environment, including automation experience with configuration management tools.
- Scripting Skills: Strong scripting (e.g. Python) and automation skills.
- Operating Systems: Linux system administration and strong shell scripting skills
- Web Servers: Experience with web servers (e.g. Nginx).
- Problem Solving: Ability to analyze and resolve complex infrastructure resource and application deployment issues.
- Version Control: Experience administering version control systems such as Git
- DB Skills: Basic DB administration experience - Cassandra
- Search Engines: Experience with search engines such as Apache Solr/Elasticsearch
What are you going to do?
- Deploying, automating, maintaining, and managing AWS and GCP cloud-based production systems to ensure their availability, performance, scalability, and security.
- Build, release and configuration management of testing and production systems.
- System troubleshooting and problem solving across platform and application domains.
- Suggesting architecture improvements, recommending process improvements.
- Evaluate new technology options and products
- Ensuring critical system security through the use of best-in-class cloud security solutions.
- Keeping up to date with developer tools, DevOps, cloud computing, continuous integration, continuous deployment, blue-green deployment, continuous monitoring, infrastructure automation, continuous delivery, continuous build, and continuous testing.
You need to have:
- 1+ years of experience using a broad range of cloud technologies, including AWS EC2, RDS, ELB, EBS, S3, VPC, Glacier, IAM, CloudWatch, Docker, Lambda, etc. (or equivalents in GCP and/or Azure), to develop and maintain cloud solutions, with a focus on practicing cloud security.
- Solid experience as a DevOps Engineer in a 24x7 uptime cloud environment, including automation experience with configuration management tools.
- Scripting Skills: Strong scripting and automation skills.
- Operating Systems: Windows and Linux system administration.
- Problem Solving: Ability to analyze and resolve complex infrastructure resource and application deployment issues.
- Hands-on experience provisioning AWS services like EC2, S3, EBS, AMI, VPC, ELB, RDS, Auto Scaling groups, and CloudFormation.
- Good experience with the build and release process, with extensive involvement in CI/CD using
- Experience with configuration management tools like Ansible.
- Designing, implementing and supporting fully automated Jenkins CI/CD
- Extensively worked on Jenkins for continuous Integration and for end to end Automation for all Builds and Deployments.
- Proficient with Docker-based container deployments to create shelf environments for dev teams and containerization of environment delivery for releases.
- Experience working on Docker hub, creating Docker images and handling multiple images primarily for middleware installations and domain configuration.
- Good knowledge of version control with Git and GitHub.
- Good experience with build tools
- Implemented CI/CD pipelines using Jenkins, Ansible, Docker, Kubernetes, YAML, and manifests
- 7-10 years of experience with secure SDLC/DevSecOps practices, such as automating security processes within CI/CD pipelines.
- At least 4 years of experience designing and securing data lake and web applications deployed to AWS and Azure, with scripting/automation skills in Python, Shell, YAML, and JSON
- At least 4 years of hands-on experience with software development lifecycle, Agile project management (e.g. Jira, Confluence), source code management (e.g. Git), build automation (e.g. Jenkins), code linting and code quality (e.g. SonarQube), test automation (e.g. Selenium)
- Hands-on experience with and solid understanding of Amazon Web Services and Azure-based infrastructure and applications
- Experience writing CloudFormation templates and working with Jenkins, Kubernetes, Docker, and microservice application architecture and deployment.
- Strong know-how on VA/PT integration in CI/CD pipeline.
- Experience in handling financial solutions & customer-facing applications
- Accelerate enterprise cloud adoption while enabling rapid and stable delivery of capabilities using continuous integration and continuous deployment principles, methodologies, and technologies
- Manage & deliver diverse cloud [AWS, Azure, GCP] DevSecOps journeys
- Identify, prototype, engineer, and deploy emerging software engineering methodologies and tools
- Maximize automation and enhance DevSecOps pipelines and other tasks
- Define and promote enterprise software engineering and DevSecOps standards, practices, and behaviors
- Operate and support a suite of enterprise DevSecOps services
- Implement security automation to shorten the loop between the development and deployment processes.
- Support project teams to adopt & integrate the DevSecOps environment
- Managing application vulnerabilities, Data security, encryption, tokenization, access management, Secure SDLC, SAST/DAST
- Coordinate with development and operations teams for practical automation solutions and custom flows.
- Own DevSecOps initiatives by providing objective, practical and relevant ideas, insights, and advice.
- Act as release gatekeeper, with an understanding of concepts such as the OWASP Top 10 vulnerabilities, NIST SP 800-xx, NVD, and CVSS scoring
- Build workflows to ensure a successful DevSecOps journey for various enterprise applications.
- Understand the strategic direction to reach business goals across multiple projects & teams
- Collaborate with development teams to understand project deliverables and promote DevSecOps culture
- Formulate & deploy cloud automation strategies and tools
- Knowledge of the DevSecOps culture and principles.
- An understanding of cloud technologies & components
- A flair for programming languages such as Shell, Python, and JavaScript
- Strong teamwork and communication skills.
- Knowledge of threat modeling and risk assessment techniques.
- Up-to-date knowledge of cybersecurity threats, current best practices, and the latest software.
- An understanding of programs such as Puppet, Chef, ThreatModeler, Checkmarx, Immunio, and Aqua.
- Strong know-how of Kubernetes, Docker, AWS, Azure-based deployments
- On the job learning for new programming languages, automation tools, deployment architectures
This person MUST have:
- B.E Computer Science or equivalent
- 2+ years of hands-on experience troubleshooting and setting up Linux environments, with the ability to write shell scripts for any given requirement.
- 1+ years of hands-on experience setting up and configuring AWS or GCP services from scratch and maintaining them.
- 1+ years of hands-on experience setting up and configuring Kubernetes and EKS and ensuring high availability of container orchestration.
- 1+ years of hands-on experience setting up CI/CD from scratch in Jenkins and GitLab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications (AWS, GCP, CKA, etc.) will be preferred.
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
- Min 3 years of experience as SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2.
- Remotely, anywhere in India
- The person is expected to deliver with both high speed and high quality, and to work 40 hours per week (~6.5 hours per day, 6 days per week) in shifts that rotate every month.
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses, and other incentives.
- We don't believe in locking people in with long notice periods. You will stay here because you love the company. We have only a 15-day notice period.
● Building and managing multiple application environments on AWS using automation tools like Terraform or
● Deploy applications with zero downtime via automation with configuration management tools such as Ansible.
● Setting up Infrastructure monitoring tools such as Prometheus, Grafana
● Setting up centralised logging using tools such as ELK.
● Containerisation of applications/microservices.
● Ensure application availability to 99.9% with highly available infrastructure.
● Monitoring performance of applications and databases.
● Ensuring that systems are safe and secure against cyber security threats.
● Working with software developers to ensure that release cycle and deployment processes are followed.
● Evaluating existing applications and platforms, giving recommendations for enhancing performance via gap analysis,
identifying the most practical alternative solutions, and assisting with modifications.
● Strong knowledge of AWS Managed Services such as EC2, RDS, ECS, ECR, S3, CloudFront, SES, Redshift, ElastiCache,
● Experience in handling production workloads.
● Experience with Nginx web server.
● Experience with NoSQL and SQL databases such as MongoDB, PostgreSQL, etc.
● Experience with Containerisation of applications/micro services using Docker.
● Understanding of system administration in Linux environments.
● Strong knowledge of Infrastructure as Code tools such as Terraform, CloudFormation, etc.
● Strong knowledge of configuration management tools such as Ansible, Chef, etc.
● Familiarity with tools such as GitLab, Jenkins, Vercel, Jira, etc.
● Proficiency in scripting languages including Bash, Python, etc.
● Full understanding of software development lifecycle best practices and agile methodology
● Strong communication and documentation skills.
● An ability to drive to goals and milestones while valuing and maintaining a strong attention to detail
● Excellent judgment, analytical thinking, and problem-solving skills
● Self-motivated individual who possesses excellent time management and organizational skills
You will meticulously analyze project requirements and carry forward the development of highly robust, scalable, and easily maintainable backend applications. You will work independently, with the support and opportunity to thrive in a fast-paced environment.
Responsibilities and Duties:
- building and setting up new development tools and infrastructure
- understanding the needs of stakeholders and conveying this to developers
- working on ways to automate and improve development and release processes
- testing and examining code written by others and analysing results
- ensuring that systems are safe and secure against cybersecurity threats
- identifying technical problems and developing software updates and ‘fixes’
- working with software developers and software engineers to ensure that development follows established processes and works as intended
- planning out projects and being involved in project management decisions
- Managing GitHub (for example, creating branches for test, QA, development, and production, creating release tags, and resolving merge conflicts)
- Setting up of the servers based on the projects in either AWS or Azure (test, development, QA, staging and production)
- Configuring AWS S3 and S3 web hosting, and archiving data from S3 to S3 Glacier
- Deploying the build(application) to the servers using AWS CI/CD and Jenkins (Automated and manual)
- AWS Networking and Content delivery (VPC, Route 53 and CloudFront)
- Managing databases like RDS, Snowflake, Athena, Redis and Elasticsearch
- Managing IAM roles and policies for services such as Lambda, SNS, Amazon Cognito, Secrets Manager, Certificate Manager, GuardDuty, Inspector, EC2, and S3.
- AWS Analytics (Elasticsearch, Athena, Glue, and Kinesis).
- AWS containers (Elastic Container Registry, Elastic Container Service, Elastic Kubernetes Service, Docker Hub, and Docker Compose)
- AWS Auto Scaling groups (launch configurations, launch templates) and load balancers
- EBS (snapshots, volumes, and AMIs)
- AWS CI/CD buildspec scripting, Jenkins Groovy scripting, shell scripting, and Python scripting.
- SageMaker, Textract, Forecast, Lightsail
- Android and iOS build automation
- Monitoring tools such as CloudWatch (log groups, alarms, metric dashboards), SNS (Simple Notification Service), and SES (Simple Email Service)
- Amazon MQ
- Operating systems: Linux and Windows
- X-Ray, Cloud9, CodeStar
- Fluent Shell Scripting
- Soft Skills
- Knowledge of various DevOps tools and technologies
Qualifications and Skills
Job Type: Full-time
Experience: 4 - 7 yrs
Qualification: BE/ BTech/MCA.
Location: Bengaluru, Karnataka
As an Infrastructure Engineer at Navi, you will be building a resilient infrastructure platform, using modern Infrastructure engineering practices.
You will be responsible for the availability, scaling, security, performance, and monitoring of the Navi cloud platform. You’ll be joining a team that follows best practices in infrastructure as code.
Your Key Responsibilities
- Build out infrastructure components such as an API gateway, service mesh, service discovery, and a container orchestration platform like Kubernetes.
- Develop reusable infrastructure code and testing frameworks
- Build meaningful abstractions to hide the complexities of provisioning modern infrastructure components
- Design a scalable Centralized Logging and Metrics platform
- Drive solutions to reduce Mean Time To Recovery (MTTR) and enable high availability.
What to Bring
- Good to have: experience in managing large-scale cloud infrastructure, preferably AWS and Kubernetes
- Experience in developing applications using programming languages like Java, Python and Go
- Experience in handling logs and metrics at a high scale.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
2. Has done infrastructure coding using CloudFormation/Terraform and configuration, and understands it clearly
3. Deep understanding of microservice design and awareness of centralized caching (Redis) and centralized configuration (Consul/ZooKeeper)
4. Hands-on experience working with containers and their orchestration using Kubernetes
5. Hands-on experience with Linux and Windows operating systems
6. Worked on NoSQL databases like Cassandra, Aerospike, Mongo, or Couchbase, and on central logging, monitoring, and caching using stacks like ELK (Elastic) on the cloud, Prometheus, etc.
7. Has good knowledge of network security, security architecture, and secure SDLC practices