- 2+ years of demonstrable experience leading site reliability and performance in large-scale, high-traffic environments
- 2+ years of hands-on experience as a DevOps engineer
- Strong leadership, communication and interpersonal skills geared to getting things done
- Developing themselves and the talent within their charge – fostering and creating opportunity for the team
- Strong understanding of SRE concepts and the DevOps culture. Set the direction and strategy for your team, and help shape the overall SRE program for the company
- Be able to lead complicated technical issues and communicating status updates/RCA with management and customers.
- Own site stability, performance, capacity planning, DevOps recruitment.
About Vernacular.ai
Similar jobs
Job description
- Driving the cloud Solutioning ( AWS, Azure , Hybrid Cloud) activities for large complex deals which involve multiple service lines and / or technology domains
- Drive the translation of complex business initiatives into innovative business- technology solutions and ensure consistency across traditional solution boundaries
- Work with Internal stakeholders, Customer stakeholders and Project Managers to understand inefficiencies in clients existing business processes and applications and recommend solutions
- Supports the Global Sales Lead in engaging with senior level customers in either first meetings, or early stages to help shape and design early propositions, assisting to build the pipeline
- Ensure that the solution translated from business objectives is fit for purpose and clearly demonstrates value for money. The solution executive should be able to be confidently explain this to CxO level customer
- Lead a bid team, combining on- shore and off- shore solution architects to design an affordable, innovative solution which meets a clients requirements and business needs. This solution should fit within the affordability target set together with the Global Sales Leads
- Define solution value proposition and transformational direction which build on the synergies and benefits across service offers
- Provide expertise on commercially structuring deals to differentiate from the competition
- Excellent understanding of the competitor landscape, providing insight into the sales plan on how to beat competition
- Work alongside Global Sales Leads, generating future pipeline
- Ensure that the proposed solution covers strategy, partners (such as AWS, Azure, Google, Hybrid Cloud), stakeholder management as well as the actual solution covering Business, Application and Infrastructure as well as commercial aspects (in terms of value for money and not commercial costing etc.)
- Consultative approach, strong business acumen and commercial awareness, with the ability to translate business issues into relevant technical solutions and competitive propositions
- Recent experience in working for a Tier 1/2 Technology Services Provider or major Cloud Services provider in a pre- sales solutioning role
- Proficient in the Pre- Sales Solutioning Process lead by 3rd Party Advisors
- Prior experience leading, costing and implementing large complex Infrastructure Technology Outsourcing (ITO) pursuits, preferably 50M TCV with a large technology transformation component i.e. workload migration to Public Cloud, data center consolidations, etc.
- Strong proficiency creating business willing solutions aligned with key market growth areas; Public/Hybrid Cloud, Cyber security
- Demonstrated ability to communicate (written verbal) effectively and to influence at CxO level
Cloud Skills
- Experience and/or Certification: AWS - Solution Architect, Microsoft - MCSA/MCSE would be advantageous
- AWS , AZURE , Google Cloud Hybrid Cloud , Cloud Infrastructure , Private Cloud
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adapt DevOps culture in their organization by focusing on long-term DevOps roadmap. We focus on identifying technical and cultural issues in the journey of successfully implementing the DevOps practices in the organization and work with respective teams to fix issues to increase overall productivity. We also do training sessions for the developers and make them realize the importance of DevOps. We provide these services - DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We do assessments of technology architecture, security, governance, compliance, and DevOps maturity model for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Our Mission
Our mission is to help customers achieve their business objectives by providing innovative, best-in-class consulting, IT solutions and services and to make it a joy for all stakeholders to work with us. We function as a full stakeholder in business, offering a consulting-led approach with an integrated portfolio of technology-led solutions that encompass the entire Enterprise value chain.
Our Customer-centric Engagement Model defines how we engage with you, offering specialized services and solutions that meet the distinct needs of your business.
Our Culture
Culture forms the core of our foundation and our effort towards creating an engaging workplace has resulted in Infra360 Solution Pvt Ltd.
Our Tech-Stack:
- Azure DevOps, Azure Kubernetes Service, Docker, Active Directory (Microsoft Entra)
- Azure IAM and managed identity, Virtual network, VM Scale Set, App Service, Cosmos
- Azure, MySQL Scripting (PowerShell, Python, Bash),
- Azure Security, Security Documentation, Security Compliance,
- AKS, Blob Storage, Azure functions, Virtual Machines, Azure SQL
- AWS - IAM, EC2, EKS, Lambda, ECS, Route53, Cloud formation, Cloud front, S3
- GCP - GKE, Compute Engine, App Engine, SCC
- Kubernetes, Linux, Docker & Microservices Architecture
- Terraform & Terragrunt
- Jenkins & Argocd
- Ansible, Vault, Vagrant, SaltStack
- CloudFront, Apache, Nginx, Varnish, Akamai
- Mysql, Aurora, Postgres, AWS RedShift, MongoDB
- ElasticSearch, Redis, Aerospike, Memcache, Solr
- ELK, Fluentd, Elastic APM & Prometheus Grafana Stack
- Java (Spring/Hibernate/JPA/REST), Nodejs, Ruby, Rails, Erlang, Python
What does this role hold for you…??
- Infrastructure as a code (IaC)
- CI/CD and configuration management
- Managing Azure Active Directory (Entra)
- Keeping the cost of the infrastructure to the minimum
- Doing RCA of production issues and providing resolution
- Setting up failover, DR, backups, logging, monitoring, and alerting
- Containerizing different applications on the Kubernetes platform
- Capacity planning of different environments infrastructure
- Ensuring zero outages of critical services
- Database administration of SQL and NoSQL databases
- Setting up the right set of security measures
Requirements
Apply if you have…
- A graduation/post-graduation degree in Computer Science and related fields
- 2-4 years of strong DevOps experience in Azure with the Linux environment.
- Strong interest in working in our tech stack
- Excellent communication skills
- Worked with minimal supervision and love to work as a self-starter
- Hands-on experience with at least one of the scripting languages - Bash, Python, Go etc
- Experience with version control systems like Git
- Understanding of Azure cloud computing services and cloud computing delivery models (IaaS, PaaS, and SaaS)
- Strong scripting or programming skills for automating tasks (PowerShell/Bash)
- Knowledge and experience with CI/CD tools: Azure DevOps, Jenkins, Gitlab etc.
- Knowledge and experience in IaC at least one (ARM Templates/ Terraform)
- Strong experience with managing the Production Systems day in and day out
- Experience in finding issues in different layers of architecture in a production environment and fixing them
- Experience in automation tools like Ansible/SaltStack and Jenkins
- Experience in Docker/Kubernetes platform and managing OpenStack (desirable)
- Experience with Hashicorp tools i.e. Vault, Vagrant, Terraform, Consul, VirtualBox etc. (desirable)
- Experience in Monitoring tools like Prometheus/Grafana/Elastic APM.
- Experience in logging tools Like ELK/Loki.
- Experience in using Microsoft Azure Cloud services
If you are passionate about infrastructure, and cloud technologies, and want to contribute to innovative projects, we encourage you to apply. Infra360 offers a dynamic work environment and opportunities for professional growth.
Interview Process
Application Screening=>Test/Assessment=>2 Rounds of Tech Interview=>CEO Round=>Final Discussion
Must have | Proficient exp of minimum 4 years into DevOps with at least one devops end to end project implementation. Strong expertise on DevOps concepts like Continuous Integration (CI), Continuous delivery (CD) and Infrastructure as Code, Cloud deployments. Minimum exp of 2.5-3 years of Configuration, development and deployment with their underlying technologies including Docker/Kubernetes and Prometheus. Should have implemented an end to end devops pipeline using Jenkins or any similar framework. Experience with Microservices architecture. Sould have sound knowledge in branching and merging strategies. Experience working with cloud computing technologies like Oracle Cloud *(preferred) /GCP/AWS/OpenStack Strong experience in AWS/Azure/GCP/open stack , deployment process, dockerization. Good experience in release management tools like JIRA or similar tools. |
Good to have | Knowledge of Infra automation tools Terraform/CHEF/ANSIBLE (Preferred) Experience in test automation tools like selenium/cucumber/postman Good communication skills to present devops solutions to the client and drive the implementation. Experience in creating and managing custom operational and monitoring scripts. Good knowledge in source control tools like Subversion, Git,bitbucket, clearcase. Experience in system architecture design |
YOUR ‘OKR’ SUMMARY
OKR means Objective and Key Results.
As a Cloud Engineer, you will understand the overall movement of data in the entire platform, find bottlenecks,define solutions, develop key pieces, write APIs and own deployment of those. You will work with internal and external development teams to discover these opportunities and to solve hard problems. You will also guide engineers in solving the complex problems, developing your acceptance tests for those and reviewing the work and
the test results.
What you will do
- End to End Complete RHOSP Deployment (Under and Overcloud) considered as a NFVi Deployment
- Installing Red Hat’s OpenStack technology using the OSP-director
- Deploying a Red Hat OpenStack Platform based on Red Hat's reference architecture
- Deploying managed hosts with required OpenStack parameters
- Deploying Three (3) Node Highly available (HA) Controller Hosts using Pacemaker and HAP Roxy
- Deploying all the supplied compute hosts that will be hosting multiple VNFs (SRIOV & DPDK)
- Implementing CEPH
- Integrate Software-defined storage (CEPH) with RHOSP, Red Hat OpenStack Platform Operation Tools as
per Industry standard & best practices
- Detailed Network configuration and implementation using Neutron networking - (VXLAN) network type
and Modular Layer 2 (ML2) Open Switch plugin
- Integrating Monitoring Solution with RHOSP
- Design and deployment of common Alarm and Performance Management solution
- Red Hat OpenStack management & monitoring
- VM Alarm and Performance management
- Cloud Management Platform will be configured for day-to-day operational tools to measure uponCPU/Mem/Network utilization etc. from VM level
- Baseline Security Standard) & VA (Vulnerability Assessment) for RHOSP.
Additional Advantage:
- Deep understanding of technology and passionate about what you do.
- Background in designing high performant scalable software systems with strong focus to optimize hardware cost.
- Solid collaborative and interpersonal skills, specifically a proven ability to effectively guide and influence within a dynamic environment.
- Strong commitment to get the most performance out of a system being worked on.
- Prior development of a large software project using service-oriented architecture operating with real time constraints.
What's In It for You
- You will get a chance to work on cloud-native and hyper-scale products
- You will be working with industry leaders in cloud.
- You can expect a steep learning curve.
- You will get the experience of solving real time problems, eventually you become a problem solver.
Benefits & Perks
- Competitive Salary
- Health Insurance
- Open Learning - 100% Reimbursement for online technical courses.
- Fast Growth - opportunities to grow quickly and surely
- Creative Freedom + Flat hierarchy
- Sponsorship to all those employees who represent company in events and meet ups.
- Flexible working hours
- 5 days week
- Hybrid Working model (Office and WFH)
Our Hiring Process
Candidates for this position can expect the hiring process as follows (subject to successful clearing of every round)
- Initial Resume screening call with our Recruiting team
- Next, candidates will be invited to solve coding exercises.
- Next, candidates will be invited for first technical interview
- Next, candidates will be invited for final technical interview
- Finally, candidates will be invited for Culture Plus interview with HR
- Candidates may be asked to interview with the Leadership team
- Successful candidates will subsequently be made an offer via email
As always, the interviews and screening call will be conducted via a mix of telephonic and video call.
So, if you are looking at an opportunity to really make a difference- make it with us…
Coredge.io provides equal employment opportunities to all employees and applicants for employment and prohibits
discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability
status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other
characteristic protected by applicable central, state or local laws.
- Provision Dev Test Prod Infrastructure as code using IaC (Infrastructure as Code)
- Good knowledge on Terraform
- In-depth knowledge of security and IAM / Role Based Access Controls in Azure, management of Azure Application/Network Security Groups, Azure Policy, and Azure Management Groups and Subscriptions.
- Experience with Azure and GCP compute, storage and networking (we can also look for GCP )
- Experience in working with ADLS Gen2, Databricks and Synapse Workspace
- Experience supporting cloud development pipelines using Git, CI/CD tooling, Terraform and other Infrastructure as Code tooling as appropriate
- Configuration Management (e.g. Jenkins, Ansible, Git, etc...)
- General automation including Azure CLI, or Python, PowerShell and Bash scripting
- Experience with Continuous Integration/Continuous Delivery models
- Knowledge of and experience in resolving configuration issues
- Understanding of software and infrastructure architecture
- Experience in Paas, Terraform and AKS
- Monitoring, alerting and logging tools, and build/release processes Understanding of computing technologies across Windows and Linux
Mandatory:
● A minimum of 1 year of development, system design or engineering experience ●
Excellent social, communication, and technical skills
● In-depth knowledge of Linux systems
● Development experience in at least two of the following languages: Php, Go, Python,
JavaScript, C/C++, Bash
● In depth knowledge of web servers (Apache, NgNix preferred)
● Strong in using DevOps tools - Ansible, Jenkins, Docker, ELK
● Knowledge to use APM tools, NewRelic is preferred
● Ability to learn quickly, master our existing systems and identify areas of improvement
● Self-starter that enjoys and takes pride in the engineering work of their team ● Tried
and Tested Real-world Cloud Computing experience - AWS/ GCP/ Azure ● Strong
Understanding of Resilient Systems design
● Experience in Network Design and Management
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for applications health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CICD
- Observability(Prometheus/New Relic/ELK)
- Cloud Platforms ( AWS/ Azure )
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services, infrastructure and working with stake owners to fix it.
● Managing Outages and doing detailed RCA with developers and identifying ways to avoid that situation.
● Managing/Automating upgrades of the infrastructure services.
● Automate toil work.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and good understanding of the various linux kernel subsystems (memory, storage, network etc).
● Extensive experience in DNS, TCP/IP, UDP, GRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as a Code tools such as Terraform etc.. and Configuration Management Tools such as Chef, Puppet, Saltstack, Ansible.
● Expertise of Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo etc.
● Experience in managing and deploying containerized environments using Docker,
Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike,
Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platforms tech stacks like Hadoop, Hive, Presto etc is a plus