
Lead DevOps Engineer
at Cloud infrastructure solutions and support company. (SE1)
Specific responsibilities are commensurate with experience and include:
- Ability to react quickly and effectively to identify and resolve issues that heavily impact the CI/CD system (immediate mitigation of impact, long-term resolution including strategies for risk mitigation, monitoring, and alerting for proactive resolution of potential future occurrences)
- Design, develop, unit test, and implement build automation scripts including environment configuration validation processes
- Automate and improve development process by evaluation and introduction of new tools and scripts, and manage their life cycle and validation
- Determine branching strategy and maintain branches for various components, products, and product lines
- Come up with solutions to open-ended problems that focus on workflow improvements for the Software department
- Address issues with well-defined requirements efficiently; come up with short-term and long-term solutions and staged deployment strategies
- Self-driven: takes action to move tickets from start to completion with minimal oversight
- Ability to communicate with and consider perspectives of stakeholders including but not limited to: IT, software development, verification
- Ability to break down a problem into smaller components and solve them in a logical, controlled, clearly explainable approach
- Lead the creation and maintenance of a pre-production environment as a testbed for build process improvements and changes before deployment to the production environment
- Gather metrics via direct input and analysis of developer working habits and pain points to assess the current state and areas requiring further improvement
- Define chain of communication and immediate paths of action in the case of a build fault state
- Ability to work within constraints of the internal network without access to commercial cloud solutions
- Create metrics that define ‘efficiency’ and ‘reliability’ in measurable terms, and track them
- Perform static code and security analysis
- Design and execute unit tests and perform code coverage analysis
- Able to work in Agile development team environment
Key Requirement & Qualifications:
- Bachelor’s degree (or higher) in Electrical Engineering, Computer Engineering, Computer Science or equivalent
- 6+ years (minimum) experience handling Build, Release, and Deployment of software on Windows and/or Linux environments (on-premise)
- Experience with the development and deployment of CM processes and tools
- Build automation for .NET using TeamCity (Jenkins is an asset)
- Scripting languages: Windows batch scripting, PowerShell, Ant/NAnt
- Source control systems usage, branching strategies, and workflow (Git preferred, Subversion)
- 6+ years of hands-on programming experience with C# and .NET (both Framework and Core)
- Troubleshooting and debugging: what information to gather when there are issues with the CI/CD system, and how to gather it (e.g., analyzing network communication, Windows crash dumps, Java logs)
- 6+ years (minimum) in web/desktop application software development experience
- Excellent problem solving, critical and analytical thinking
- Strong team player who understands SDLC and QA methodologies
- A professional, results-oriented individual with a high degree of self-motivation
- Excellent written and verbal communication skills and the ability to coordinate work/activities with multiple software/IT teams
- Working with virtual machines and build management on virtual machines (VMware preferred).
- Managing configurations for multiple build environments
- OS administration and scripting experience (Windows is a must, Linux desired)
- Experience with test automation tools (NUnit, custom in-house frameworks) and strategies is an asset
- Creation and maintenance of monitoring and alert systems (Zabbix)
- Familiarity with databases (SQL-based) - create, modify, optimize (via script)
- Data and metrics gathering, aggregation, and reporting
- Experience with work management and documentation tools: JIRA and Confluence

Similar jobs
DevOps Lead Engineer
We are seeking a skilled DevOps Lead Engineer with 8 to 10 years of experience who handles the entire DevOps lifecycle and is accountable for the implementation of the process. A DevOps Lead Engineer is responsible for automating all the manual tasks involved in developing and deploying code and data, in order to implement continuous integration and continuous deployment frameworks. They are also responsible for maintaining high availability of production and non-production work environments.
Essential Requirements (must have):
• Bachelor's degree, preferably in Engineering.
• 5+ years of solid experience with AWS, DevOps, and related technologies
Skills Required:
Cloud Performance Engineering
• Performance scaling in a Micro-Services environment
• Horizontal scaling architecture
• Containerization (such as Docker) & Deployment
• Container Orchestration (such as Kubernetes) & Scaling
DevOps Automation
• End to end release automation.
• Solid experience with DevOps tools like Git, Jenkins, Docker, Kubernetes, Terraform, Ansible, CFN, etc.
• Solid experience in Infra Automation (Infrastructure as Code), Deployment, and Implementation.
• Candidates must have experience using Linux and Jenkins, and ample experience configuring and automating monitoring tools.
• Strong scripting knowledge
• Strong analytical and problem-solving skills.
• Cloud and On-prem deployments
Infrastructure Design & Provisioning
• Infra provisioning.
• Infrastructure Sizing
• Infra Cost Optimization
• Infra security
• Infra monitoring & site reliability.
Job Responsibilities:
• Responsible for creating software deployment strategies that are essential for the successful deployment of software in the work environment and provide stable environment for delivery of quality.
• The DevOps Lead Engineer is accountable for designing, building, configuring, and optimizing automation systems that help to execute business web and data infrastructure platforms.
• The DevOps Lead Engineer is involved in creating technology infrastructure, automation tools, and maintaining configuration management.
• The Lead DevOps Engineer oversees and leads the activities of the DevOps team. They are accountable for conducting training sessions for the juniors in the team, mentoring, career support. They are also answerable for the architecture and technical leadership of the complete DevOps infrastructure.
JOB DETAILS
What You'll Do
This company is a network of the world's best developers, offering full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their cloud offering and capitalize on the cloud. We have our own IoT/AI platform, and we provide professional services on that platform to build custom clouds for our clients' IoT devices. We also build mobile apps and run 24x7 DevOps/site reliability engineering for our clients.
We are looking for very hands-on SRE (Site Reliability Engineering) engineers with 3 to 6 years of experience. The person will be part of a team that is responsible for designing and implementing automation from scratch for medium to large scale cloud infrastructure and providing 24x7 services to our North American and European customers. This also includes ensuring ~100% uptime for 50+ internal sites. The person is expected to deliver with both high speed and high quality and to work 40 hours per week (~6.5 hours per day, 6 days per week) in shifts that rotate every month.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ years of hands-on experience troubleshooting and setting up Linux environments, with the ability to write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ Years of hands-on experience setting up CICD from SCRATCH in Jenkins & Gitlab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications (AWS, GCP, CKA, etc.) will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Minimum 3 years of experience as an SRE automation engineer building, running, and maintaining production sites. We are not looking for candidates who have experience only as L1/L2 or Build & Deploy.
Location:
- Remotely, anywhere in India
Timings:
- 40 hours per week (~6.5 hours per day, 6 days per week), in shifts that rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We don't believe in locking people in with long notice periods. You will stay here because you love the company. We have only a 15-day notice period.
Numerator is a data and technology company reinventing market research. Headquartered in Chicago, IL, Numerator has 1,600 employees worldwide. The company blends proprietary data with advanced technology to create unique insights for the market research industry that has been slow to change. The majority of Fortune 100 companies are Numerator clients.
Job Description
What We Do and How?
We are a market research company, revolutionizing how it's done! We mix fast paced development and unique approaches to bring best practices and strategy to our technology. Our tech stack is deep, leveraging several languages and frameworks including Python, C#, Java, Kotlin, React, Angular, and Django among others. Our engineering hurdles sit at the intersection of technologies ranging from mobile, computer vision and crowdsourcing, to machine learning and big data analytics.
Our Team
From San Francisco to Chicago to Ottawa, our R&D team is comprised of talented individuals spanning across a robust tech stack. The R&D team is comprised of product, data analytics, engineers across Front End, Back End, DevOps, Business Intelligence, ETL, Data Science, Mobile Apps, and much more. Across these different groups we work towards one common goal: To build products into efficient and seamless user experiences that help our clients succeed.
Numerator is looking for an Infrastructure Engineer to join our growing team. This is a unique opportunity where you will get a chance to work with established and rapidly evolving platforms that handle millions of requests and massive amounts of data. In this position, you will be responsible for taking on new initiatives to automate, enhance, maintain, and scale services in a rapidly scaling SaaS environment.
As a member of our team, you will make an immediate impact as you help build out and expand our technology platforms across several software products. This is a fast-paced role with high growth, visibility, impact, and where many of the decisions for new projects will be driven by you and your team from inception through production.
Some of the technologies we frequently use include: Terraform, Ansible, SumoLogic, Kubernetes, and many AWS-native services.
• Develop and test the cloud infrastructure to scale a rapidly growing ecosystem.
• Monitor and improve DevOps tools and processes, automate mundane tasks, and improve system reliability.
• Provide deep expertise to help steer scalability and stability improvements early in the life-cycle of development while working with the rest of the team to automate existing processes that deploy, test, and lead our production environments.
• Train teams to improve self-healing and self-service cloud-based ecosystems in an evolving AWS infrastructure.
• Build internal tools to demonstrate performance and operational efficiency.
• Develop comprehensive monitoring solutions to provide full visibility to the different platform components using tools and services like Kubernetes, Sumologic, Prometheus, Grafana.
• Identify and troubleshoot any availability and performance issues at multiple layers of deployment, from hardware, operating environment, network, and application.
• Work cross-functionally with various teams to improve Numerator’s infrastructure through automation.
• Work with other teams to assist with issue resolutions related to application configuration, deployment, or debugging.
• Lead by example and evangelize DevOps best practice within other engineering teams at Numerator.
Skills & Requirements
What you bring
• A minimum of 3 years of work experience in backend software, DevOps, or a related field.
• A passion for software engineering, automation, and operations, and excitement about reliability, availability, and performance.
• Availability to participate in after-hours on-call support with your fellow engineers.
• Strong analytical and problem-solving mindset combined with experience troubleshooting large scale systems.
• Fundamental knowledge of networking, operating systems, and package build systems (IP subnets and routing, ACLs, core Ubuntu, pip and npm).
• Experience with automation technologies to build, deploy and integrate both infrastructure and applications (e.g., Terraform, Ansible).
• Experience using scripting languages like Python and *nix tools (Bash, sed/awk, Make).
• You enjoy developing and managing real-time distributed platforms and services that scale to billions of requests.
• Have the ability to manage multiple systems across stratified environments.
• A deep enthusiasm for the Cloud and DevOps and keen to get other people involved.
• Experience with scaling and operationalizing distributed data stores, file systems and services.
• Running services in AWS or other cloud platforms, strong experience with Linux systems.
• Experience in modern software paradigms including cloud applications and serverless architectures.
• You look ahead to identify opportunities and foster a culture of innovation.
• BS, MS or Ph.D. in Computer Science or a related field, or equivalent work experience.
Nice to haves
• Previous experience working with a geographically distributed software engineering team.
• Experience working with Jenkins or CircleCI
• Experience with storage optimizations and management
• Solid understanding of building scalable, highly performant systems and services
• Expertise with big data, analytics, machine learning, and personalization.
• Start-up or CPG industry experience
If this sounds like something you would like to be part of, we’d love for you to apply! Don't worry if you think that you don't meet all the qualifications here. The tools, technology, and methodologies we use are constantly changing and we value talent and interest over specific experience.
Disclaimer: We do not charge any fee for employment and the same applies to the Recruitment Partners who we work with. Numerator is an equal opportunity employer. Employment decisions are based on merit. Additionally, we do not ask for any refundable security deposit to be paid in bank accounts for employment purposes. We request candidates to be cautious of misleading communications and not pay any fee/ deposit to individuals/ agencies/ employment portals on the pretext of attending Numerator interview process or seeking employment with us. These would be fraudulent in nature. Anyone dealing with such individuals/agencies/
We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law.
As an Infrastructure Engineer at Navi, you will be building a resilient infrastructure platform, using modern Infrastructure engineering practices.
You will be responsible for the availability, scaling, security, performance, and monitoring of the Navi Cloud platform. You’ll be joining a team that follows best practices in infrastructure as code.
Your Key Responsibilities
- Build out infrastructure components such as API Gateway, Service Mesh, Service Discovery, and a container orchestration platform like Kubernetes.
- Develop reusable infrastructure code and testing frameworks
- Build meaningful abstractions to hide the complexities of provisioning modern infrastructure components
- Design a scalable Centralized Logging and Metrics platform
- Drive solutions to reduce Mean Time To Recovery (MTTR) and enable High Availability.
What to Bring
- Good to have: experience in managing large-scale cloud infrastructure, preferably AWS and Kubernetes
- Experience in developing applications using programming languages like Java, Python and Go
- Experience in handling logs and metrics at a high scale.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
REVOS is a smart micro-mobility platform that works with enterprises across the automotive shared mobility value chain to enable and accelerate their smart vehicle journeys. Founded in 2017, it aims to empower all 2- and 3-wheeler vehicles through AI-integrated IoT solutions that will make them smart, safe, and connected. We are backed by investors like USV and Prime Venture.
Duties and Responsibilities:
- Automating various tasks in cloud operations, deployment, monitoring, and performance optimization for big data stack.
- Build, release, and configuration management of production systems.
- System troubleshooting and problem-solving across platform and application domains.
- Suggesting architecture improvements, recommending process improvements.
- Evaluate new technology options and vendor products.
- Function well in a fast-paced, rapidly-changing environment
- Communicate effectively with people at all levels of the organization
Qualifications and Required Skills:
- Overall 3+ years of experience in various software engineering roles.
- 3+ years of experience in building applications and tools in any tech stack, preferably deployed on cloud
- The most recent 3 years of experience must be in serverless/cloud-native development on AWS (preferred) or Azure
- Expertise in at least one programming language (NodeJS or Python preferable)
- Must have hands-on experience in using AWS/Azure - SDK/APIs.
- Must have experience in deploying, releasing, and managing production systems
- MCA or a degree in engineering in Computer Science, IT, or Electronics stream
• At least 4 years of hands-on experience with cloud infrastructure on GCP
• Hands-on experience with Kubernetes is mandatory
• Exposure to configuration management and orchestration tools at scale (e.g. Terraform, Ansible, Packer)
• Knowledge of and hands-on experience with DevOps tools (e.g. Jenkins, Groovy, and Gradle)
• Knowledge of and hands-on experience with various platforms (e.g. GitLab, CircleCI, and Spinnaker)
• Familiarity with monitoring and alerting tools (e.g. CloudWatch, ELK stack, Prometheus)
• Proven ability to work independently or as an integral member of a team
Preferable Skills:
• Familiarity with standard IT security practices such as encryption, credentials and key management.
• Proven experience with various coding languages (Java, Python) to support DevOps operation and cloud transformation
• Familiarity and knowledge of the web standards (e.g. REST APIs, web security mechanisms)
• Hands on experience with GCP
• Experience in performance tuning, services outage management and troubleshooting.
Attributes:
• Good verbal and written communication skills
• Exceptional leadership, time management, and organizational skills
• Ability to operate independently and make decisions with little direct supervision
Job Description:
○ Develop best practices for the team and take responsibility for architecture solutions and documentation operations in order to meet the engineering department's quality and standards
○ Participate in production outages, handle complex issues, and work towards resolution
○ Develop custom tools and integrations with existing tools to increase engineering productivity
Required Experience and Expertise
○ Good knowledge of Terraform, including experience working on large Terraform code bases.
○ Deep understanding of Terraform best practices and of writing Terraform modules.
○ Hands-on experience with GCP and AWS, and knowledge of AWS services such as VPC and VPC-related services (route tables, VPC endpoints, PrivateLink), EKS, S3, and IAM. Cost-aware mindset towards cloud services.
○ Deep understanding of Kernel, Networking and OS fundamentals
NOTICE PERIOD - Max - 30 days

