As a Senior Production Support Lead, you will predominantly support high-visibility applications, including administrative platforms and reporting systems. You'll stretch your skills and grow your career as a leader. This is a unique role where your efforts will make an impact across different teams within the organization. The following points describe the main roles and responsibilities of a Senior Production Support Lead:

Roles & Responsibilities:
- Troubleshoot incidents as we encounter them. Identify the root causes of problem tickets and implement long-term, permanent fixes so that similar issues do not recur.
- Collaborate with Engineering and Sustaining teams so that we can design our systems for better resilience and maintainability. That way, we can pre-empt issues before we encounter them.
- Automate existing processes to achieve order-of-magnitude gains in efficiency and effectiveness.
- Build and manage a high-performance team of bright engineers. Lead by example through technical thought leadership and flawless execution.
- Improve the team's troubleshooting skills and capability, measured by the number of incidents the team can resolve independently.
- Act as the accountable technical owner for OPS readiness of new modules that need to be supported, covering monitoring, adequate technical onboarding training, and preparedness to handle incidents.
- Drive automation initiatives and improve efficiency by automating manual and routine tasks and SOPs. Decrease turnaround times, streamline work processes, and work cooperatively to provide quality, seamless customer service. Develop, enhance, and maintain tools to support this.
- Raise current monitoring by two notches using your own expertise, and ensure the operations team can detect all critical issues before the customer does.
- Manage on-call support for 24x7 coverage.
- Ensure systems stay running in a stable state and meet SLA requirements.
- Set and maintain alerts within application monitoring software so that performance anomalies are reported immediately.
- Review new issue items (problems, incidents) assigned to the development group and ensure that each ticket contains sufficient information to proceed with analysis and resolution.
- Assign tickets to team members or to another group when applicable.
- Coordinate business approval of production migrations.
- Facilitate the hand-off of new release functionality from the development team to the Production Support team.

Key Skills:
- 3 to 10 years of experience in an application production support environment, with the ability to solve complex problems.
- Adept on the Linux platform.
- Programming experience in PHP and Python.
- Exposure to networking, load balancers, message queues, AWS, and Kafka, with strong database fundamentals (MySQL, MS SQL, Cassandra).
- Deep knowledge of incident and problem management.

Qualifications
- Bachelor's or Master's degree in Engineering (preferably in Computer Science)
- 2+ years of experience
- Experience in a startup is a plus

What we value
- Passion for building solutions in a high-tech mobile start-up.
- Always looking for solutions that scale in performance. Ability to see elegant solutions.
- Very high ownership; a self-starter. Identifies and defines problems better than others.

Technical Skills
- Ability to devise solutions at internet scale with real-time performance
- Proficient in LAMP (Linux, Apache, MySQL, PHP) architecture
- Experience with Amazon Elastic Compute Cloud (EC2) is a plus
- Expertise in application development, including:
  - Complete understanding of MySQL and relational databases.
  - OO programming, patterns and aspects in PHP5.
  - Website MVC frameworks with AJAX/REST and JSON messaging.
  - Producer-consumer relationships in inter-process messaging.
  - Distributed multi-server architecture and asynchronous operations.
  - Consistent usage of, and a firm belief in, automated testing.
- Proven capability to build applications that scale to a large base of users.