Sr Hadoop Operations Engineer
at Multinational Company providing energy & Automation digital
Similar jobs
Job Description
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lakes
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices
- Scripting languages: Python and PySpark
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS.
- Ingest data from different sources that expose data using different technologies, such as RDBMS, flat files, streams, and time-series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies.
- Process and transform data using various technologies such as Spark and cloud services. You will need to understand your part of the business logic and implement it using the language supported by the base data platform.
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations.
- Develop an infrastructure to collect, transform, combine, and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights, and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete, and flexible.
- Identify and interpret trends and patterns in complex data sets.
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Be a key participant in regular Scrum ceremonies with the agile teams.
- Be proficient at developing queries, writing reports, and presenting findings.
- Mentor junior members and bring in industry best practices.
QUALIFICATIONS
- 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science, or a related discipline
- Advanced knowledge of one of the following languages: Java, Scala, Python, C#
- Production experience with HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
  - Data mining/programming tools (e.g. SAS, SQL, R, Python)
  - Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
  - Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines
- Good written and oral communication skills and the ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies, and techniques
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools
Responsibilities:
- Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
- Driving optimization, testing and tooling to improve quality.
- Reviewing and approving high-level and detailed designs to ensure that the solution meets the business needs and aligns with the data & analytics architecture principles and roadmap.
- Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
- Following proper SDLC (Code review, sprint process).
- Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
- Building robust and scalable data infrastructure (both batch processing and real-time) to support needs from internal and external users.
- Understanding various data security standards and using data security tools to apply and adhere to the required data controls for user access in the Hadoop platform.
- Supporting and contributing to development guidelines and standards for data ingestion.
- Working with the data science and business analytics teams to assist with data ingestion and data-related technical issues.
- Designing and documenting the development & deployment flow.
Requirements:
- Experience in developing REST API services using one of the Scala frameworks.
- Ability to troubleshoot and optimize complex queries on the Spark platform
- Expert in building and optimizing ‘big data’ data/ML pipelines, architectures and data sets.
- Knowledge of modelling unstructured data into structured data designs.
- Experience in Big Data access and storage techniques.
- Experience in cost estimation based on the design and development.
- Excellent debugging skills for the technical stack mentioned above, including analyzing server logs and application logs.
- Highly organized, self-motivated, proactive, and able to propose the best design solutions.
- Good time management and multitasking skills to meet deadlines, working both independently and as part of a team.
- KSQL
- Data Engineering spectrum (Java/Spark)
- Spark Scala / Kafka Streaming
- Confluent Kafka components
- Basic understanding of Hadoop
Experience: 3-12 years
Budget: Open
Location: Pan-India (Noida/Bengaluru/Hyderabad/Chennai)
Presto Developer (4)
Understanding of distributed SQL query engine running on Hadoop
Design and develop core components for Presto
Contribute to the ongoing Presto development by implementing new features, bug fixes, and other improvements
Develop new and extend existing Presto connectors to various data sources
Lead complex and technically challenging projects from concept to completion
Write tests and contribute to ongoing automation infrastructure development
Run and analyze software performance metrics
Collaborate with teams globally across multiple time zones and operate in an Agile development environment
Hands-on experience with and interest in Hadoop
Familiar with the MicroStrategy architecture, Admin Certification Preferred
· Familiar with administrative functions, using Object Manager, Command Manager, installation/configuration of MSTR in clustered architecture, applying patches, hot-fixes
· Monitor and manage existing Business Intelligence development/production systems
· MicroStrategy installation, upgrade and administration on Windows and Linux platform
· Ability to support and administer multi-tenant MicroStrategy infrastructure including server security troubleshooting and general system maintenance.
· Analyze application and system logs during troubleshooting and root cause analysis
· Perform operations such as deploying and managing packages, user management, schedule management, governing settings best practices, and database instance and security configuration.
· Monitor, report and investigate solutions to improve report performance.
· Continuously improve the platform through tuning, optimization, governance, automation, and troubleshooting.
· Provide support for the platform, report execution and implementation, user community and data investigations.
· Identify improvement areas in Environment hosting and upgrade processes.
· Identify automation opportunities and participate in automation implementations
· Provide on-call support for Business Intelligence issues
· Experience working on MSTR 2021, including knowledge of Enterprise Manager and new features such as Platform Analytics, Hyper Intelligence, Collaboration, MSTR Library, etc.
· Familiar with AWS, Linux Scripting
· Knowledge of MSTR Mobile
· Knowledge of capacity planning and system scaling needs
Python + Data Scientist:
• Build data-driven models to understand the characteristics of engineering systems
• Train, tune, validate, and monitor predictive models
• Sound knowledge of statistics
• Experience in developing data processing tasks using PySpark, such as reading, merging, enriching, and loading data from external systems to target data destinations
• Working knowledge of Big Data and/or Hadoop environments
• Experience creating CI/CD pipelines using Jenkins or similar tools
• Practiced in eXtreme Programming (XP) disciplines
Job Description
- Design, develop, and deploy highly available and fault-tolerant enterprise business software at scale.
- Demonstrate the technical expertise to go very deep or broad in solving classes of problems or creating broadly reusable solutions.
- Execute large-scale projects: provide technical leadership in architecting and building product solutions.
- Collaborate across teams to deliver results, from hardworking team members within your group to smart technologists across lines of business.
- Be a role model for acting with good judgment and responsibility, helping teams to commit and move forward.
- Be a humble mentor and trusted advisor for both our talented team members and passionate leaders alike. Deal with differences of opinion in a mature and fair way.
- Raise the bar by improving standard methodologies and producing best-in-class, efficient solutions, code, documentation, testing, and monitoring.
Qualifications
- 15+ years of relevant engineering experience.
- Proven record of building and productionizing highly reliable products at scale.
- Experience with Java and Python.
- Experience with Big Data technologies is a plus.
- Ability to assess new technologies and make pragmatic choices that help guide us towards a long-term vision.
- Ability to collaborate well with several other engineering orgs to articulate requirements and system design.
Additional Information
Professional Attributes:
• Team player!
• Great interpersonal skills, deep technical ability, and a portfolio of successful execution.
• Excellent written and verbal communication skills, including the ability to write detailed technical documents.
• Passionate about helping teams grow by inspiring and mentoring engineers.
Must Have Skills: