Senior Data Engineer
Responsibilities:
● Clean, prepare and optimize data at scale for ingestion and consumption by machine learning models
● Drive the implementation of new data management projects and the restructuring of the current data architecture
● Implement complex automated workflows and routines using workflow scheduling tools
● Build continuous integration, test-driven development and production deployment frameworks
● Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards
● Anticipate, identify and solve issues concerning data management to improve data quality
● Design and build reusable components, frameworks and libraries at scale to support machine learning products
● Design and implement product features in collaboration with business and technology stakeholders
● Analyze and profile data for the purpose of designing scalable solutions
● Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues
● Mentor and develop other data engineers in adopting best practices
● Influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
Qualifications:
● 8+ years of experience developing scalable Big Data applications or solutions on distributed platforms
● Experience with Google Cloud Platform (GCP); experience with other cloud platform tools is a plus
● Experience working with data warehousing tools, including DynamoDB, SQL, and Snowflake
● Experience architecting data products on streaming, serverless and microservices architectures and platforms
● Experience with Spark (Scala/Python/Java) and Kafka
● Work experience using Databricks (Data Engineering and Delta Lake components)
● Experience working with Big Data platforms, including Dataproc, Databricks, etc.
● Experience working with distributed technology tools including Spark, Presto, Databricks, Airflow
● Working knowledge of data warehousing and data modeling
● Experience working in Agile and Scrum development processes
● Bachelor's degree in Computer Science, Information Systems, Business, or other relevant subject area
Role: Senior Data Engineer
Total No. of Years: 8+ years of relevant experience
To be onboarded by: Immediate
Notice Period:
Skills | Mandatory / Desirable | Min years (Project Exp) | Max years (Project Exp)
GCP Exposure | Mandatory | 3 | 7
BigQuery, Dataflow, Dataproc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep, Spark and PySpark | Mandatory | 5 | 9
Relational SQL | Mandatory | 4 | 8
Shell scripting language | Mandatory | 4 | 8
Python/Scala language | Mandatory | 4 | 8
Airflow/Kubeflow workflow scheduling tool | Mandatory | 3 | 7
Kubernetes | Desirable | 1 | 6
Scala | Mandatory | 2 | 6
Databricks | Desirable | 1 | 6
Google Cloud Functions | Mandatory | 2 | 6
GitHub source control tool | Mandatory | 4 | 8
Machine Learning | Desirable | 1 | 6
Deep Learning | Desirable | 1 | 6
Data structures and algorithms | Mandatory | 4 | 8