4+ ADF Jobs in Mumbai | ADF Job openings in Mumbai
* Python (3 to 6 years): Strong expertise in data workflows and automation
* Spark (PySpark): Hands-on experience with large-scale data processing
* Pandas: For detailed data analysis and validation
* Delta Lake: Managing structured and semi-structured datasets at scale
* SQL: Querying and performing operations on Delta tables (see the sketch after this list)
* Azure Cloud: Compute and storage services
* Orchestrator: Good experience with either ADF or Airflow
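As a rough illustration of how this stack typically fits together (a hedged sketch, not a prescribed implementation): PySpark reads a Delta table, SQL runs the heavy aggregation, and pandas is used for detailed validation of a small result. The storage path, table, and column names below are placeholder assumptions, and a Spark session with Delta Lake support is presumed.

```python
# Hedged sketch only: a Spark session with Delta Lake support is assumed, and the
# storage path, table, and column names (orders, order_date, amount) are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-validation").getOrCreate()

orders_path = "abfss://lake@examplestorage.dfs.core.windows.net/delta/orders"

# Load the Delta table and expose it to SQL.
orders = spark.read.format("delta").load(orders_path)
orders.createOrReplaceTempView("orders")

# Run the heavy aggregation in Spark SQL against the Delta table.
daily_totals = spark.sql("""
    SELECT order_date, SUM(amount) AS total_amount, COUNT(*) AS order_count
    FROM orders
    GROUP BY order_date
""")

# Pull a small slice into pandas for detailed validation.
pdf = daily_totals.limit(1000).toPandas()
assert (pdf["total_amount"].fillna(0) >= 0).all(), "negative daily totals found"
print(pdf.describe())
```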
ROLES AND RESPONSIBILITIES:
You will be responsible for architecting, implementing, and optimizing Dremio-based data lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.
- Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
- Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
- Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
- Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
- Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
- Support self-service analytics by enabling governed data products and semantic layers.
- Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
- Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data (a query sketch follows this list).
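For the consumption side, the hedged sketch below shows one way a Python consumer might query a Dremio semantic-layer view over Arrow Flight. The coordinator host, credentials, and the sales.curated_orders view are placeholder assumptions, and pyarrow is presumed to be available.

```python
# Hedged sketch only: assumes a Dremio coordinator exposing Arrow Flight on its
# default port; host, credentials, and the sales.curated_orders view are placeholders.
import pyarrow.flight as flight

client = flight.FlightClient("grpc+tcp://dremio-coordinator:32010")

# Basic-auth handshake; the returned bearer-token header is attached to later calls.
token = client.authenticate_basic_token("analyst_user", "analyst_password")
options = flight.FlightCallOptions(headers=[token])

# Ask Dremio to plan the query, then stream the Arrow result set back.
descriptor = flight.FlightDescriptor.for_command(
    "SELECT region, SUM(amount) AS total FROM sales.curated_orders GROUP BY region"
)
info = client.get_flight_info(descriptor, options)
reader = client.do_get(info.endpoints[0].ticket, options)
print(reader.read_all().to_pandas())
```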
IDEAL CANDIDATE:
- Bachelor’s or Master’s in Computer Science, Information Systems, or related field.
- 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
- Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
- Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
- Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
- Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
- Excellent problem-solving, documentation, and stakeholder communication skills.
PREFERRED:
- Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
- Exposure to Snowflake, Databricks, or BigQuery environments.
- Experience in high-tech, manufacturing, or enterprise data modernization programs.
- Experience and expertise in Python development and its core data libraries, such as PySpark, pandas, and NumPy.
- Expertise in ADF and Databricks.
- Creating and maintaining data interfaces across a number of different protocols (file, API, etc.).
- Creating and maintaining internal business process solutions to keep our corporate system data in sync and reduce manual processes where appropriate.
- Creating and maintaining monitoring and alerting workflows to improve system transparency.
- Facilitate the development of our Azure cloud infrastructure for data and application systems.
- Design and lead development of our data infrastructure including data warehouses, data marts, and operational data stores.
- Experience in using Azure services such as ADLS Gen 2, Azure Functions, Azure messaging services, Azure SQL Server, Azure Key Vault, Azure Cognitive Services, etc.
- Data pre-processing, data transformation, data analysis, and feature engineering
- Performance optimization of scripts and productionizing of code (SQL, pandas, Python or PySpark, etc.); see the sketch after this list.
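As a hedged illustration of the pre-processing, feature-engineering, and productionizing bullets above: the storage account, containers, paths, and column names are assumptions, and the Spark session is presumed to already hold ADLS Gen2 credentials and Delta Lake support.

```python
# Hedged sketch only: storage account, containers, paths, and columns are placeholders;
# the Spark session is assumed to be configured for ADLS Gen2 access and Delta Lake.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("feature-engineering").getOrCreate()

raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/transactions/"
curated_path = "abfss://curated@examplestorage.dfs.core.windows.net/delta/transaction_features"

raw = spark.read.parquet(raw_path)

features = (
    raw.dropDuplicates(["transaction_id"])                    # basic cleansing
    .withColumn("amount", F.col("amount").cast("double"))     # enforce types
    .withColumn("txn_date", F.to_date("txn_timestamp"))       # derive a partition key
    .withColumn("is_weekend", F.dayofweek("txn_date").isin(1, 7).cast("int"))
    .filter(F.col("amount").isNotNull())
)

# Write as Delta, partitioned to keep downstream queries fast.
features.write.format("delta").mode("overwrite").partitionBy("txn_date").save(curated_path)
```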
REQUIRED SKILLS:
- Bachelor's in Computer Science, Data Science, Computer Engineering, IT, or equivalent
- Fluency in Python (Pandas), PySpark, SQL, or similar
- Azure Data Factory experience (minimum 12 months)
- Able to write efficient code using traditional and OO concepts and modular programming, following the SDLC process.
- Experience in production optimization and end-to-end performance tracing (technical root cause analysis)
- Ability to work independently with demonstrated experience in project or program management
- Azure experience, with the ability to translate data scientists' Python code into efficient, production-ready code for cloud deployment (a refactoring sketch follows this list)
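As a small, self-contained illustration of turning exploratory data-science code into production-ready code (the column names and discount logic are hypothetical, not taken from any actual project), a row-wise apply() is replaced by a vectorized, typed, testable function:

```python
# Hedged sketch only: column names and the discount logic are hypothetical; the point
# is replacing a slow row-wise apply() with a vectorized, typed, testable function.
import numpy as np
import pandas as pd


def add_discount_rate(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a vectorized discount_rate column.

    Typical notebook original (slow, row-wise):
        df["discount_rate"] = df.apply(
            lambda r: r["discount"] / r["price"] if r["price"] else 0.0, axis=1)
    """
    out = df.copy()
    price = out["price"].replace(0, np.nan)          # avoid divide-by-zero
    out["discount_rate"] = (out["discount"] / price).fillna(0.0)
    return out


if __name__ == "__main__":
    sample = pd.DataFrame({"price": [100.0, 0.0, 20.0], "discount": [10.0, 5.0, 2.0]})
    print(add_discount_rate(sample))
```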


