Data governance Job openings in Mumbai
Role & Responsibilities
You will be responsible for architecting, implementing, and optimizing Dremio-based data Lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.
- Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
- Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
- Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
- Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS); a minimal integration sketch follows this list.
- Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
- Support self-service analytics by enabling governed data products and semantic layers.
- Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
- Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.
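As a rough illustration of the object-storage integration described above, the Python sketch below scans a curated Parquet dataset directly from S3 with PyArrow, pushing a column projection and filter down to the scan. The bucket, prefix, and column names are hypothetical placeholders; in practice Dremio would typically be configured as a source over the same storage layer rather than accessed from a script like this.

```python
# Minimal sketch: scan a curated Parquet dataset directly from S3 with PyArrow.
# The bucket, prefix, and column names below are hypothetical placeholders.
import pyarrow.dataset as ds

# The same object-storage layer would typically be registered as a Dremio source.
orders = ds.dataset("s3://example-lakehouse/curated/orders/", format="parquet")

# Push the column projection and filter down to the scan instead of reading everything.
table = orders.to_table(
    columns=["order_id", "region", "amount"],
    filter=ds.field("region") == "EMEA",
)
print(table.num_rows)
```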
Ideal Candidate
- Bachelor’s or Master’s in Computer Science, Information Systems, or related field.
- 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
- Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
- Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
- Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
- Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
- Excellent problem-solving, documentation, and stakeholder communication skills.
Preferred
- Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
- Exposure to Snowflake, Databricks, or BigQuery environments.
- Experience in high-tech, manufacturing, or enterprise data modernization programs.
Review Criteria
- Strong Dremio / Lakehouse Data Architect profile
- 5+ years of experience in Data Architecture / Data Engineering, including at least 3 years of hands-on Dremio experience
- Strong expertise in SQL optimization, data modeling, query performance tuning, and designing analytical schemas for large-scale systems
- Deep experience with cloud object storage (S3 / ADLS / GCS) and file formats such as Parquet, Delta, Iceberg along with distributed query planning concepts
- Hands-on experience integrating data via APIs, JDBC, Delta/Parquet, object storage, and coordinating with data engineering pipelines (Airflow, DBT, Kafka, Spark, etc.)
- Proven experience designing and implementing lakehouse architecture including ingestion, curation, semantic modeling, reflections/caching optimization, and enabling governed analytics
- Strong understanding of data governance, lineage, RBAC-based access control, and enterprise security best practices
- Excellent communication skills with ability to work closely with BI, data science, and engineering teams; strong documentation discipline
- Candidates must come from enterprise data modernization, cloud-native, or analytics-driven companies
Preferred
- Nice-to-have: Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) or data catalogs (Collibra, Alation, Purview); familiarity with Snowflake, Databricks, or BigQuery environments
Job Specific Criteria
- CV Attachment is mandatory
- How many years of experience do you have with Dremio?
- Which is your preferred job location (Mumbai / Bengaluru / Hyderabad / Gurgaon)?
- Are you okay with 3 Days WFO?
- Virtual Interview requires video to be on, are you okay with it?
Review Criteria
- Strong Product Manager Profiles
- 4+ years of product management experience, of which 2+ years in healthcare, pharmaceutical, life sciences, or AdTech domains
- Must have built or scaled products involving Data Science, Machine Learning, or AI
- Must have experience working end-to-end on product lifecycle — strategy, roadmap, development execution, stakeholder alignment, user research, product optimization, and adoption (0 to 1 product experience is preferred)
- Hands-on experience collaborating with engineering, data science, design, supply teams, and demand-side teams on parallel product initiatives
- Strong understanding of demand-side and supply-side mechanisms, programmatic advertising, data intelligence products, or marketplace platforms
- Experience in companies serving Healthcare Professionals (HCPs), Pharma, Life Sciences, or HealthTech advertising is a must
- Product companies (preferably in HealthTech)
- CTC includes 20% variable
- HealthTech exposure is a must (current or past experience)
- It’s an IC role
Preferred
- Experience working on AI-driven features such as predictive models, segmentation, personalization, or automated optimization.
Job Specific Criteria
- CV Attachment is mandatory
- What is your preferred location — Noida or Mumbai?
- If you’re based in Mumbai, are you comfortable traveling to the Noida office for one week each month?
- Are you available for an in-person interview for one of the rounds?
- Which HealthTech company(ies) have you worked for?
Role & Responsibilities
We are seeking a strategic and innovative Product Manager to lead the development and growth of our DataIQ and Marketplace products. This role is pivotal in driving the vision, strategy, and execution of our data intelligence and digital commerce platforms, ensuring they deliver exceptional value to our users and stakeholders.
Key Responsibilities:
Product Strategy & Vision:
- Define and articulate the product vision and roadmap for DataIQ and Marketplace, aligning with company objectives and market needs.
- Conduct market research and competitive analysis to identify opportunities for innovation and differentiation.
- Collaborate with stakeholders to prioritize features and initiatives that drive business impact and user satisfaction.
Product Development & Execution:
- Lead the end-to-end product development lifecycle, from ideation through to launch and iteration.
- Work closely with engineering, design, and data teams to deliver high-quality products on time and within scope.
- Develop clear and concise product documentation, including PRDs, user stories, and acceptance criteria.
User Experience & Enablement:
- Ensure a seamless and intuitive user experience across both DataIQ and Marketplace platforms.
- Collaborate with UX/UI teams to design user-centric interfaces that enhance engagement and usability.
- Provide training and support materials to enable users to maximize the value of our products.
Performance Monitoring & Optimization:
- Define and track key performance indicators (KPIs) to measure product success and inform decision-making (see the illustrative sketch after this list).
- Analyze user feedback and product data to identify areas for improvement and optimization.
- Continuously iterate on product features and functionalities to enhance performance and user satisfaction.
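As a purely illustrative sketch of KPI tracking, the Python snippet below computes weekly active users and events per active user from a hypothetical product-event extract. The file name and column names are placeholders; real KPI reporting would normally run on the product's analytics stack rather than a one-off CSV.

```python
# Illustrative sketch only: simple adoption/engagement KPIs from a hypothetical
# product-event extract. File and column names are placeholders.
import pandas as pd

# One row per user event, with an event timestamp and a user identifier.
events = pd.read_csv("product_events.csv", parse_dates=["event_time"])

weekly = events.set_index("event_time").groupby(pd.Grouper(freq="W"))
active_users = weekly["user_id"].nunique()
kpis = pd.DataFrame({
    "weekly_active_users": active_users,
    "events_per_active_user": weekly.size() / active_users,
})
print(kpis.tail())
```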
Ideal Candidate
Experience & Skills:
- Bachelor's or Master's degree in Computer Science, Engineering, Business, or a related field.
- 4+ years of experience, with a proven track record in data intelligence or digital commerce products.
- Strong understanding of data analytics, cloud technologies, and e-commerce platforms.
- Excellent communication and collaboration skills, with the ability to work effectively across cross-functional teams.
- Analytical mindset with the ability to leverage data to drive product decisions.
Nice-to-Haves:
- Experience with machine learning or AI-driven product features.
- Familiarity with data governance and privacy regulations.
- Knowledge of marketplace dynamics and seller/buyer ecosystems.
Role & Responsibilities
- Key responsibility is to design and develop a data pipeline, including the architecture, prototyping, and development of data extraction, transformation/processing, cleansing/standardizing, and loading into a Data Warehouse at real-time/near-real-time frequency; source data can be in structured, semi-structured, and/or unstructured formats (a minimal single-step sketch follows this list).
- Provide technical expertise to design efficient data ingestion solutions that consolidate data from RDBMS, APIs, messaging queues, weblogs, images, audio, documents, etc. of enterprise applications, SaaS applications, and external third-party sites or APIs, through ETL/ELT, API integrations, Change Data Capture, Robotic Process Automation, custom Python/Java coding, etc.
- Develop complex data transformations using Talend (Big Data edition), Python/Java transformations in Talend, SQL/Python/Java UDXs, AWS S3, etc., to load data into an OLAP Data Warehouse in structured/semi-structured form.
- Develop data models and create transformation logic to populate them for faster data consumption with simple SQL.
- Implementing automated Audit & Quality assurance checks in Data Pipeline
- Document & maintain data lineage to enable data governance
- Coordination with BIU, IT, and other stakeholders to provide best-in-class data pipeline solutions, exposing data via APIs, loading into downstream systems, NoSQL databases, etc.
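The sketch below is a minimal illustration of one such pipeline step in plain Python: an incremental extract from an RDBMS using a watermark column, light cleansing/standardization, and landing the result as Parquet on object storage. The connection string, table, column names, and bucket are hypothetical, and in this role the equivalent logic would more likely be built as a Talend job than hand-written code.

```python
# Minimal sketch of a single incremental extract -> standardize -> load step.
# DSN, table, columns, and bucket below are hypothetical placeholders.
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://etl_user:secret@appdb-host/appdb")  # placeholder DSN
last_watermark = "2024-01-01 00:00:00"  # normally read from pipeline state, not hard-coded

# Extract only rows changed since the last run (a simple change-data-capture stand-in).
df = pd.read_sql(
    text("SELECT * FROM orders WHERE updated_at > :ts"),
    engine,
    params={"ts": last_watermark},
)

# Cleanse / standardize before loading.
df["customer_email"] = df["customer_email"].str.strip().str.lower()
df = df.drop_duplicates(subset=["order_id"])

# Land the delta as Parquet on object storage for the warehouse/lake to consume
# (writing to an s3:// path assumes s3fs is installed alongside pandas/pyarrow).
df.to_parquet("s3://example-bucket/staging/orders/orders_delta.parquet", index=False)
```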
Requirements
- Programming experience using Python / Java, to create functions / UDX
- Extensive technical experience with SQL on RDBMS (Oracle/MySQL/PostgreSQL, etc.), including code optimization techniques
- Strong ETL/ELT skillset using Talend Big Data Edition. Experience with Talend CDC & MDM functionality will be an advantage.
- Experience & expertise in implementing complex data pipelines, including semi-structured & unstructured data processing
- Expertise in designing efficient data ingestion solutions that consolidate data from RDBMS, APIs, messaging queues, weblogs, images, audio, documents, etc. of enterprise applications, SaaS applications, and external third-party sites or APIs, through ETL/ELT, API integrations, Change Data Capture, Robotic Process Automation, custom Python/Java coding, etc.
- Good understanding & working experience in OLAP Data Warehousing solutions (Redshift, Synapse, Snowflake, Teradata, Vertica, etc) and cloud-native Data Lake (S3, ADLS, BigQuery, etc) solutions
- Familiarity with AWS tool stack for Storage & Processing. Able to recommend the right tools/solutions available to address a technical problem
- Good knowledge of database performance tuning, troubleshooting, and query optimization
- Good analytical skills with the ability to synthesize data to design and deliver meaningful information
- Good knowledge of Design, Development & Performance tuning of 3NF/Flat/Hybrid Data Model
- Know-how of any NoSQL DB (DynamoDB, MongoDB, CosmosDB, etc.) will be an advantage.
- Ability to understand business functionality, processes, and flows
- Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently
Functional knowledge
- Data Governance & Quality Assurance
- Distributed computing
- Linux
- Data structures and algorithms
- Unstructured Data Processing
Role & Responsibilities
- Key responsibility is to design and develop a data pipeline for real-time data integration, processing, model execution (if required), and exposing output via MQ / API / NoSQL DB for consumption (see the sketch after this list).
- Provide technical expertise to design efficient data ingestion solutions to store and process unstructured data such as documents, audio, images, weblogs, etc.
- Developing API services to provide data as a service
- Prototyping Solutions for complex data processing problems using AWS cloud-native solutions
- Implementing automated Audit & Quality assurance Checks in Data Pipeline
- Document & maintain data lineage from various sources to enable data governance
- Coordination with BIU, IT, and other stakeholders to provide best-in-class data pipeline solutions, exposing data via APIs, loading into downstream systems, NoSQL databases, etc.
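As a rough sketch of the real-time pattern described above, the Python snippet below polls an Amazon Kinesis shard with boto3 and lands each event in a DynamoDB table for downstream consumers. The stream name, shard id, table name, and payload fields are hypothetical placeholders, and a production pipeline would normally use the Kinesis Client Library or Lambda triggers rather than a hand-rolled polling loop.

```python
# Rough sketch of one polling loop: read events from a Kinesis shard and land them
# in DynamoDB for downstream consumers. Stream, shard, and table names are placeholders.
import json
import time
from decimal import Decimal

import boto3

kinesis = boto3.client("kinesis")
table = boto3.resource("dynamodb").Table("processed_events")  # hypothetical table

shard_iterator = kinesis.get_shard_iterator(
    StreamName="raw-events",              # hypothetical stream
    ShardId="shardId-000000000000",
    ShardIteratorType="LATEST",
)["ShardIterator"]

while shard_iterator:
    batch = kinesis.get_records(ShardIterator=shard_iterator, Limit=100)
    for record in batch["Records"]:
        # DynamoDB rejects Python floats, so parse JSON numbers as Decimal.
        event = json.loads(record["Data"], parse_float=Decimal)
        event["event_id"] = record["SequenceNumber"]  # sequence number doubles as the item key
        table.put_item(Item=event)                    # expose via NoSQL for consumers
    shard_iterator = batch.get("NextShardIterator")
    time.sleep(1)  # simple throttle between polls
```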
Skills
- Programming experience using Python & SQL
- Extensive working experience in Data Engineering projects, using AWS Kinesis, AWS S3, DynamoDB, EMR, Lambda, Athena, etc., for event processing
- Experience & expertise in implementing complex data pipeline
- Strong Familiarity with AWS Toolset for Storage & Processing. Able to recommend the right tools/solutions available to address specific data processing problems
- Hands-on experience in Unstructured (Audio, Image, Documents, Weblogs, etc) Data processing.
- Good analytical skills with the ability to synthesize data to design and deliver meaningful information
- Know-how of any NoSQL DB (DynamoDB, MongoDB, CosmosDB, etc.) will be an advantage.
- Ability to understand business functionality, processes, and flows
- Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently
Functional knowledge
- Real-time Event Processing
- Data Governance & Quality Assurance
- Containerized deployment
- Linux
- Unstructured Data Processing
- AWS Toolsets for Storage & Processing
- Data Security

