8+ Data architecture Jobs in Mumbai | Data architecture Job openings in Mumbai
Apply to 8+ Data architecture Jobs in Mumbai on CutShort.io. Explore the latest Data architecture Job opportunities across top companies like Google, Amazon & Adobe.
Lead Data Engineer
What are we looking for
real solver?
Solver? Absolutely. But not the usual kind. We're searching for the architects of the audacious & the pioneers of the possible. If you're the type to dismantle assumptions, re-engineer ‘best practices,’ and build solutions that make the future possible NOW, then you're speaking our language.
Your Responsibilities
What you will wake up to solve.
- Lead Technical Design & Data Architecture: Architect and lead the end-to-end development of scalable, cloud-native data platforms. You’ll guide the squad on critical architectural decisions—choosing between Batch vs. Streaming or ETL vs. ELT—while remaining 100% hands-on, contributing high-quality, production-grade code.
- Build High-Velocity Data Pipelines: Drive the implementation of robust data transports and ingestion frameworks using Python, SQL, and Spark. You will build integration layers that connect heterogeneous sources (SaaS, RDBMS, NoSQL) into unified, high-availability environments like BigQuery, Snowflake, or Redshift.
- Mentor & Elevate the Squad: Foster a culture of technical excellence by mentoring and inspiring a team of data analysts and engineers. Lead deep-dive code reviews, promote best-practice data modeling (Star/Snowflake schema), and ensure the squad adopts modern engineering standards like CI/CD for data.
- Drive AI-Ready Data Strategy: Be the expert in designing data foundations optimized for AI and Machine Learning. You will champion the use of GCP (Dataflow, Pub/Sub, BigQuery) and AWS (Lambda, Glue, EMR) to create "clean room" environments that fuel advanced analytics and generative AI models.
- Partner with Clients as a Technical DRI: Act as the Directly Responsible Individual for client success. Translate ambiguous business questions into elegant data services, manage project deliverables using Agile methodologies, and ensure that the data provided is accurate, consistent, and mission-critical.
- Troubleshoot & Optimize for Scale: Own the reliability of the reporting layer. You will proactively monitor pipelines, troubleshoot complex transformation bottlenecks, and propose ways to improve platform performance and cost-efficiency.
- Innovate and Build Reusable IP: Spearhead the creation of reusable data frameworks, custom operators, and transformation libraries that accelerate future projects and establish Searce’s unique technical advantage in the market.
Welcome to Searce
The AI-Native tech consultancy that's rewriting the rules.
Searce is an AI-native, engineering-led, modern tech consultancy that empowers clients to futurify their business by delivering intelligent, impactful, real business outcomes. Searce solvers co-innovate with clients as their trusted transformational partners ensuring sustained competitive advantage. Searce clients realize smarter, faster, better business outcomes delivered by AI-native Searce solver squads.
Functional Skills
the solver personas.
- The Data Architect: This persona deconstructs ambiguous business goals into scalable, elegant data blueprints. They don't just move data; they design the foundation—from schema design to partitioning strategies—that allows data scientists and analysts to thrive, foreseeing technical bottlenecks and making pragmatic trade-offs.
- The Player-Coach: As a hands-on leader, this persona leads from the front by writing exemplary, production-grade SQL and Python while simultaneously mentoring and elevating the skills of the squad. Their success is measured by the team's ability to deliver high-quality, maintainable code and their growth as engineers.
- The Pragmatic Innovator: This individual balances a passion for modern data tech (like Generative AI and Real-time Streaming) with a sharp focus on business outcomes. They champion new tools where they add real value but are disciplined enough to choose stable, cost-effective solutions to meet deadlines and deliver robust products.
- The Client-Facing Technologist: This persona acts as the crucial technical bridge between the data squad and the client. They build trust by listening actively, explaining complex data concepts (like data latency or idempotency) in simple terms, and demonstrating how engineering decisions align with the client’s strategic goals.
- The Quality Craftsman: This individual possesses an unwavering commitment to data integrity and treats data engineering as a craft. They are the guardian of the reporting layer, advocating for robust testing, data validation frameworks, and clean, modular code to ensure the long-term reliability of the data platform.
Experience & Relevance
- Engineering Depth: 7-10 years of professional experience in end-to-end data product development. You have a portfolio that proves your ability to build complex, high-velocity pipelines for both Batch and Streaming workloads.
- Cloud-Native Fluency: Deep, hands-on experience designing and deploying scalable data solutions on at least one major cloud platform (AWS, GCP, or Azure). You are comfortable navigating the nuances of EMR, BigQuery, or Synapse at scale.
- AI-Native Workflow: You don’t just build for AI; you build with AI. You must be proficient in using AI coding assistants (e.g., GitHub Copilot) to accelerate your delivery and have a track record of building the data foundations required for Generative AI.
- Architectural Portfolio: Evidence of leading 2-3 large-scale transformations—including platform migrations, data lakehouse builds, or real-time analytics architectures.
- Client-Facing Acumen: You have direct experience in a consultative, client-facing role. You can confidently translate a CEO’s business vision into a Lead Engineer’s technical specification without losing anything in translation.
Join the ‘real solvers’
ready to futurify?
If you are excited by the possibilities of what an AI-native engineering-led, modern tech consultancy can do to futurify businesses, apply here and experience the ‘Art of the possible’. Don’t Just Send a Resume. Send a Statement.
Director - Data engineering
What are we looking for
real solver?
Solver? Absolutely. But not the usual kind. We're searching for the architects of the audacious & the pioneers of the possible. If you're the type to dismantle assumptions, re-engineer ‘best practices,’ and build solutions that make the future possible NOW, then you're speaking our language.
Your Responsibilities
what you will wake up to solve.
1. Delivery & Tactical Rigor
- Methodology Implementation: Implement and manage a unified, 'DataOps-First' methodology for data engineering delivery (ETL/ELT pipelines, Data Modeling, MLOps, Data Governance) within assigned business units. This ensures predictable outcomes and trusted data integrity by reducing architecture variability at the project level.
- Operational Stewardship: Drive initiatives to optimize team utilization and enhance operational efficiency within the practice. You manage the commercial success of your squads, ensuring data delivery models (from migration to modern data stack implementation) are executed profitably, scalably, and cost-effectively.
- Execution & Technical Resolution
- Technical Escalation: Serve as the primary escalation point for delivery issues, personally leading the resolution of complex data integration bottlenecks and pipeline failures to protect client timelines and data reliability standards.
- Quality Enforcement
- Quality Oversight: Execute and monitor technical data quality standards, ensuring engineering teams adhere to strict policies regarding data lineage, automated quality checks (observability), security/privacy compliance (GDPR/CCPA/PII), and active catalog management.
2. Strategic Growth & Practice Scaling
- Talent & Scaling Execution: Execute the strategy for data engineering talent acquisition and development within your business units. Implement objective metrics to assess and grow the 'Data-Native' DNA of your teams, ensuring squads are consistently equipped to handle petabyte-scale environments and high-impact delivery.
- Offerings Alignment: Drive the adoption of standardized regional offerings (e.g., Modern Data Platform, Data Mesh, Lakehouse Implementation). Ensure your teams leverage the profitable frameworks defined by the practice to accelerate time-to-insight and eliminate architectural fragmentation in client environments.
- Innovation & IP Development: Lead the practical integration of Vector Databases and LLM-ready architectures into project delivery. Champion the hands-on development of IP and reusable accelerators (e.g., automated ingestion engines) that improve delivery speed and enhance data availability across your portfolio.
3. Leadership & Unit Management
- Unit Leadership: Directly lead, mentor, and manage the Engineering Managers and Lead Architects within your business unit. Hold your teams accountable for project-level operational consistency, technical talent development, and strict adherence to the practice's data governance standards.
- Stakeholder Communication: Clearly articulate the business unit’s operational performance, technical quality metrics, and delivery progress to the C-suite Stakeholders and regional client leadership, bridging the gap between technical execution and business value.
- Ecosystem Alignment: Maintain strong technical relationships with key partner contacts (Snowflake, Databricks, AWS/GCP). Align team delivery capabilities with current product roadmaps and ensure squad-level participation in training, certifications, and partner-led enablement opportunities.
Welcome to Searce
The ‘process-first’, AI-native modern tech consultancy that's rewriting the rules.
We don’t do traditional.
As an engineering-led consultancy, we are dedicated to relentlessly improving the real business outcomes. Our solvers co-innovate with clients to futurify operations and make processes smarter, faster & better.
Functional Skills
1. Delivery Management & Operational Excellence
- Methodology Execution: Expert capability in implementing and enforcing a unified delivery methodology (DataOps, Agile, Mesh Principles) within specific business units. Proven track record of auditing squad-level adherence to ensure consistency across the project lifecycle.
- Operational Performance: High proficiency in managing day-to-day operational metrics, including squad utilization, resource forecasting, and productivity tracking. Skilled at optimizing team performance to meet profitability and efficiency targets.
- SOW & Risk Mitigation: Proven experience in operationalizing Statement of Work (SOW) requirements and identifying technical delivery risks early. Expert at mitigating scope creep and data-specific bottlenecks (e.g., latency, ingestion gaps) before they impact client outcomes.
- Technical Escalation Leadership: Demonstrated ability to lead "war room" efforts to resolve complex pipeline failures or data integrity issues. Skilled at providing clear, rapid remediation plans and communicating technical status directly to regional stakeholders.
2. Architectural Implementation & Technical Oversight
- Modern Stack Proficiency: Deep, hands-on expertise in implementing Cloud-Native architectures (Lakehouse, Data Mesh, MPP) on Snowflake, Databricks, or hyperscalers. Ability to conduct deep-dive architectural reviews and course-correct design decisions at the squad level to ensure scalability.
- Operationalizing Governance: Proven experience in embedding data quality and observability (completeness, freshness, accuracy) directly into the CI/CD pipeline. Responsible for technical enforcement of regulatory compliance (GDPR/PII) and maintaining the integrity of data catalogs across active projects.
- Applied Domain Expertise: Practical experience leading the delivery of high-growth solutions, specifically Generative AI infrastructure (RAG, Vector DBs), Real-Time Streaming, and large-scale platform migrations with a focus on zero-downtime execution.
- DataOps & Engineering Standards: Expert-level mastery of DataOps, including the setup and management of orchestration frameworks (Airflow, Dagster) and Infrastructure as Code (IaC). You ensure that automation is a baseline requirement, not an afterthought, for all delivery teams.
3. Unit Management & Commercial Execution
- Unit & Team Management: Proven success in leading and mentoring Engineering Managers and Lead Architects. Responsible for the operational metrics, technical output, and career development of the business unit's talent pool.
- Offerings Implementation & Scoping: Expertise in translating service offerings (e.g., Data Maturity Assessments, Lakehouse Builds) into accurate project scopes, technical estimates, and resource plans to ensure delivery is both profitable and competitive.
- Talent Growth & Mentorship: Functional ability to implement growth frameworks for data engineering roles. Focus on hands-on coaching and scaling high-performance technical talent to meet the demands of complex, petabyte-scale environments.
- Partner Enablement: Functional competence in managing regional technical relationships with major partners (Snowflake, Databricks, GCP/AWS). Drives squad-level certifications, joint technical enablement, and alignment with partner product roadmaps.
Tech Superpowers
- Modern Data Architect – Reimagines business with the Modern Data Stack (MDS) to deliver data mesh implementations, insights, & real value to clients.
- End-to-End Ecosystem Thinker – Builds modular, reusable data products across ingestion, transformation (ETL/ELT), governance, and consumption layers.
- Distributed Compute Savant – Crafts resilient, high-throughput architectures that survive petabyte-scale volume and data skew without breaking the bank.
- Governance & Integrity Guardian – Embeds data quality, complete lineage, and privacy-by-design (GDPR/PII) into every table, view, and pipeline.
- AI-Ready Orchestrator – Engineers pipelines that bridge structured data with Unstructured/Vector stores, powering RAG models and Generative AI workflows.
- Product-Minded Strategist – Balances architectural purity with time-to-insight; treats every dataset as a measurable "Data Product" with clear ROI.
- Pragmatic Stack Curator – Chooses the simplest tools that compound reliability; fluent in SQL, Python, Spark, dbt, and Cloud Warehouses.
- Builder @ Heart – Writes, reviews, and optimizes queries daily; proves architectures with cost-performance benchmarks, not slideware. Business-first, data-second, outcome focused technology leader.
Experience & Relevance
- Executive Experience: Minimum 10+ years of progressive experience in data engineering and analytics, with at least 3 years in a Senior Manager or Director -level role managing multiple technical teams and owning significant operational and efficiency metrics for a large data service line.
- Delivery Standardization: Demonstrated success in defining and implementing globally consistent, repeatable delivery methodologies (DataOps/Agile Data Warehousing) across diverse teams.
- Architectural Depth: Must retain deep, current expertise in Modern Data Stack architectures (Lakehouse, MPP, Mesh) and maintain the ability to personally validate high-level architectural and data pipeline design decisions.
- Operational Leadership: Proven expertise in managing and scaling large professional services organizations, demonstrated ability to optimize utilization, resource allocation, and operational expense.
- Domain Expertise: Strong background in Enterprise Data Platforms, Applied AI/ML, Generative AI integration, or large-scale Cloud Data Migration.
- Communication: Exceptional executive-level presentation and negotiation skills, particularly in communicating complex operational, data quality, and governance metrics to C-level stakeholders.
Join the ‘real solvers’
ready to futurify?
If you are excited by the possibilities of what an AI-native engineering-led, modern tech consultancy can do to futurify businesses, apply here and experience the ‘Art of the possible’. Don’t Just Send a Resume. Send a Statement.
Solutions Architect - Data Engineering
Modern tech solutions advisory & 'futurify' consulting as a Searce lead fds (‘forward deployed solver’) architecting scalable data platforms and robust data engineering solutions that power intelligent insights and fuel AI innovation.
If you’re a tech-savvy, consultative seller with the brain of a strategist, the heart of a builder, and the charisma of a storyteller — we’ve got a seat for you at the front of the table.
You're not a sales lead. You're the transformation driver.
What are we looking for
real solver?
Solver? Absolutely. But not the usual kind. We're searching for the architects of the audacious & the pioneers of the possible. If you're the type to dismantle assumptions, re-engineer ‘best practices,’ and build solutions that make the future possible NOW, then you're speaking our language.
- Improver. Solver. Futurist.
- Great sense of humor.
- ‘Possible. It is.’ Mindset.
- Compassionate collaborator. Bold experimenter. Tireless iterator.
- Natural creativity that doesn’t just challenge the norm, but solves to design what’s better.
- Thinks in systems. Solves at scale.
This Isn’t for Everyone. But if you’re the kind who questions why things are done a certain way— and then identifies 3 better ways to do it — we’d love to chat with you.
Your Responsibilities
what you will wake up to solve.
You are not just a Solutions Architect; you are a futurifier of our data universe and the primary enabler of our AI ambitions. With a deep-seated passion for data engineering, you will architect and build the foundational data infrastructure that powers the customers entire data intelligence ecosystem.
As the Directly Responsible Individual (DRI) for our enterprise-grade data platforms, you own the outcome, end-to-end. You are the definitive solver for our customer's most complex data challenges, leveraging a powerful tech stack including Snowflake, Databricks, etc. and core GCP & AWS services (BigQuery, Spanner, Airflow, Kafka). This is a hands-on-keys role where you won't just design solutions—you'll build them, break them, and perfect them.
- Solution Design & Pre-sales Excellence:Collaborate with cross-functional teams, including sales, engineering, and operations, to ensure successful project delivery.
- Design Core Data Engineering: Master data modeling, architecting high-performance data ingestion pipelines and ensuring data quality and governance throughout the data lifecycle.
- Enable Cloud & AI: Design and implement solutions utilizing core GCP data services, building foundational data platforms that efficiently support advanced analytics and AI/ML initiatives.
- Optimize Performance & Cost: Continuously optimize data architectures and implementations for performance, efficiency, and cost-effectiveness within the cloud environment.
- Bridge Business & Tech: Translate complex business requirements into clear technical designs, providing technical leadership and guidance to data engineering teams.
- Stay Ahead of the Curve: Continuously research and evaluate new data technologies, architectural patterns, and industry trends to keep our data platforms at the cutting edge.
Functional Skills:
- Enterprise Data Architecture Design: Expert ability to design holistic, scalable, and resilient data architectures for complex enterprise environments.
- Cloud Data Platform Strategy: Proven capability to strategize, design, and implement cloud-native data platforms.
- Pre-Sales & Technical Storyteller: Crafts compelling, client-ready proposals, architectural decks, and technical demonstrations. Doesn't just present; shapes the strategic technical narrative behind every proposed solution.
- Advanced Data Modelling: Mastery in designing various data models for analytical, operational, and transactional use cases.
- Data Ingestion & Pipeline Orchestration: Strong expertise in designing and optimizing robust data ingestion and transformation pipelines.
- Stakeholder Communication: Exceptional skills in articulating complex technical concepts and architectural decisions to both technical and non-technical stakeholders.
- Performance & Cost Optimization: Adept at optimizing data solutions for performance, efficiency, and cost within a cloud environment.
Tech Superpowers:
- Cloud Data Mastery: You're a wizard at leveraging public cloud data services, with deep expertise in GCP (BigQuery, Spanner, etc.) and expert proficiency in modern data warehouse solutions like Snowflake.
- Data Engineering Core: Highly skilled in designing, implementing, and managing data workflows using tools like Apache Airflow and Apache Kafka. You're also an authority on advanced data modeling and ETL/ELT patterns.
- AI/ML Data Foundation: You instinctively design data pipelines and structures that efficiently feed and empower Machine Learning and Artificial Intelligence applications.
- Programming for Data: You have a strong command over key programming languages (Python, SQL) for scripting, automation, and building data processing applications.
Experience & Relevance:
- Architectural Leadership (8+ Years): You bring extensive experience (7+ years) specifically in a Solutions Architect role, focused on data engineering and platform building.
- Cloud Data Expertise: You have a proven track record of designing and implementing production-grade data solutions leveraging major public cloud platforms, with significant experience in Google Cloud Platform (GCP).
- Data Warehousing & Data Platform: Demonstrated hands-on experience in the end-to-end design, implementation, and optimization of modern data warehouses and comprehensive data platforms.
- Databricks & BigQuery Mastery: You possess significant practical experience with Databricks as a core data warehouse and GCP BigQuery for analytical workloads.
- Data Ingestion & Orchestration: Proven experience designing and implementing complex data ingestion pipelines and workflow orchestration using tools like Airflow and real-time streaming technologies like Kafka.
- AI/ML Data Enablement: Experience in building data foundations specifically geared towards supporting Machine Learning and Artificial Intelligence initiatives.
Join the ‘real solvers’
ready to futurify?
If you are excited by the possibilities of what an AI-native engineering-led, modern tech consultancy can do to futurify businesses, apply here and experience the ‘Art of the possible’.
Don’t Just Send a Resume. Send a Statement.
So, If you are passionate about tech, future & what you read above (we really are!), apply here to experience the ‘Art of Possible’
Description :
Experience : 3+ Years
Job Type : Full-time
Location : Sion, Mumbai (On-site)
Also Apply at : https://wohlig.keka.com/careers/jobdetails/122580
The Opportunity :
We are a global technology consultancy driving large-scale digital transformations for the Fortune 500. As a strategic partner to Google, we help enterprise clients navigate complex data landscapes migrating legacy systems to the cloud, optimizing costs, and turning raw data into executive-level insights.
We are seeking a Data Analyst who acts less like a technician and more like a Data Consultant. You will blend deep technical expertise in SQL and ETL with the soft skills required to tell compelling data stories to non-technical stakeholders.
What You Will Do :
- Strategic Cloud Data Architecture : Lead high-impact data migration projects. You will assess a client's legacy infrastructure and design the logic to move it to the cloud efficiently (focusing on scalability and security).
- Cost and Performance Optimization : Audit and optimize cloud data warehouses (e.g., BigQuery, Snowflake, Redshift). You will use logical reasoning to hunt down inefficiencies, optimize queries, and restructure data models to save clients significant operational costs.
- End-to-End Data Pipelines (ETL/ELT) : Build and maintain robust data pipelines. Whether its streaming or batch processing, you will ensure data flows seamlessly from source to dashboard using modern frameworks.
- Data Storytelling & Visualization : This is critical. You will build dashboards (Looker, Tableau, PowerBI) that don't just show numbers but answer business questions. You must be able to present these findings to C-suite clients with clarity and confidence.
- Advanced Analytics : Apply statistical rigor and logical deduction to solve unstructured business problems (e.g., Why is our user retention dropping?).
What We Are Looking For :
1. Core Competencies :
- Logical Reasoning : You possess a deductive mindset. You can break down ambiguous client problems into solvable technical components without needing hand-holding.
- Advanced SQL Mastery : You are fluent in complex SQL (Window functions, CTEs, stored procedures) and understand how to write query-efficient code for massive datasets.
- Communication Skills : You are an articulate storyteller who can bridge the gap between engineering jargon and business value.
2. Technical Experience :
- Cloud Proficiency : 3+ years of experience working within a major public cloud ecosystem (GCP, AWS, or Azure). Note : While we primarily use Google Cloud (BigQuery, Looker), we value strong architectural fundamentals over specific tool knowledge.
- Data Engineering & ETL : Experience with big data processing tools (e.g., Spark, Apache Beam, Databricks, or cloud-native equivalents).
- Visualization : Proven mastery of at least one enterprise BI tool (Looker, Tableau, PowerBI, Qlik).
Why Join us ?
- Cross-Cloud Exposure : While our current focus is GCP, your foundational skills will be challenged and expanded across various tech stacks.
- Client Impact : You aren't just writing code in a back room; you are the face of our data capability, directly advising clients on how to run their businesses better.
- Growth : We offer a structured career path with sponsorship for cloud certifications (Google Professional Data Engineer, etc.).
Review Criteria:
- Strong Dremio / Lakehouse Data Architect profile
- 5+ years of experience in Data Architecture / Data Engineering, with minimum 3+ years hands-on in Dremio
- Strong expertise in SQL optimization, data modeling, query performance tuning, and designing analytical schemas for large-scale systems
- Deep experience with cloud object storage (S3 / ADLS / GCS) and file formats such as Parquet, Delta, Iceberg along with distributed query planning concepts
- Hands-on experience integrating data via APIs, JDBC, Delta/Parquet, object storage, and coordinating with data engineering pipelines (Airflow, DBT, Kafka, Spark, etc.)
- Proven experience designing and implementing lakehouse architecture including ingestion, curation, semantic modeling, reflections/caching optimization, and enabling governed analytics
- Strong understanding of data governance, lineage, RBAC-based access control, and enterprise security best practices
- Excellent communication skills with ability to work closely with BI, data science, and engineering teams; strong documentation discipline
- Candidates must come from enterprise data modernization, cloud-native, or analytics-driven companies
Preferred:
- Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) or data catalogs (Collibra, Alation, Purview); familiarity with Snowflake, Databricks, or BigQuery environments
Role & Responsibilities:
You will be responsible for architecting, implementing, and optimizing Dremio-based data lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.
- Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
- Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
- Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
- Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
- Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
- Support self-service analytics by enabling governed data products and semantic layers.
- Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
- Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.
Ideal Candidate:
- Bachelor’s or Master’s in Computer Science, Information Systems, or related field.
- 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
- Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
- Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
- Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
- Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
- Excellent problem-solving, documentation, and stakeholder communication skills.
Preferred:
- Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
- Exposure to Snowflake, Databricks, or BigQuery environments.
- Experience in high-tech, manufacturing, or enterprise data modernization programs.
ROLES AND RESPONSIBILITIES:
You will be responsible for architecting, implementing, and optimizing Dremio-based data Lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.
- Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
- Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
- Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
- Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
- Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
- Support self-service analytics by enabling governed data products and semantic layers.
- Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
- Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.
IDEAL CANDIDATE:
- Bachelor’s or Master’s in Computer Science, Information Systems, or related field.
- 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
- Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
- Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
- Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
- Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
- Excellent problem-solving, documentation, and stakeholder communication skills.
PREFERRED:
- Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
- Exposure to Snowflake, Databricks, or BigQuery environments.
- Experience in high-tech, manufacturing, or enterprise data modernization programs.
Review Criteria
- Strong Dremio / Lakehouse Data Architect profile
- 5+ years of experience in Data Architecture / Data Engineering, with minimum 3+ years hands-on in Dremio
- Strong expertise in SQL optimization, data modeling, query performance tuning, and designing analytical schemas for large-scale systems
- Deep experience with cloud object storage (S3 / ADLS / GCS) and file formats such as Parquet, Delta, Iceberg along with distributed query planning concepts
- Hands-on experience integrating data via APIs, JDBC, Delta/Parquet, object storage, and coordinating with data engineering pipelines (Airflow, DBT, Kafka, Spark, etc.)
- Proven experience designing and implementing lakehouse architecture including ingestion, curation, semantic modeling, reflections/caching optimization, and enabling governed analytics
- Strong understanding of data governance, lineage, RBAC-based access control, and enterprise security best practices
- Excellent communication skills with ability to work closely with BI, data science, and engineering teams; strong documentation discipline
- Candidates must come from enterprise data modernization, cloud-native, or analytics-driven companies
Preferred
- Preferred (Nice-to-have) – Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) or data catalogs (Collibra, Alation, Purview); familiarity with Snowflake, Databricks, or BigQuery environments
Job Specific Criteria
- CV Attachment is mandatory
- How many years of experience you have with Dremio?
- Which is your preferred job location (Mumbai / Bengaluru / Hyderabad / Gurgaon)?
- Are you okay with 3 Days WFO?
- Virtual Interview requires video to be on, are you okay with it?
Role & Responsibilities
You will be responsible for architecting, implementing, and optimizing Dremio-based data lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.
- Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
- Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
- Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
- Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
- Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
- Support self-service analytics by enabling governed data products and semantic layers.
- Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
- Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.
Ideal Candidate
- Bachelor’s or master’s in computer science, Information Systems, or related field.
- 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
- Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
- Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
- Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
- Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
- Excellent problem-solving, documentation, and stakeholder communication skills.
Role Overview:
We are seeking a talented and experienced Data Architect with strong data visualization capabilities to join our dynamic team in Mumbai. As a Data Architect, you will be responsible for designing, building, and managing our data infrastructure, ensuring its reliability, scalability, and performance. You will also play a crucial role in transforming complex data into insightful visualizations that drive business decisions. This role requires a deep understanding of data modeling, database technologies (particularly Oracle Cloud), data warehousing principles, and proficiency in data manipulation and visualization tools, including Python and SQL.
Responsibilities:
- Design and implement robust and scalable data architectures, including data warehouses, data lakes, and operational data stores, primarily leveraging Oracle Cloud services.
- Develop and maintain data models (conceptual, logical, and physical) that align with business requirements and ensure data integrity and consistency.
- Define data governance policies and procedures to ensure data quality, security, and compliance.
- Collaborate with data engineers to build and optimize ETL/ELT pipelines for efficient data ingestion, transformation, and loading.
- Develop and execute data migration strategies to Oracle Cloud.
- Utilize strong SQL skills to query, manipulate, and analyze large datasets from various sources.
- Leverage Python and relevant libraries (e.g., Pandas, NumPy) for data cleaning, transformation, and analysis.
- Design and develop interactive and insightful data visualizations using tools like [Specify Visualization Tools - e.g., Tableau, Power BI, Matplotlib, Seaborn, Plotly] to communicate data-driven insights to both technical and non-technical stakeholders.
- Work closely with business analysts and stakeholders to understand their data needs and translate them into effective data models and visualizations.
- Ensure the performance and reliability of data visualization dashboards and reports.
- Stay up-to-date with the latest trends and technologies in data architecture, cloud computing (especially Oracle Cloud), and data visualization.
- Troubleshoot data-related issues and provide timely resolutions.
- Document data architectures, data flows, and data visualization solutions.
- Participate in the evaluation and selection of new data technologies and tools.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Science, Information Systems, or a related field.
- Proven experience (typically 5+ years) as a Data Architect, Data Modeler, or similar role.
- Deep understanding of data warehousing concepts, dimensional modeling (e.g., star schema, snowflake schema), and ETL/ELT processes.
- Extensive experience working with relational databases, particularly Oracle, and proficiency in SQL.
- Hands-on experience with Oracle Cloud data services (e.g., Autonomous Data Warehouse, Object Storage, Data Integration).
- Strong programming skills in Python and experience with data manipulation and analysis libraries (e.g., Pandas, NumPy).
- Demonstrated ability to create compelling and effective data visualizations using industry-standard tools (e.g., Tableau, Power BI, Matplotlib, Seaborn, Plotly).
- Excellent analytical and problem-solving skills with the ability to interpret complex data and translate it into actionable insights.
- Strong communication and presentation skills, with the ability to effectively communicate technical concepts to non-technical audiences.
- Experience with data governance and data quality principles.
- Familiarity with agile development methodologies.
- Ability to work independently and collaboratively within a team environment.
Application Link- https://forms.gle/km7n2WipJhC2Lj2r5




