Glue semantics Jobs in Bangalore (Bengaluru)

Data Architect (Dremio Lakehouse)

AI-First Company

Agency job

via Peak Hire Solutions by Dharati Thakkar

Bengaluru (Bangalore), Mumbai, Hyderabad, Gurugram

5 - 17 yrs

₹30L - ₹45L / yr

Data engineering

Data architecture

SQL

Data modeling

GCS

ROLES AND RESPONSIBILITIES:

You will be responsible for architecting, implementing, and optimizing Dremio-based data lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.

Design and implement Dremio lakehouse architecture in the cloud (AWS, Azure, Snowflake, or Databricks ecosystems).
Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
Support self-service analytics by enabling governed data products and semantic layers.
Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.

IDEAL CANDIDATE:

Bachelor’s or Master’s in Computer Science, Information Systems, or a related field.
5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
Knowledge of data integration tools and pipelines (Airflow, dbt, Kafka, Spark, etc.).
Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
Excellent problem-solving, documentation, and stakeholder communication skills.

PREFERRED:

Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
Exposure to Snowflake, Databricks, or BigQuery environments.
Experience in high-tech, manufacturing, or enterprise data modernization programs.

AWS Data Engineer

at Deqode

1 recruiter

Posted by Alisha Das

Pune, Mumbai, Bengaluru (Bangalore), Chennai

4 - 7 yrs

₹5L - ₹15L / yr

Amazon Web Services (AWS)

Python

PySpark

Glue semantics

Amazon Redshift

Job Overview:

We are seeking an experienced AWS Data Engineer to join our growing data team. The ideal candidate will have hands-on experience with AWS Glue, Redshift, PySpark, and other AWS services to build robust, scalable data pipelines. This role is perfect for someone passionate about data engineering, automation, and cloud-native development.

Key Responsibilities:

Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
Integrate data from diverse sources and ensure its quality, consistency, and reliability.
Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
Automate data validation, transformation, and loading processes to support real-time and batch data processing.
Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.

Required Skills:

5 to 7 years of hands-on experience in data engineering roles.
Strong proficiency in Python and PySpark for data transformation and scripting.
Deep understanding of and practical experience with AWS Glue, Amazon Redshift, S3, and other AWS data services.
Solid understanding of SQL and database optimization techniques.
Experience working with large-scale data pipelines and high-volume data environments.
Good knowledge of data modeling, warehousing, and performance tuning.

Preferred/Good to Have:

Experience with workflow orchestration tools like Airflow or Step Functions.
Familiarity with CI/CD for data pipelines.
Knowledge of data governance and security best practices on AWS.
