2+ Glue semantics Jobs in Bangalore (Bengaluru) | Glue semantics Job openings in Bangalore (Bengaluru)
Apply to 2+ Glue semantics Jobs in Bangalore (Bengaluru) on CutShort.io. Explore the latest Glue semantics Job opportunities across top companies like Google, Amazon & Adobe.
ROLES AND RESPONSIBILITIES:
You will be responsible for architecting, implementing, and optimizing Dremio-based data Lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.
- Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
- Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
- Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
- Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
- Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
- Support self-service analytics by enabling governed data products and semantic layers.
- Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
- Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.
IDEAL CANDIDATE:
- Bachelor’s or Master’s in Computer Science, Information Systems, or related field.
- 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
- Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
- Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
- Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
- Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
- Excellent problem-solving, documentation, and stakeholder communication skills.
PREFERRED:
- Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
- Exposure to Snowflake, Databricks, or BigQuery environments.
- Experience in high-tech, manufacturing, or enterprise data modernization programs.
Job Overview:
We are seeking an experienced AWS Data Engineer to join our growing data team. The ideal candidate will have hands-on experience with AWS Glue, Redshift, PySpark, and other AWS services to build robust, scalable data pipelines. This role is perfect for someone passionate about data engineering, automation, and cloud-native development.
Key Responsibilities:
- Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
- Integrate data from diverse sources and ensure its quality, consistency, and reliability.
- Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
- Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
- Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
- Automate data validation, transformation, and loading processes to support real-time and batch data processing.
- Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.
Required Skills:
- 5 to 7 years of hands-on experience in data engineering roles.
- Strong proficiency in Python and PySpark for data transformation and scripting.
- Deep understanding and practical experience with AWS Glue, AWS Redshift, S3, and other AWS data services.
- Solid understanding of SQL and database optimization techniques.
- Experience working with large-scale data pipelines and high-volume data environments.
- Good knowledge of data modeling, warehousing, and performance tuning.
Preferred/Good to Have:
- Experience with workflow orchestration tools like Airflow or Step Functions.
- Familiarity with CI/CD for data pipelines.
- Knowledge of data governance and security best practices on AWS.


