9+ Apache Spark Jobs in Chennai | Apache Spark Job openings in Chennai
Key Responsibilities
Architect and implement enterprise-grade Lakehouse solutions using Databricks
Design and deliver scalable batch and real-time data pipelines using Apache Spark (PySpark/SQL)
Build ETL/ELT pipelines, incremental data loads, and metadata-driven ingestion frameworks
Implement and optimize Databricks components: Delta Lake, Delta Live Tables, Autoloader, Structured Streaming, and Workflows
Design large-scale data warehousing solutions with 3NF and dimensional modeling
Establish data governance, security, and data quality frameworks, including Unity Catalog
Lead ML lifecycle management using MLflow and drive AI use cases (RAG, AI/BI)
Manage cloud-native deployments on Microsoft Azure and integrate with enterprise systems (e.g., ServiceNow)
Drive CI/CD, DevOps practices, and performance optimization of Spark workloads
Provide technical leadership, mentor teams, and ensure successful delivery
Collaborate with stakeholders to translate business requirements into scalable solutions
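As a quick illustration of the "metadata-driven ingestion frameworks" and "incremental data loads" named in the responsibilities above, here is a minimal sketch in plain Python. All table names and config fields are hypothetical; a production framework would translate such metadata into Spark/Auto Loader jobs rather than strings.

```python
# Minimal metadata-driven ingestion sketch: each source table is described
# by a declarative config record, and one generic loader interprets it.
# Table names, fields, and the watermark convention are illustrative only.

INGESTION_CONFIG = [
    {"source": "sales_orders", "format": "csv", "load_type": "incremental",
     "watermark_column": "updated_at"},
    {"source": "customers", "format": "json", "load_type": "full"},
]

def build_load_plan(config):
    """Turn declarative metadata into concrete load steps."""
    plan = []
    for entry in config:
        step = f"load {entry['source']} ({entry['format']}, {entry['load_type']})"
        if entry["load_type"] == "incremental":
            # Incremental loads only pick up rows newer than the last watermark.
            step += f" where {entry['watermark_column']} > last_watermark"
        plan.append(step)
    return plan

for step in build_load_plan(INGESTION_CONFIG):
    print(step)
```

The point of the pattern is that onboarding a new source becomes a config change, not new pipeline code.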
Required Skills & Experience
10+ years in Data Engineering / Analytics / AI with strong delivery ownership
Deep expertise in Databricks ecosystem (Notebooks, Delta Lake, Workflows, AI/BI, Apps, Genie)
Strong hands-on experience with:
a. Apache Spark (performance tuning & scalability)
b. Python and SQL
Proven experience in:
a. Solution architecture and large-scale data platforms
b. Data warehousing and advanced data modeling
c. Batch and real-time processing systems
Experience with:
a. Azure Databricks and Azure data services
b. MLflow and MLOps practices
c. ServiceNow or enterprise integrations
Exposure to AI technologies (RAG, LLM-based solutions)
Strong stakeholder management and leadership skills
Certifications (Preferred)
Databricks certifications aligned to data engineering and AI tracks, such as:
a. Databricks Certified Data Engineer Associate (validates foundational ETL, Spark, and Lakehouse capabilities)
b. Databricks Certified Data Engineer Professional (advanced expertise in pipeline design, optimization, and governance)
Certifications in Databricks Machine Learning or Generative AI tracks (e.g., ML Associate / Professional) for AI-driven use cases
Relevant cloud certifications in Microsoft Azure or Amazon Web Services for platform deployment and architecture
Job Title: Senior Specialist – Software Engineering (P4)
Experience: 8–12 Years
Location: Pune / Chennai / Kolkata / Bhubaneswar
Compensation: Up to 36 LPA
Job Description:
We are looking for an experienced Senior Specialist to lead the design and development of scalable enterprise applications. The ideal candidate should have deep expertise in backend technologies, system design, and distributed systems.
Key Responsibilities:
- Design and develop scalable systems using Java & Spring Boot
- Architect and implement microservices-based solutions
- Build high-performance REST APIs
- Work with Kafka for event-driven architecture
- Handle large-scale data processing using Apache Spark
- Optimize database interactions using JPA/Hibernate
- Provide technical leadership and mentor junior engineers
- Collaborate across teams (frontend, DevOps, product)
- Drive system performance, security, and scalability
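The event-driven architecture bullet above can be sketched with a toy in-memory publish/subscribe example (shown in Python for brevity; topic and event names are made up, and a real system would use a Kafka client such as Spring Kafka rather than an in-process bus):

```python
# Minimal in-memory publish/subscribe sketch of the event-driven pattern
# the Kafka bullet refers to. Producers publish to a topic without knowing
# who consumes; handlers subscribe independently. Names are illustrative.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Deliver the event to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
received = []
bus.subscribe("orders.created", received.append)
bus.publish("orders.created", {"order_id": 42})
print(received)  # the subscriber saw the event
```

Kafka adds durability, partitioning, and consumer groups on top of this basic decoupling, but the producer/consumer contract is the same.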
Required Skills:
- Strong expertise in Java, Spring Boot
- Deep understanding of Microservices Architecture
- Experience with REST APIs & system design
- Hands-on with Kafka & distributed systems
- Experience with Apache Spark (big data)
- Working knowledge of Angular
- Exposure to AI tools / Prompt Engineering / Windsurf
Education:
- BE / B.Tech
Notice Period:
- Immediate to May joiners preferred
Technical Architect (Databricks)
- 10+ Years Data Engineering Experience with expertise in Databricks
- 3+ years of consulting experience
- Completed Data Engineering Professional certification & required classes
- Minimum 2-3 projects delivered with hands-on experience in Databricks
- Completed the Apache Spark Programming with Databricks, Data Engineering with Databricks, and Optimizing Apache Spark™ on Databricks courses
- Experience in Spark and/or Hadoop, Flink, Presto, or other popular big data engines
- Familiarity with Databricks multi-hop pipeline architecture
Sr. Data Engineer (Databricks)
- 5+ Years Data Engineering Experience with expertise in Databricks
- Completed Data Engineering Associate certification & required classes
- Minimum 1 project delivered with hands-on experience in development on Databricks
- Completed the Apache Spark Programming with Databricks, Data Engineering with Databricks, and Optimizing Apache Spark™ on Databricks courses
- SQL delivery experience, and familiarity with BigQuery, Synapse, or Redshift
- Proficient in Python, with knowledge of additional Databricks programming languages such as Scala
Job Summary:
Seeking an experienced Senior Data Engineer to lead data ingestion, transformation, and optimization initiatives using the modern Apache and Azure data stack. The role involves working on scalable pipelines, large-scale distributed systems, and data lake management.
Core Responsibilities:
· Build and manage high-volume data pipelines using Spark/Databricks.
· Implement ELT frameworks using Azure Data Factory/Synapse Pipelines.
· Optimize large-scale datasets in Delta/Iceberg formats.
· Implement robust data quality, monitoring, and governance layers.
· Collaborate with Data Scientists, Analysts, and Business stakeholders.
Technical Stack:
· Big Data: Apache Spark, Kafka, Hive, Airflow, Hudi/Iceberg
· Cloud: Azure (Synapse, ADF, ADLS Gen2), Databricks, AWS (Glue/S3)
· Languages: Python, Scala, SQL
· Storage Formats: Delta Lake, Iceberg, Parquet, ORC
· CI/CD: Azure DevOps, Terraform (infra as code), Git
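The "robust data quality, monitoring, and governance layers" responsibility above can be illustrated with a hedged pure-Python sketch. Column names and rules are hypothetical; in practice such checks would run on Spark DataFrames or be expressed as Delta Live Tables expectations.

```python
# Toy data-quality gate: validate rows against declarative rules before load,
# quarantining failures instead of letting them poison downstream tables.
# Rule names and sample records are illustrative only.

RULES = {
    "order_id": lambda v: v is not None,                            # completeness
    "amount":   lambda v: isinstance(v, (int, float)) and v >= 0,   # validity
}

def quality_check(rows, rules):
    """Split rows into (passed, failed) according to the rules."""
    passed, failed = [], []
    for row in rows:
        ok = all(rule(row.get(col)) for col, rule in rules.items())
        (passed if ok else failed).append(row)
    return passed, failed

rows = [
    {"order_id": 1, "amount": 99.5},
    {"order_id": None, "amount": 10.0},   # fails completeness
    {"order_id": 3, "amount": -5},        # fails validity
]
good, bad = quality_check(rows, RULES)
print(len(good), len(bad))  # 1 good row, 2 quarantined
```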
Senior Data Engineer (Apache Stack + Databricks/Synapse)
Share CV to: Thirega@ vysystems dot com (WhatsApp: 91Five0033Five2Three)
Data Engineer- Senior
Cubera is a data company revolutionizing big data analytics and AdTech through data-share-value principles, wherein users entrust their data to us. We refine the art of understanding, processing, extracting, and evaluating the data entrusted to us. We are a gateway for brands to increase their lead efficiency as the world moves towards Web3.
What are you going to do?
Design and develop high-performance, scalable solutions that meet the needs of our customers.
Work closely with Product Management, Architects, and cross-functional teams.
Build and deploy large-scale systems in Java/Python.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Create data tools for analytics and data scientist team members that assist them in building and optimizing their algorithms.
Follow best practices that can be adopted in the big data stack.
Use your engineering experience and technical skills to drive features and mentor engineers.
What are we looking for (Competencies):
Bachelor’s degree in computer science, computer engineering, or related technical discipline.
Overall 5 to 8 years of programming experience in Java and Python, including object-oriented design.
Data handling frameworks: Should have working knowledge of one or more data handling frameworks like Hive, Spark, Storm, Flink, Beam, Airflow, NiFi, etc.
Data Infrastructure: Should have experience in building, deploying, and maintaining applications on popular cloud infrastructure like AWS, GCP, etc.
Data Store: Must have expertise in one of the general-purpose NoSQL data stores like Elasticsearch, MongoDB, Redis, Redshift, etc.
Strong sense of ownership, focus on quality, responsiveness, efficiency, and innovation.
Ability to work with distributed teams in a collaborative and productive manner.
Benefits:
Competitive Salary Packages and benefits.
Collaborative, lively and an upbeat work environment with young professionals.
Job Category: Development
Job Type: Full Time
Job Location: Bangalore
Location: Chennai
Education: BE/BTech
Experience: 3+ years of experience as a Data Scientist/Data Engineer
Domain knowledge: Data cleaning, modelling, analytics, statistics, machine learning, AI
Requirements:
- To be part of Digital Manufacturing and Industrie 4.0 projects across client group of companies
- Design and develop AI/ML models to be deployed across factories
- Knowledge of Hadoop, Apache Spark, MapReduce, Scala, Python programming, and SQL and NoSQL databases is required
- Should be strong in statistics, data analysis, data modelling, machine learning techniques and Neural Networks
- Prior experience in developing AI and ML models is required
- Experience with data from the Manufacturing Industry would be a plus
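Of the statistical modelling techniques the requirements list, the simplest is a one-variable least-squares fit; a stdlib-only sketch follows (the data points and the energy-usage framing are made up for illustration, and real work here would use NumPy/scikit-learn):

```python
# Toy one-variable ordinary least squares: fit y = slope * x + intercept
# by minimizing squared error. Sample data is illustrative only.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
            / sum((x - mean_x) ** 2 for x in xs)
    # The fitted line always passes through the point of means.
    return slope, mean_y - slope * mean_x

# Energy-usage-style example: the points lie exactly on y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```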
Roles and Responsibilities:
- Develop AI and ML models for the Manufacturing Industry with a focus on Energy, Asset Performance Optimization and Logistics
- Multitasking and good communication skills are necessary
- Entrepreneurial attitude
Additional Information:
- Travel: Must be willing to travel for short durations within India and abroad
- Job Location: Chennai
- Reporting to: Team Leader, Energy Management System

American Multinational Retail Corp
Should have a passion to learn and adapt to new technologies, an understanding of how to solve and troubleshoot issues and risks, the ability to make informed decisions, and the ability to lead projects.
Your Qualifications
- 2-5 years' experience with functional programming
- Experience with functional programming using Scala with the Spark framework
- Strong understanding of Object-oriented programming, data structures and algorithms
- Good experience in any of the cloud platforms (Azure, AWS, GCP) etc.,
- Experience with distributed (multi-tiered) systems, relational databases, and NoSQL storage solutions
- Desire to learn new technologies and languages
- Participation in software design, development, and code reviews
- High level of proficiency with Computer Science/Software Engineering knowledge and contribution to the technical skills growth of other team members
Your Responsibility
- Design, build and configure applications to meet business process and application requirements
- Proactively identify and communicate potential issues and concerns and recommend/implement alternative solutions as appropriate.
- Troubleshooting and optimization of existing solutions
- Provide advice on technical design to ensure solutions are forward-looking and flexible for potential future requirements and business needs
- Must have experience leading teams and driving customer interactions
- Must have multiple successful deployments and delivered user stories
- Extensive hands-on experience in Apache Spark along with HiveQL
- Sound knowledge of Amazon Web Services or any other cloud environment
- Experienced in data flow orchestration using Apache Airflow
- Experience with JSON, XML, CSV, and Parquet file formats with Snappy compression
- Experience with file movements between HDFS and AWS S3
- Experience in shell scripting and scripting to automate report generation and migration of reports to AWS S3
- Worked on building a data pipeline using Pandas and the Flask framework
- Good familiarity with Anaconda and Jupyter Notebook
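The file-format bullets above can be illustrated with a minimal stdlib-only sketch of one format hop, CSV to JSON (file contents and field names are invented; a production report-migration script would typically use Pandas, and Parquet/Snappy via pyarrow, before pushing to S3):

```python
import csv
import io
import json

# Toy CSV -> JSON conversion step, the kind of format hop a report-migration
# script performs before uploading files to S3. Data is illustrative only.
CSV_DATA = "id,region,revenue\n1,south,120\n2,north,80\n"

def csv_to_json(csv_text):
    """Parse CSV text and serialize the rows as a JSON array of objects."""
    # DictReader yields one dict per row, keyed by the header line;
    # note that all values come back as strings.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)

print(csv_to_json(CSV_DATA))
```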


