50+ Apache Spark Jobs in India
ROLE & RESPONSIBILITIES:
We are hiring a Senior DevSecOps / Security Engineer with 8+ years of experience securing AWS cloud, on-prem infrastructure, DevOps platforms, MLOps environments, CI/CD pipelines, container orchestration, and data/ML platforms. This role is responsible for creating and maintaining a unified security posture across all systems used by DevOps and MLOps teams — including AWS, Kubernetes, EMR, MWAA, Spark, Docker, GitOps, observability tools, and network infrastructure.
KEY RESPONSIBILITIES:
1. Cloud Security (AWS)-
- Secure all AWS resources consumed by DevOps/MLOps/Data Science: EC2, EKS, ECS, EMR, MWAA, S3, RDS, Redshift, Lambda, CloudFront, Glue, Athena, Kinesis, Transit Gateway, VPC Peering.
- Implement IAM least privilege, SCPs, KMS, Secrets Manager, SSO & identity governance.
- Configure AWS-native security: WAF, Shield, GuardDuty, Inspector, Macie, CloudTrail, Config, Security Hub.
- Harden VPC architecture, subnets, routing, SG/NACLs, multi-account environments.
- Ensure encryption of data at rest/in transit across all cloud services (a minimal automation sketch follows this list).
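To make the encryption bullet above concrete, here is a minimal, hedged sketch (not this team's actual tooling) of enforcing SSE-KMS default encryption on an S3 bucket with boto3; the bucket name and KMS key ARN are placeholders:

```python
"""
Minimal sketch, assuming boto3 credentials are already configured:
apply SSE-KMS default encryption to a bucket if it is not already the default.
"""
import boto3

s3 = boto3.client("s3")

def ensure_kms_default_encryption(bucket: str, kms_key_arn: str) -> None:
    """Set SSE-KMS as the bucket default unless it is already configured."""
    cfg = s3.get_bucket_encryption(Bucket=bucket)
    rules = cfg["ServerSideEncryptionConfiguration"]["Rules"]
    uses_kms = any(
        r["ApplyServerSideEncryptionByDefault"].get("SSEAlgorithm") == "aws:kms"
        for r in rules
    )
    if not uses_kms:
        s3.put_bucket_encryption(
            Bucket=bucket,
            ServerSideEncryptionConfiguration={
                "Rules": [{
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "aws:kms",
                        "KMSMasterKeyID": kms_key_arn,
                    },
                    "BucketKeyEnabled": True,  # reduce KMS request volume/cost
                }]
            },
        )

if __name__ == "__main__":
    ensure_kms_default_encryption(
        "example-ml-artifacts",                                     # placeholder bucket
        "arn:aws:kms:ap-south-1:111122223333:key/placeholder-key",  # placeholder key ARN
    )
```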
2. DevOps Security (IaC, CI/CD, Kubernetes, Linux)-
Infrastructure as Code & Automation Security:
- Secure Terraform, CloudFormation, Ansible with policy-as-code (OPA, Checkov, tfsec).
- Enforce misconfiguration scanning and automated remediation.
CI/CD Security:
- Secure Jenkins, GitHub, GitLab pipelines with SAST, DAST, SCA, secrets scanning, image scanning.
- Implement secure build, artifact signing, and deployment workflows.
Containers & Kubernetes:
- Harden Docker images, private registries, runtime policies.
- Enforce EKS security: RBAC, IRSA, PSP/PSS, network policies, runtime monitoring.
- Apply CIS Benchmarks for Kubernetes and Linux.
Monitoring & Reliability:
- Secure observability stack: Grafana, CloudWatch, logging, alerting, anomaly detection.
- Ensure audit logging across cloud/platform layers.
3. MLOps Security (Airflow, EMR, Spark, Data Platforms, ML Pipelines)-
Pipeline & Workflow Security:
- Secure Airflow/MWAA connections, secrets, DAGs, execution environments.
- Harden EMR, Spark jobs, Glue jobs, IAM roles, S3 buckets, encryption, and access policies.
ML Platform Security:
- Secure Jupyter/JupyterHub environments, containerized ML workspaces, and experiment tracking systems.
- Control model access, artifact protection, model registry security, and ML metadata integrity.
Data Security:
- Secure ETL/ML data flows across S3, Redshift, RDS, Glue, Kinesis.
- Enforce data versioning security, lineage tracking, PII protection, and access governance.
ML Observability:
- Implement drift detection (data drift/model drift), feature monitoring, and audit logging (a minimal drift-check sketch follows this section).
- Integrate ML monitoring with Grafana/Prometheus/CloudWatch.
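As one illustration of the drift-detection responsibility above, the following hedged sketch computes a Population Stability Index (PSI) between a training baseline and recent inference data and publishes it as a custom CloudWatch metric; the `MLObservability` namespace, metric name, and the ~0.2 alert threshold are assumptions, not a prescribed setup:

```python
"""Sketch only: PSI-based feature drift check pushed to CloudWatch."""
import numpy as np
import boto3

def psi(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two 1-D samples."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    curr_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Clip to avoid log(0) / division by zero on empty buckets.
    base_pct = np.clip(base_pct, 1e-6, None)
    curr_pct = np.clip(curr_pct, 1e-6, None)
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))

def publish_drift_metric(feature: str, value: float) -> None:
    """Publish PSI so CloudWatch alarms / Grafana panels can alert on it."""
    boto3.client("cloudwatch").put_metric_data(
        Namespace="MLObservability",  # assumed namespace
        MetricData=[{
            "MetricName": "FeaturePSI",
            "Dimensions": [{"Name": "Feature", "Value": feature}],
            "Value": value,
        }],
    )

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    baseline = rng.normal(0, 1, 10_000)    # stand-in for training data
    current = rng.normal(0.3, 1.1, 2_000)  # stand-in for recent traffic
    score = psi(baseline, current)
    publish_drift_metric("feature_x", score)  # alert if score > ~0.2 (common rule of thumb)
```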
4. Network & Endpoint Security-
- Manage firewall policies, VPN, IDS/IPS, endpoint protection, secure LAN/WAN, Zero Trust principles.
- Conduct vulnerability assessments, penetration test coordination, and network segmentation.
- Secure remote workforce connectivity and internal office networks.
5. Threat Detection, Incident Response & Compliance-
- Centralize log management (CloudWatch, OpenSearch/ELK, SIEM).
- Build security alerts, automated threat detection, and incident workflows.
- Lead incident containment, forensics, RCA, and remediation.
- Ensure compliance with ISO 27001, SOC 2, GDPR, HIPAA (as applicable).
- Maintain security policies, procedures, runbooks, and audits.
IDEAL CANDIDATE:
- 8+ years in DevSecOps, Cloud Security, Platform Security, or equivalent.
- Proven ability securing AWS cloud ecosystems (IAM, EKS, EMR, MWAA, VPC, WAF, GuardDuty, KMS, Inspector, Macie).
- Strong hands-on experience with Docker, Kubernetes (EKS), CI/CD tools, and Infrastructure-as-Code.
- Experience securing ML platforms, data pipelines, and MLOps systems (Airflow/MWAA, Spark/EMR).
- Strong Linux security (CIS hardening, auditing, intrusion detection).
- Proficiency in Python, Bash, and automation/scripting.
- Excellent knowledge of SIEM, observability, threat detection, monitoring systems.
- Understanding of microservices, API security, serverless security.
- Strong understanding of vulnerability management, penetration testing practices, and remediation plans.
EDUCATION:
- Master’s degree in Cybersecurity, Computer Science, Information Technology, or related field.
- Relevant certifications (AWS Security Specialty, CISSP, CEH, CKA/CKS) are a plus.
PERKS, BENEFITS AND WORK CULTURE:
- Competitive Salary Package
- Generous Leave Policy
- Flexible Working Hours
- Performance-Based Bonuses
- Health Care Benefits
Review Criteria
- Strong MLOps profile
- 8+ years of DevOps experience and 4+ years in MLOps / ML pipeline automation and production deployments
- 4+ years hands-on experience in Apache Airflow / MWAA managing workflow orchestration in production
- 4+ years hands-on experience in Apache Spark (EMR / Glue / managed or self-hosted) for distributed computation
- Must have strong hands-on experience across key AWS services including EKS/ECS/Fargate, Lambda, Kinesis, Athena/Redshift, S3, and CloudWatch
- Must have hands-on Python for pipeline & automation development
- 4+ years of experience in AWS cloud, including at recent companies
- (Company) - Product companies preferred; Exception for service company candidates with strong MLOps + AWS depth
Preferred
- Hands-on in Docker deployments for ML workflows on EKS / ECS
- Experience with ML observability (data drift / model drift / performance monitoring / alerting) using CloudWatch / Grafana / Prometheus / OpenSearch.
- Experience with CI / CD / CT using GitHub Actions / Jenkins.
- Experience with JupyterHub/Notebooks, Linux, scripting, and metadata tracking for ML lifecycle.
- Understanding of ML frameworks (TensorFlow / PyTorch) for deployment scenarios.
Job Specific Criteria
- CV Attachment is mandatory
- Please provide the CTC breakup (Fixed + Variable).
- Are you open to a face-to-face (F2F) round?
- Has the candidate filled out the Google form?
Role & Responsibilities
We are looking for a Senior MLOps Engineer with 8+ years of experience building and managing production-grade ML platforms and pipelines. The ideal candidate will have strong expertise across AWS, Airflow/MWAA, Apache Spark, Kubernetes (EKS), and automation of ML lifecycle workflows. You will work closely with data science, data engineering, and platform teams to operationalize and scale ML models in production.
Key Responsibilities:
- Design and manage cloud-native ML platforms supporting training, inference, and model lifecycle automation.
- Build ML/ETL pipelines using Apache Airflow / AWS MWAA and distributed data workflows using Apache Spark (EMR/Glue); a minimal DAG sketch follows this list.
- Containerize and deploy ML workloads using Docker, EKS, ECS/Fargate, and Lambda.
- Develop CI/CT/CD pipelines integrating model validation, automated training, testing, and deployment.
- Implement ML observability: model drift, data drift, performance monitoring, and alerting using CloudWatch, Grafana, Prometheus.
- Ensure data governance, versioning, metadata tracking, reproducibility, and secure data pipelines.
- Collaborate with data scientists to productionize notebooks, experiments, and model deployments.
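For the Airflow/MWAA bullet above, a minimal DAG sketch is shown below. The DAG id, schedule, and task bodies are illustrative assumptions rather than a prescribed pipeline, and real secrets would live in the MWAA connections/secrets backend, not in code:

```python
"""Sketch of a daily validate-then-train ML pipeline on Airflow 2.x / MWAA."""
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def validate_features(**context):
    # Placeholder: pull the day's feature snapshot from S3 and run quality checks.
    print("validating features for", context["ds"])

def trigger_training(**context):
    # Placeholder: submit an EMR/Spark step or a training job here.
    print("triggering training run for", context["ds"])

default_args = {"retries": 2, "retry_delay": timedelta(minutes=10)}

with DAG(
    dag_id="ml_training_pipeline",      # assumed name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    validate = PythonOperator(task_id="validate_features", python_callable=validate_features)
    train = PythonOperator(task_id="trigger_training", python_callable=trigger_training)
    validate >> train
```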
Ideal Candidate
- 8+ years in MLOps/DevOps with strong ML pipeline experience.
- Strong hands-on experience with AWS:
- Compute/Orchestration: EKS, ECS, EC2, Lambda
- Data: EMR, Glue, S3, Redshift, RDS, Athena, Kinesis
- Workflow: MWAA/Airflow, Step Functions
- Monitoring: CloudWatch, OpenSearch, Grafana
- Strong Python skills and familiarity with ML frameworks (TensorFlow/PyTorch/Scikit-learn).
- Expertise with Docker, Kubernetes, Git, CI/CD tools (GitHub Actions/Jenkins).
- Strong Linux, scripting, and troubleshooting skills.
- Experience enabling reproducible ML environments using Jupyter Hub and containerized development workflows.
Education:
- Master’s degree in Computer Science, Machine Learning, Data Engineering, or related field.
Position Overview: The Lead Software Architect - Python & Data Engineering is a senior technical leadership role responsible for designing and owning end-to-end architecture for data-intensive, AI/ML, and analytics platforms, while mentoring developers and ensuring technical excellence across the organization.
Key Responsibilities:
- Design end-to-end software architecture for data-intensive applications, AI/ML pipelines, and analytics platforms
- Evaluate trade-offs between competing technical approaches
- Define data models, API approach, and integration patterns across systems
- Create technical specifications and architecture documentation
- Lead by example through production-grade Python code and mentor developers on engineering fundamentals
- Conduct design and code reviews focused on architectural soundness
- Establish engineering standards, coding practices, and design patterns for the team
- Translate business requirements into technical architecture
- Collaborate with data scientists, analysts, and other teams to design integrated solutions
- Whiteboard and defend system design and architectural choices
- Take responsibility for system performance, reliability, and maintainability
- Identify and resolve architectural bottlenecks proactively
Required Skills:
- 8+ years of experience in software architecture and development
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- Strong foundations in data structures, algorithms, and computational complexity
- Experience in system design for scale, including caching strategies, load balancing, and asynchronous processing
- 6+ years of Python development experience
- Deep knowledge of Django, Flask, or FastAPI
- Expert understanding of Python internals including GIL and memory management
- Experience with RESTful API design and event-driven architectures (Kafka, RabbitMQ)
- Proficiency in data processing frameworks such as Pandas, Apache Spark, and Airflow
- Strong SQL optimization and database design experience (PostgreSQL, MySQL, MongoDB)
- Experience with AWS, GCP, or Azure cloud platforms
- Knowledge of containerization (Docker) and orchestration (Kubernetes)
- Hands-on experience designing CI/CD pipelines
Preferred (Bonus) Skills:
- Experience deploying ML models to production (MLOps, model serving, monitoring)
- Understanding of ML system design including feature stores and model versioning
- Familiarity with ML frameworks such as scikit-learn, TensorFlow, and PyTorch
- Open-source contributions or technical blogging demonstrating architectural depth
- Experience with modern front-end frameworks for full-stack perspective
Who We Are
At Sonatype, we help organizations build better, more secure software by enabling them to understand and control their software supply chains. Our products are trusted by thousands of engineering teams globally, providing critical insights into dependency health, license risk, and software security. We’re passionate about empowering developers—and we back it with data.
The Opportunity
We’re looking for a Data Engineer with full stack expertise to join our growing Data Platform team. This role blends data engineering, microservices, and full-stack development to deliver end-to-end services that power analytics, machine learning, and advanced search across Sonatype.
You will design and build data-driven microservices and workflows using Java, Python, and Spring Batch, implement frontends for data workflows, and deploy everything through CI/CD pipelines into AWS ECS/Fargate. You’ll also ensure services are monitorable, debuggable, and reliable at scale, while clearly documenting designs with Mermaid-based sequence and dataflow diagrams.
This is a hands-on engineering role for someone who thrives at the intersection of data systems, fullstack development, ML, and cloud-native platforms.
What You’ll Do
- Design, build, and maintain data pipelines, ETL/ELT workflows, and scalable microservices.
- Develop complex web scraping (Playwright) and real-time pipelines (Kafka/queues/Flink).
- Develop end-to-end microservices with backend (Java 5+ yrs, Python 5+ yrs, Spring Batch 2+ yrs) and frontend (React or any framework).
- Deploy, publish, and operate services in AWS ECS/Fargate using CI/CD pipelines (Jenkins, GitOps).
- Architect and optimize data storage models in SQL (MySQL, PostgreSQL) and NoSQL stores.
- Implement web scraping and external data ingestion pipelines.
- Enable Databricks and PySpark-based workflows for large-scale analytics.
- Build advanced data search capabilities (fuzzy matching, vector similarity search, semantic retrieval); a small matching sketch follows this list.
- Apply ML techniques (scikit-learn, classification algorithms, predictive modeling) to data-driven solutions.
- Implement observability, debugging, monitoring, and alerting for deployed services.
- Create Mermaid sequence diagrams, flowcharts, and dataflow diagrams to document system architecture and workflows.
- Drive best practices in fullstack data service development, including architecture, testing, and documentation.
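As a hedged illustration of the fuzzy-matching / vector-similarity bullet above (not Sonatype's actual implementation), character n-gram TF-IDF vectors with cosine similarity can match noisy component names against a catalog; the catalog entries here are illustrative:

```python
"""Sketch: fuzzy component-name lookup with scikit-learn."""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

catalog = [
    "org.apache.spark:spark-core",
    "org.apache.kafka:kafka-clients",
    "com.fasterxml.jackson.core:jackson-databind",
]

# Character n-grams tolerate typos and reordered tokens better than word tokens.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
catalog_vecs = vectorizer.fit_transform(catalog)

def fuzzy_lookup(query: str, top_k: int = 1):
    """Return the catalog entries most similar to a possibly misspelled query."""
    scores = cosine_similarity(vectorizer.transform([query]), catalog_vecs)[0]
    ranked = scores.argsort()[::-1][:top_k]
    return [(catalog[i], float(scores[i])) for i in ranked]

print(fuzzy_lookup("apache spark core"))  # should surface spark-core first
```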
What We’re Looking For
Minimum Qualifications
- 2+ years of experience as a Data Engineer or in a backend software engineering role
- Strong programming skills in Python, Scala, or Java
- Hands-on experience with HBase or similar wide-column NoSQL stores
- Hands-on experience with distributed data systems like Spark, Kafka, or Flink
- Proficient in writing complex SQL and optimizing queries for performance
- Experience building and maintaining robust ETL/ELT pipelines in production
- Familiarity with workflow orchestration tools (Airflow, Dagster, or similar)
- Understanding of data modeling techniques (star schema, dimensional modeling, etc.)
- Familiarity with CI/CD pipelines (Jenkins or similar)
- Ability to visualize and communicate architectures using Mermaid diagrams
Bonus Points
- Experience working with Databricks, dbt, Terraform, or Kubernetes
- Familiarity with streaming data pipelines or real-time processing
- Exposure to data governance frameworks and tools
- Experience supporting data products or ML pipelines in production
- Strong understanding of data privacy, security, and compliance best practices
Why You’ll Love Working Here
- Data with purpose: Work on problems that directly impact how the world builds secure software
- Modern tooling: Leverage the best of open-source and cloud-native technologies
- Collaborative culture: Join a passionate team that values learning, autonomy, and impact
About Corridor Platforms
Corridor Platforms is a leader in next-generation risk decisioning and responsible AI governance, empowering banks and lenders to build transparent, compliant, and data-driven solutions. Our platforms combine advanced analytics, real-time data integration, and GenAI to support complex financial decision workflows for regulated industries.
Role Overview
As a Backend Engineer at Corridor Platforms, you will:
- Architect, develop, and maintain backend components for our Risk Decisioning Platform.
- Build and orchestrate scalable backend services that automate, optimize, and monitor high-value credit and risk decisions in real time.
- Integrate with ORM layers such as SQLAlchemy and multiple RDBMS back ends (Postgres, MySQL, Oracle, MSSQL, etc.) to ensure data integrity, scalability, and compliance (a minimal ORM sketch follows this list).
- Collaborate closely with the Product team, Data Scientists, and QA teams to create extensible APIs, workflow automation, and AI governance features.
- Architect workflows for privacy, auditability, versioned traceability, and role-based access control, ensuring adherence to regulatory frameworks.
- Take ownership from requirements to deployment, seeing your code deliver real impact in the lives of customers and end users.
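A minimal sketch of the SQLAlchemy-backed, multi-RDBMS pattern mentioned above (the table, columns, and connection URL are illustrative assumptions, not Corridor's schema); swapping the URL (postgresql+psycopg2://, mysql+pymysql://, oracle+oracledb://, mssql+pyodbc://) is what keeps the service database-agnostic:

```python
"""Sketch: SQLAlchemy 2.x declarative model used across interchangeable RDBMS back ends."""
from sqlalchemy import String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

class Base(DeclarativeBase):
    pass

class Decision(Base):
    __tablename__ = "risk_decisions"  # assumed table name
    id: Mapped[int] = mapped_column(primary_key=True)
    applicant_id: Mapped[str] = mapped_column(String(64), index=True)
    score: Mapped[float]
    outcome: Mapped[str] = mapped_column(String(16))

# SQLite in memory keeps the sketch self-contained; production would point at a real RDBMS URL.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Decision(applicant_id="A-123", score=0.742, outcome="approve"))
    session.commit()
    approved = session.scalars(select(Decision).where(Decision.outcome == "approve")).all()
    print([d.applicant_id for d in approved])
```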
Technical Skills
- Languages: Python 3.9+, SQL, JavaScript/TypeScript, Angular
- Frameworks: Flask, SQLAlchemy, Celery, Marshmallow, Apache Spark
- Databases: PostgreSQL, Oracle, SQL Server, Redis
- Tools: pytest, Docker, Git, Nx
- Cloud: Experience with AWS, Azure, or GCP preferred
- Monitoring: Familiarity with OpenTelemetry and logging frameworks
Why Join Us?
- Cutting-Edge Tech: Work hands-on with the latest AI, cloud-native workflows, and big data tools—all within a single compliant platform.
- End-to-End Impact: Contribute to mission-critical backend systems, from core data models to live production decision services.
- Innovation at Scale: Engineer solutions that process vast data volumes, helping financial institutions innovate safely and effectively.
- Mission-Driven: Join a passionate team advancing fair, transparent, and compliant risk decisioning at the forefront of fintech and AI governance.
What We’re Looking For
- Proficiency in Python, SQLAlchemy (or similar ORM), and SQL databases.
- Experience developing and maintaining scalable backend services, including API, data orchestration, ML workflows, and workflow automation.
- Solid understanding of data modeling, distributed systems, and backend architecture for regulated environments.
- Curiosity and drive to work at the intersection of AI/ML, fintech, and regulatory technology.
- Experience mentoring and guiding junior developers.
Ready to build backends that shape the future of decision intelligence and responsible AI?
Apply now and become part of the innovation at Corridor Platforms!
What You’ll Be Doing:
● Own the architecture and roadmap for scalable, secure, and high-quality data pipelines and platforms.
● Lead and mentor a team of data engineers while establishing engineering best practices, coding standards, and governance models.
● Design and implement high-performance ETL/ELT pipelines using modern Big Data technologies for diverse internal and external data sources.
● Drive modernization initiatives including re-architecting legacy systems to support next-generation data products, ML workloads, and analytics use cases.
● Partner with Product, Engineering, and Business teams to translate requirements into robust technical solutions that align with organizational priorities.
● Champion data quality, monitoring, metadata management, and observability across the ecosystem.
● Lead initiatives to improve cost efficiency, data delivery SLAs, automation, and infrastructure scalability.
● Provide technical leadership on data modeling, orchestration, CI/CD for data workflows, and cloud-based architecture improvements.
Qualifications:
● Bachelor's degree in Engineering, Computer Science, or relevant field.
● 8+ years of relevant and recent experience in a Data Engineer role.
● 5+ years recent experience with Apache Spark and solid understanding of the fundamentals.
● Deep understanding of Big Data concepts and distributed systems.
● Demonstrated ability to design, review, and optimize scalable data architectures across ingestion.
● Strong coding skills with Scala, Python and the ability to quickly switch between them with ease.
● Advanced working SQL knowledge and experience working with a variety of relational databases such as Postgres and/or MySQL.
● Cloud experience with Databricks.
● Strong understanding of Delta Lake architecture and working with Parquet, JSON, CSV, and similar formats.
● Experience establishing and enforcing data engineering best practices, including CI/CD for data, orchestration and automation, and metadata management.
● Comfortable working in an Agile environment
● Machine Learning knowledge is a plus.
● Demonstrated ability to operate independently, take ownership of deliverables, and lead technical decisions.
● Excellent written and verbal communication skills in English.
● Experience supporting and working with cross-functional teams in a dynamic environment.
REPORTING: This position will report to Sr. Technical Manager or Director of Engineering as assigned by Management.
EMPLOYMENT TYPE: Full-Time, Permanent
SHIFT TIMINGS: 10:00 AM - 07:00 PM IST
We are looking for a highly skilled Sr. Big Data Engineer with 3-5 years of experience in building large-scale data pipelines, real-time streaming solutions, and batch/stream processing systems. The ideal candidate should be proficient in Spark, Kafka, Python, and AWS Big Data services, with hands-on experience in implementing CDC (Change Data Capture) pipelines and integrating multiple data sources and sinks.
Responsibilities
- Design, develop, and optimize batch and streaming data pipelines using Apache Spark and Python.
- Build and maintain real-time data ingestion pipelines leveraging Kafka and AWS Kinesis.
- Implement CDC (Change Data Capture) pipelines using Kafka Connect, Debezium, or similar frameworks (a minimal streaming-consumer sketch follows this list).
- Integrate data from multiple sources and sinks (databases, APIs, message queues, file systems, cloud storage).
- Work with AWS Big Data ecosystem: Glue, EMR, Kinesis, Athena, S3, Lambda, Step Functions.
- Ensure pipeline scalability, reliability, and performance tuning of Spark jobs and EMR clusters.
- Develop data transformation and ETL workflows in AWS Glue and manage schema evolution.
- Collaborate with data scientists, analysts, and product teams to deliver reliable and high-quality data solutions.
- Implement monitoring, logging, and alerting for critical data pipelines.
- Follow best practices for data security, compliance, and cost optimization in cloud environments.
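As a hedged sketch of the streaming/CDC responsibilities above, a Spark Structured Streaming job can consume a Debezium-style topic and land the "after" images on S3; the broker address, topic, record schema, and S3 paths are assumptions, and the spark-sql-kafka connector package must be on the classpath:

```python
"""Sketch: consume a Debezium-style CDC topic with Spark Structured Streaming."""
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("cdc-stream-sketch").getOrCreate()

# Debezium wraps each change in an envelope: {"before": ..., "after": ..., "op": "c|u|d", ...}
row_schema = StructType([
    StructField("id", LongType()),
    StructField("status", StringType()),
])
envelope = StructType([
    StructField("before", row_schema),
    StructField("after", row_schema),
    StructField("op", StringType()),
])

changes = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker
    .option("subscribe", "dbserver1.public.orders")      # assumed CDC topic
    .option("startingOffsets", "latest")
    .load()
    .select(F.from_json(F.col("value").cast("string"), envelope).alias("e"))
    .select("e.op", "e.after.*")                          # keep op code + new row image
)

query = (
    changes.writeStream.format("parquet")
    .option("path", "s3://example-bucket/cdc/orders/")          # assumed sink
    .option("checkpointLocation", "s3://example-bucket/chk/orders/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```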
Required Skills & Experience
- Programming: Strong proficiency in Python (PySpark, data frameworks, automation).
- Big Data Processing: Hands-on experience with Apache Spark (batch & streaming).
- Messaging & Streaming: Proficient in Kafka (brokers, topics, partitions, consumer groups) and AWS Kinesis.
- CDC Pipelines: Experience with Debezium / Kafka Connect / custom CDC frameworks.
- AWS Services: AWS Glue, EMR, S3, Athena, Lambda, IAM, CloudWatch.
- ETL/ELT Workflows: Strong knowledge of data ingestion, transformation, partitioning, schema management.
- Databases: Experience with relational databases (MySQL, Postgres, Oracle) and NoSQL (MongoDB, DynamoDB, Cassandra).
- Data Formats: JSON, Parquet, Avro, ORC, Delta/Iceberg/Hudi.
- Version Control & CI/CD: Git, GitHub/GitLab, Jenkins, or CodePipeline.
- Monitoring/Logging: CloudWatch, Prometheus, ELK/Opensearch.
- Containers & Orchestration (nice-to-have): Docker, Kubernetes, Airflow/Step Functions for workflow orchestration.
Preferred Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or related field.
- Experience in large-scale data lake / lake house architectures.
- Knowledge of data warehousing concepts and query optimisation.
- Familiarity with data governance, lineage, and cataloging tools (Glue Data Catalog, Apache Atlas).
- Exposure to ML/AI data pipelines is a plus.
Tools & Technologies (must-have exposure)
- Big Data & Processing: Apache Spark, PySpark, AWS EMR, AWS Glue
- Streaming & Messaging: Apache Kafka, Kafka Connect, Debezium, AWS Kinesis
- Cloud & Storage: AWS (S3, Athena, Lambda, IAM, CloudWatch)
- Programming & Scripting: Python, SQL, Bash
- Orchestration: Airflow / Step Functions
- Version Control & CI/CD: Git, Jenkins/CodePipeline
- Data Formats: Parquet, Avro, ORC, JSON, Delta, Iceberg, Hudi
MANDATORY:
- Strong Data Architect / Data Engineering Manager / Director profile
- Must have 12+ YOE in Data Engineering roles, with at least 2+ years in a Leadership role
- Must have 7+ YOE in hands-on Tech development with Java (Highly preferred) or Python, Node.JS, GoLang
- Must have strong experience with large-scale data technologies and tools such as HDFS, YARN, MapReduce, Hive, Kafka, Spark, Airflow, Presto, etc.
- Strong expertise in HLD and LLD, to design scalable, maintainable data architectures.
- Must have managed a team of at least 5+ Data Engineers (the leadership role should be evident in the CV)
- Product companies (high-scale, data-heavy companies preferred)
PREFERRED:
- Tier-1 college background preferred, ideally IIT
- Candidates should have spent a minimum of 3 years at each company
- Recent 4+ years of experience with high-growth product startups, having implemented data engineering systems from an early stage in the company
ROLES & RESPONSIBILITIES:
- Lead and mentor a team of data engineers, ensuring high performance and career growth.
- Architect and optimize scalable data infrastructure, ensuring high availability and reliability.
- Drive the development and implementation of data governance frameworks and best practices.
- Work closely with cross-functional teams to define and execute a data roadmap.
- Optimize data processing workflows for performance and cost efficiency.
- Ensure data security, compliance, and quality across all data platforms.
- Foster a culture of innovation and technical excellence within the data team.
IDEAL CANDIDATE:
- 10+ years of experience in software/data engineering, with at least 3+ years in a leadership role.
- Expertise in backend development with programming languages such as Java, PHP, Python, Node.JS, GoLang, JavaScript, HTML, and CSS.
- Proficiency in SQL, Python, and Scala for data processing and analytics.
- Strong understanding of cloud platforms (AWS, GCP, or Azure) and their data services.
- Strong foundation and expertise in HLD and LLD, as well as design patterns, preferably using Spring Boot or Google Guice
- Experience in big data technologies such as Spark, Hadoop, Kafka, and distributed computing frameworks.
- Hands-on experience with data warehousing solutions such as Snowflake, Redshift, or BigQuery
- Deep knowledge of data governance, security, and compliance (GDPR, SOC2, etc.).
- Experience in NoSQL databases like Redis, Cassandra, MongoDB, and TiDB.
- Familiarity with automation and DevOps tools like Jenkins, Ansible, Docker, Kubernetes, Chef, Grafana, and ELK.
- Proven ability to drive technical strategy and align it with business objectives.
- Strong leadership, communication, and stakeholder management skills.
PREFERRED QUALIFICATIONS:
- Experience in machine learning infrastructure or MLOps is a plus.
- Exposure to real-time data processing and analytics.
- Interest in data structures, algorithm analysis and design, multicore programming, and scalable architecture.
- Prior experience in a SaaS or high-growth tech company.
Profile: Big Data Engineer (System Design)
Experience: 5+ years
Location: Bangalore
Work Mode: Hybrid
About the Role
We're looking for an experienced Big Data Engineer with system design expertise to architect and build scalable data pipelines and optimize big data solutions.
Key Responsibilities
- Design, develop, and maintain data pipelines and ETL processes using Python, Hive, and Spark
- Architect scalable big data solutions with strong system design principles
- Build and optimize workflows using Apache Airflow
- Implement data modeling, integration, and warehousing solutions
- Collaborate with cross-functional teams to deliver data solutions
Must-Have Skills
- 5+ years as a Data Engineer with Python, Hive, and Spark
- Strong hands-on experience with Java
- Advanced SQL and Hadoop experience
- Expertise in Apache Airflow
- Strong understanding of data modeling, integration, and warehousing
- Experience with relational databases (PostgreSQL, MySQL)
- System design knowledge
- Excellent problem-solving and communication skills
Good to Have
- Docker and containerization experience
- Knowledge of Apache Beam, Apache Flink, or similar frameworks
- Cloud platform experience.
📢 DATA SOURCING & ANALYSIS EXPERT (L3 Support) – Mumbai 📢
Are you ready to supercharge your Data Engineering career in the financial domain?
We’re seeking a seasoned professional (5–7 years experience) to join our Mumbai team and lead in data sourcing, modelling, and analysis. If you’re passionate about solving complex challenges in Relational & Big Data ecosystems, this role is for you.
What You’ll Be Doing
- Translate business needs into robust data models, program specs, and solutions
- Perform advanced SQL optimization, query tuning, and L3-level issue resolution
- Work across the entire data stack: ETL, Python / Spark, Autosys, and related systems
- Debug, monitor, and improve data pipelines in production
- Collaborate with business, analytics, and engineering teams to deliver dependable data services
What You Should Bring
- 5+ years in financial / fintech / capital markets environment
- Proven expertise in relational databases and big data technologies
- Strong command over SQL tuning, query optimization, indexing, partitioning
- Hands-on experience with ETL pipelines, Spark / PySpark, Python scripting, job scheduling (e.g. Autosys)
- Ability to troubleshoot issues at the L3 level, root cause analysis, performance tuning
- Good communication skills — you’ll coordinate with business users, analytics, and tech teams
What You’ll Be Doing:
● Design and build parts of our data pipeline architecture for extraction, transformation, and loading of data from a wide variety of data sources using the latest Big Data technologies.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Work with machine learning, data, and analytics experts to drive innovation, accuracy and greater functionality in our data system.
Qualifications:
● Bachelor's degree in Engineering, Computer Science, or relevant field.
● 10+ years of relevant and recent experience in a Data Engineer role.
● 5+ years recent experience with Apache Spark and solid understanding of the fundamentals.
● Deep understanding of Big Data concepts and distributed systems.
● Strong coding skills with Scala, Python, Java and/or other languages and the ability to quickly switch between them with ease.
● Advanced working SQL knowledge and experience working with a variety of relational databases such as Postgres and/or MySQL.
● Cloud experience with Databricks
● Experience working with data stored in many formats including Delta Tables, Parquet, CSV and JSON.
● Comfortable working in a linux shell environment and writing scripts as needed.
● Comfortable working in an Agile environment
● Machine Learning knowledge is a plus.
● Must be capable of working independently and delivering stable, efficient and reliable software.
● Excellent written and verbal communication skills in English.
● Experience supporting and working with cross-functional teams in a dynamic environment.
REPORTING: This position will report to our CEO or any other Lead as assigned by Management.
EMPLOYMENT TYPE: Full-Time, Permanent
LOCATION: Remote
SHIFT TIMINGS: 2:00 PM - 11:00 PM IST
Technical Architect (Databricks)
- 10+ Years Data Engineering Experience with expertise in Databricks
- 3+ years of consulting experience
- Completed Data Engineering Professional certification & required classes
- Minimum 2-3 projects delivered with hands-on experience in Databricks
- Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
- Experience in Spark and/or Hadoop, Flink, Presto, other popular big data engines
- Familiarity with Databricks multi-hop pipeline architecture
Sr. Data Engineer (Databricks)
- 5+ Years Data Engineering Experience with expertise in Databricks
- Completed Data Engineering Associate certification & required classes
- Minimum 1 project delivered with hands-on experience in development on Databricks
- Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
- SQL delivery experience, and familiarity with BigQuery, Synapse, or Redshift
- Proficient in Python, with knowledge of additional Databricks programming languages (Scala)
We’re Hiring: Senior Data Engineer | Remote (Pan India)
Are you passionate about building scalable data pipelines and optimizing data architecture? We’re looking for an experienced Senior Data Engineer (10+ years) to join our team and play a key role in shaping next-gen data systems.
What you’ll do:
✅ Design & develop robust data pipelines (ETL) using the latest Big Data tech
✅ Optimize infrastructure & automate processes for scalability
✅ Collaborate with cross-functional teams (Product, Data, Design, ML)
✅ Work with modern tools: Apache Spark, Databricks, SQL, Python/Scala/Java
Scala is mandatory
What we’re looking for:
🔹 Strong expertise in Big Data, Spark & distributed systems
🔹 Hands-on with SQL, relational DBs (Postgres/MySQL), Linux scripting
🔹 Experience with Delta Tables, Parquet, CSV, JSON
🔹 Cloud & Databricks exposure
🔹 Bonus: Machine Learning knowledge
📍 Location: Remote (Pan India)
⏰ Shift: 2:00 pm – 11:00 pm IST
💼 Type: Full-time, Permanent
We are hiring freelancers to work on advanced Data & AI projects using Databricks. If you are passionate about cloud platforms, machine learning, data engineering, or architecture, and want to work with cutting-edge tools on real-world challenges, this is the opportunity for you!
✅ Key Details
- Work Type: Freelance / Contract
- Location: Remote
- Time Zones: IST / EST only
- Domain: Data & AI, Cloud, Big Data, Machine Learning
- Collaboration: Work with industry leaders on innovative projects
🔹 Open Roles
1. Databricks – Senior Consultant
- Skills: Data Warehousing, Python, Java, Scala, ETL, SQL, AWS, GCP, Azure
- Experience: 6+ years
2. Databricks – ML Engineer
- Skills: CI/CD, MLOps, Machine Learning, Spark, Hadoop
- Experience: 4+ years
3. Databricks – Solution Architect
- Skills: Azure, GCP, AWS, CI/CD, MLOps
- Experience: 7+ years
4. Databricks – Solution Consultant
- Skills: SQL, Spark, BigQuery, Python, Scala
- Experience: 2+ years
✅ What We Offer
- Opportunity to work with top-tier professionals and clients
- Exposure to cutting-edge technologies and real-world data challenges
- Flexible remote work environment aligned with IST / EST time zones
- Competitive compensation and growth opportunities
📌 Skills We Value
Cloud Computing | Data Warehousing | Python | Java | Scala | ETL | SQL | AWS | GCP | Azure | CI/CD | MLOps | Machine Learning | Spark
What You’ll Be Doing:
● Design and build parts of our data pipeline architecture for extraction, transformation, and loading of data from a wide variety of data sources using the latest Big Data technologies.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Work with machine learning, data, and analytics experts to drive innovation, accuracy and greater functionality in our data system.
Qualifications:
● Bachelor's degree in Engineering, Computer Science, or relevant field.
● 10+ years of relevant and recent experience in a Data Engineer role.
● 5+ years recent experience with Apache Spark and solid understanding of the fundamentals.
● Deep understanding of Big Data concepts and distributed systems.
● Strong coding skills with Scala, Python, Java and/or other languages and the ability to quickly switch between them with ease.
● Advanced working SQL knowledge and experience working with a variety of relational databases such as Postgres and/or MySQL.
● Cloud experience with Databricks
● Experience working with data stored in many formats including Delta Tables, Parquet, CSV and JSON.
● Comfortable working in a linux shell environment and writing scripts as needed.
● Comfortable working in an Agile environment
● Machine Learning knowledge is a plus.
● Must be capable of working independently and delivering stable, efficient and reliable software.
● Excellent written and verbal communication skills in English.
● Experience supporting and working with cross-functional teams in a dynamic environment
EMPLOYMENT TYPE: Full-Time, Permanent
LOCATION: Remote (Pan India)
SHIFT TIMINGS: 2:00 PM - 11:00 PM IST
POSITION:
Senior Data Engineer
The Senior Data Engineer will be responsible for building and extending our data pipeline architecture, as well as optimizing data flow and collection for cross functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys working with big data and building systems from the ground up.
You will collaborate with our software engineers, database architects, data analysts and data scientists to ensure our data delivery architecture is consistent throughout the platform. You must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives.
What You’ll Be Doing:
● Design and build parts of our data pipeline architecture for extraction, transformation, and loading of data from a wide variety of data sources using the latest Big Data technologies.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Work with machine learning, data, and analytics experts to drive innovation, accuracy and greater functionality in our data system.
Qualifications:
● Bachelor's degree in Engineering, Computer Science, or relevant field.
● 10+ years of relevant and recent experience in a Data Engineer role.
● 5+ years recent experience with Apache Spark and solid understanding of the fundamentals.
● Deep understanding of Big Data concepts and distributed systems.
● Strong coding skills with Scala, Python, Java and/or other languages and the ability to quickly switch between them with ease.
● Advanced working SQL knowledge and experience working with a variety of relational databases such as Postgres and/or MySQL.
● Cloud experience with Databricks
● Experience working with data stored in many formats including Delta Tables, Parquet, CSV and JSON.
● Comfortable working in a linux shell environment and writing scripts as needed.
● Comfortable working in an Agile environment
● Machine Learning knowledge is a plus.
● Must be capable of working independently and delivering stable, efficient and reliable software.
● Excellent written and verbal communication skills in English.
● Experience supporting and working with cross-functional teams in a dynamic environment.
REPORTING: This position will report to our CEO or any other Lead as assigned by Management.
EMPLOYMENT TYPE: Full-Time, Permanent
LOCATION: Remote (Pan India)
SHIFT TIMINGS: 2:00 PM - 11:00 PM IST
WHO WE ARE:
SalesIntel is the top revenue intelligence platform on the market. Our combination of automation and researchers allows us to reach 95% data accuracy for all our published contact data, while continuing to scale up our number of contacts. We currently have more than 5 million human-verified contacts, another 70 million plus machine-processed contacts, and the highest number of direct dial contacts in the industry. We guarantee our accuracy with our well-trained research team that re-verifies every direct dial number, email, and contact every 90 days. With the most comprehensive contact and company data and our excellent customer service, SalesIntel has the best B2B data available. For more information, please visit www.salesintel.io
WHAT WE OFFER: SalesIntel’s workplace is all about diversity. Different countries and cultures are represented in our workforce. We are growing at a fast pace and our work environment is constantly evolving with changing times. We motivate our team to better themselves by offering all the good stuff you’d expect like Holidays, Paid Leaves, Bonuses, Incentives, Medical Policy and company paid Training Programs.
SalesIntel is an Equal Opportunity Employer. We prohibit discrimination and harassment of any type and offer equal employment opportunities to employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability status, genetic information, protected veteran status, or any other characteristic protected by law.
Role & Responsibilities
Responsible for ensuring that the architecture and design of the platform remains top-notch with respect to scalability, availability, reliability and maintainability
Act as a key technical contributor as well as a hands-on contributing member of the team.
Own end-to-end availability and performance of features, driving rapid product innovation while ensuring a reliable service.
Working closely with various stakeholders like Program Managers, Product Managers, the Reliability and Continuity Engineering (RCE) team, and the QE team to estimate and execute features/tasks independently.
Maintain and drive tech backlog execution for non-functional requirements of the platform required to keep the platform resilient
Assist in release planning and prioritization based on technical feasibility and engineering constraints
A zeal to continually find new ways to improve architecture, design and ensure timely delivery and high quality.
Azure DE
Primary Responsibilities -
- Create and maintain data storage solutions including Azure SQL Database, Azure Data Lake, and Azure Blob Storage.
- Design, implement, and maintain data pipelines for data ingestion, processing, and transformation in Azure
- Create data models for analytics purposes
- Utilizing Azure Data Factory or comparable technologies, create and maintain ETL (Extract, Transform, Load) operations
- Use Azure Data Factory and Databricks to assemble large, complex data sets
- Implement data validation and cleansing procedures to ensure the quality, integrity, and dependability of the data.
- Ensure data security and compliance
- Collaborate with data engineers, and other stakeholders to understand requirements and translate them into scalable and reliable data platform architectures
Required skills:
- Blend of technical expertise, analytical problem-solving, and collaboration with cross-functional teams
- Azure DevOps
- Apache Spark, Python
- SQL proficiency
- Azure Databricks knowledge
- Big data technologies
The data engineers should be well versed in coding, Spark core, and data ingestion using Azure, and should have decent communication skills. They should also have core Azure DE skills and coding skills (PySpark, Python, and SQL).
Of the 7 open positions, 5 of the Azure Data Engineers should have a minimum of 5 years of relevant data engineering experience.
Role & Responsibilities
About the Role:
We are seeking a highly skilled Senior Data Engineer with 5-7 years of experience to join our dynamic team. The ideal candidate will have a strong background in data engineering, with expertise in data warehouse architecture, data modeling, ETL processes, and building both batch and streaming pipelines. The candidate should also possess advanced proficiency in Spark, Databricks, Kafka, Python, SQL, and Change Data Capture (CDC) methodologies.
Key responsibilities:
Design, develop, and maintain robust data warehouse solutions to support the organization's analytical and reporting needs.
Implement efficient data modeling techniques to optimize performance and scalability of data systems.
Build and manage data lakehouse infrastructure, ensuring reliability, availability, and security of data assets.
Develop and maintain ETL pipelines to ingest, transform, and load data from various sources into the data warehouse and data lakehouse.
Utilize Spark and Databricks to process large-scale datasets efficiently and in real-time.
Implement Kafka for building real-time streaming pipelines and ensure data consistency and reliability.
Design and develop batch pipelines for scheduled data processing tasks.
Collaborate with cross-functional teams to gather requirements, understand data needs, and deliver effective data solutions.
Perform data analysis and troubleshooting to identify and resolve data quality issues and performance bottlenecks.
Stay updated with the latest technologies and industry trends in data engineering and contribute to continuous improvement initiatives.
Role & Responsibilities
Lead and mentor a team of data engineers, ensuring high performance and career growth.
Architect and optimize scalable data infrastructure, ensuring high availability and reliability.
Drive the development and implementation of data governance frameworks and best practices.
Work closely with cross-functional teams to define and execute a data roadmap.
Optimize data processing workflows for performance and cost efficiency.
Ensure data security, compliance, and quality across all data platforms.
Foster a culture of innovation and technical excellence within the data team.
Job Summary:
Seeking an experienced Senior Data Engineer to lead data ingestion, transformation, and optimization initiatives using the modern Apache and Azure data stack. The role involves working on scalable pipelines, large-scale distributed systems, and data lake management.
Core Responsibilities:
· Build and manage high-volume data pipelines using Spark/Databricks.
· Implement ELT frameworks using Azure Data Factory/Synapse Pipelines.
· Optimize large-scale datasets in Delta/Iceberg formats.
· Implement robust data quality, monitoring, and governance layers.
· Collaborate with Data Scientists, Analysts, and Business stakeholders.
Technical Stack:
· Big Data: Apache Spark, Kafka, Hive, Airflow, Hudi/Iceberg
· Cloud: Azure (Synapse, ADF, ADLS Gen2), Databricks, AWS (Glue/S3)
· Languages: Python, Scala, SQL
· Storage Formats: Delta Lake, Iceberg, Parquet, ORC
· CI/CD: Azure DevOps, Terraform (infra as code), Git
Senior Data Engineer (Apache Stack + Databricks/Synapse)
Share cv to
Thirega@ vysystems dot com - WhatsApp - 91Five0033Five2Three
Dear Candidate,
We are urgently looking for a Release / Big Data Engineer for our Pune location.
Experience : 5-8 yrs
Location : Pune
Skills: Big Data Engineer, Release Engineer, DevOps, AWS/Azure/GCP cloud experience
JD:
- Oversee the end-to-end release lifecycle, from planning to post-production monitoring. Coordinate with cross-functional teams (DBA, BizOps, DevOps, DNS).
- Partner with development teams to resolve technical challenges in deployment and automation test runs
- Work with shared services DBA teams for schema-based multi-tenancy designs and smooth migrations.
- Drive automation for batch deployments and DR exercises; YAML-based microservice deployment using shell/Python/Go (a small Python sketch follows this list).
- Provide oversight for Big Data toolsets for deployment (e.g., Spark, Hive, HBase) in private cloud and public cloud CDP environments
- Ensure high-quality releases with a focus on stability and long-term performance.
- Run the automation batch scripts, debug deployment and functional issues, and work with dev leads to resolve release-cycle issues.
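As a hedged illustration of the "YAML-based microservice deployment using Python" point above (not this team's actual release tooling), a small script can patch the image tag in a Deployment manifest and hand it to kubectl; the file path, container layout, and image name are assumptions:

```python
"""Sketch: patch the image tag in a Kubernetes Deployment YAML and apply it."""
import subprocess
import sys

import yaml  # PyYAML

def set_image(manifest_path: str, new_image: str) -> str:
    """Return the manifest YAML with every container image replaced."""
    with open(manifest_path) as fh:
        doc = yaml.safe_load(fh)
    for container in doc["spec"]["template"]["spec"]["containers"]:
        container["image"] = new_image  # e.g. registry.example.com/app:1.4.2
    return yaml.safe_dump(doc)

def apply(patched_yaml: str) -> None:
    # Equivalent to: kubectl apply -f -   (manifest piped on stdin)
    subprocess.run(["kubectl", "apply", "-f", "-"], input=patched_yaml.encode(), check=True)

if __name__ == "__main__":
    manifest, image = sys.argv[1], sys.argv[2]  # e.g. deploy/app.yaml registry.example.com/app:1.4.2
    apply(set_image(manifest, image))
```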
Regards,
Minakshi Soni
Executive- Talent Acquisition
Rigel Networks
Position : Software Engineer (Java Backend Engineer)
Experience : 4+ Years
📍 Location : Bangalore, India (Hybrid)
Mandatory Skills : Java 8+ (Advanced Features), Spring Boot, Apache Spark (Spark Streaming), SQL & Cosmos DB, Git, Maven, CI/CD (Jenkins, GitHub), Azure Cloud, Agile Scrum.
About the Role :
We are seeking a highly skilled Backend Engineer with expertise in Java, Spark, and microservices architecture to join our dynamic team. The ideal candidate will have a strong background in object-oriented programming, experience with Spark Streaming, and a deep understanding of distributed systems and cloud technologies.
Key Responsibilities :
- Design, develop, and maintain highly scalable microservices and optimized RESTful APIs using Spring Boot and Java 8+.
- Implement and optimize Spark Streaming applications for real-time data processing.
- Utilize advanced Java 8 features, including:
- Functional interfaces & Lambda expressions
- Streams and Parallel Streams
- Completable Futures & Concurrency API improvements
- Enhanced Collections APIs
- Work with relational (SQL) and NoSQL (Cosmos DB) databases, ensuring efficient data modeling and retrieval.
- Develop and manage CI/CD pipelines using Jenkins, GitHub, and related automation tools.
- Collaborate with cross-functional teams, including Product, Business, and Automation, to deliver end-to-end product features.
- Ensure adherence to Agile Scrum practices and participate in code reviews to maintain high-quality standards.
- Deploy and manage applications in Azure Cloud environments.
Minimum Qualifications:
- BS/MS in Computer Science or a related field.
- 4+ Years of experience developing backend applications with Spring Boot and Java 8+.
- 3+ Years of hands-on experience with Git for version control.
- Strong understanding of software design patterns and distributed computing principles.
- Experience with Maven for building and deploying artifacts.
- Proven ability to work in Agile Scrum environments with a collaborative team mindset.
- Prior experience with Azure Cloud Technologies.
Responsibilities:
• Build customer-facing solutions for the Data Observability product to monitor data pipelines
• Work on POCs to build new data pipeline monitoring capabilities.
• Building next-generation scalable, reliable, flexible, high-performance data pipeline capabilities for ingestion of data from multiple sources containing complex dataset.
• Continuously improve services you own, making them more performant, and utilising resources in the most optimised way.
• Collaborate closely with engineering, data science team and product team to propose an optimal solution for a given problem statement
• Working closely with DevOps team on performance monitoring and MLOps
Required Skills:
• 3+ Years of Data related technology experience.
• Good understanding of distributed computing principles
• Experience in Apache Spark
• Hands on programming with Python
• Knowledge of Hadoop v2, MapReduce, HDFS
• Experience with building stream-processing systems, using technologies such as Apache Storm, Spark-Streaming or Flink
• Experience with messaging systems, such as Kafka or RabbitMQ
• Good understanding of Big Data querying tools, such as Hive
• Experience with integration of data from multiple data sources
• Good understanding of SQL queries, joins, stored procedures, relational schemas
• Experience with NoSQL databases, such as HBase, Cassandra/Scylla, MongoDB
• Knowledge of ETL techniques and frameworks
• Performance tuning of Spark jobs (a short tuning sketch follows this list)
• General understanding of Data Quality is a plus point
• Experience with Databricks, Snowflake, and BigQuery or similar lakehouses would be a big plus
• Nice to have some knowledge in DevOps
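As a hedged illustration of the Spark performance-tuning skill above (the values are assumptions, not recommendations for any specific workload), a PySpark session can expose the knobs most often adjusted while profiling slow jobs:

```python
"""Sketch: a PySpark session with common tuning settings, then a plan inspection."""
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    .config("spark.sql.adaptive.enabled", "true")               # let AQE coalesce shuffle partitions
    .config("spark.sql.shuffle.partitions", "400")              # sized to data volume, not the default 200
    .config("spark.sql.autoBroadcastJoinThreshold", str(64 * 1024 * 1024))  # broadcast small dims (64 MB)
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

# Inspect the physical plan before/after changes to confirm the effect of each setting.
df = spark.range(0, 1_000_000).withColumnRenamed("id", "key")
df.groupBy((df.key % 100).alias("bucket")).count().explain(mode="formatted")
```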
We are actively seeking a self-motivated Data Engineer with expertise in Azure cloud and Databricks, with a thorough understanding of Delta Lake and Lake-house Architecture. The ideal candidate should excel in developing scalable data solutions, crafting platform tools, and integrating systems, while demonstrating proficiency in cloud-native database solutions and distributed data processing.
Key Responsibilities:
- Contribute to the development and upkeep of a scalable data platform, incorporating tools and frameworks that leverage Azure and Databricks capabilities.
- Exhibit proficiency in various RDBMS databases such as MySQL and SQL-Server, emphasizing their integration in applications and pipeline development.
- Design and maintain high-caliber code, including data pipelines and applications, utilizing Python, Scala, and PHP.
- Implement effective data processing solutions via Apache Spark, optimizing Spark applications for large-scale data handling.
- Optimize data storage using formats like Parquet and Delta Lake to ensure efficient data accessibility and reliable performance (a short sketch follows this list).
- Demonstrate understanding of Hive Metastore, Unity Catalog Metastore, and the operational dynamics of external tables.
- Collaborate with diverse teams to convert business requirements into precise technical specifications.
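A minimal sketch of the Parquet/Delta optimization bullet above, assuming a Databricks (or delta-enabled) Spark session; the paths and columns are placeholders, and the OPTIMIZE/ZORDER step uses Databricks SQL syntax:

```python
"""Sketch: land raw JSON as a date-partitioned Delta table, then compact it."""
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("delta-sketch").getOrCreate()

raw = (
    spark.read.format("json")
    .load("/mnt/raw/orders/")                      # placeholder landing path
    .withColumn("order_date", F.to_date("order_ts"))  # assumes an order_ts column exists
)

# Partition by date and write as Delta so readers get file skipping plus ACID history.
(
    raw.write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("/mnt/curated/orders_delta")             # placeholder curated path
)

# Compact the small files produced by incremental loads (Databricks OPTIMIZE/ZORDER).
spark.sql("OPTIMIZE delta.`/mnt/curated/orders_delta` ZORDER BY (customer_id)")
```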
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or a related discipline.
- Demonstrated hands-on experience with Azure cloud services and Databricks.
- Proficient programming skills in Python, Scala, and PHP.
- In-depth knowledge of SQL, NoSQL databases, and data warehousing principles.
- Familiarity with distributed data processing and external table management.
- Insight into enterprise data solutions for PIM, CDP, MDM, and ERP applications.
- Exceptional problem-solving acumen and meticulous attention to detail.
Additional Qualifications :
- Acquaintance with data security and privacy standards.
- Experience in CI/CD pipelines and version control systems, notably Git.
- Familiarity with Agile methodologies and DevOps practices.
- Competence in technical writing for comprehensive documentation.
Publicis Sapient Overview:
As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.
Job Summary:
As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.
The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration, and wrangling, computation and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is also required.
Role & Responsibilities:
Your role is focused on Design, Development and delivery of solutions involving:
• Data Integration, Processing & Governance
• Data Storage and Computation Frameworks, Performance Optimizations
• Analytics & Visualizations
• Infrastructure & Cloud Computing
• Data Management Platforms
• Implement scalable architectural models for data processing and storage
• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode (see the streaming sketch after this list)
• Build functionality for data analytics, search and aggregation
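As an illustration of the batch and real-time ingestion described above, the following is a minimal sketch using Spark Structured Streaming. It assumes the spark-sql-kafka connector is available on the classpath; the broker, topic, bucket and checkpoint paths are placeholders.

```python
# Minimal sketch of batch + real-time ingestion with Spark.
# Assumes spark-sql-kafka connector; all locations are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ingestion-sketch").getOrCreate()

# Batch path: one-off load from a heterogeneous source (CSV here).
batch_df = spark.read.option("header", True).csv("s3a://raw-zone/customers/")
batch_df.write.mode("overwrite").parquet("s3a://lake/customers/")

# Real-time path: continuous ingestion from Kafka into the same lake.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "clickstream")
          .load()
          .select(F.col("value").cast("string").alias("payload"),
                  F.col("timestamp")))

query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://lake/clickstream/")
         .option("checkpointLocation", "s3a://lake/_checkpoints/clickstream/")
         .trigger(processingTime="1 minute")
         .start())
```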
Experience Guidelines:
Mandatory Experience and Competencies:
# Competency
1. Overall 5+ years of IT experience, with 3+ years in data-related technologies
2. Minimum 2.5 years of experience in Big Data technologies and working exposure to at least one cloud platform and its related data services (AWS / Azure / GCP)
3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines
4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable
5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.
6. Well-versed, working knowledge of data platform related services on at least one cloud platform, IAM, and data security
Preferred Experience and Knowledge (Good to Have):
# Competency
1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands-on experience
2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.
3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures
4. Performance tuning and optimization of data pipelines
5. CI/CD – infra provisioning on cloud, automated build & deployment pipelines, code quality
6. Cloud data specialty and other related Big Data technology certifications
Personal Attributes:
• Strong written and verbal communication skills
• Articulation skills
• Good team player
• Self-starter who requires minimal oversight
• Ability to prioritize and manage multiple tasks
• Process orientation and the ability to define and set up processes
Data Engineering : Senior Engineer / Manager
As a Senior Engineer / Manager in Data Engineering, you will translate client requirements into technical designs and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing packaged solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.
Must Have skills :
1. GCP
2. Spark Streaming: live data streaming experience is desired.
3. Any one coding language: Java / Python / Scala
Skills & Experience :
- Overall experience of minimum 5+ years, with a minimum of 4 years of relevant experience in Big Data technologies
- Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.
- Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable
- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.
- Well-versed, working knowledge of data platform related services on GCP (see the sketch after this list)
- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position
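To make the GCP data-platform requirement concrete, here is a hedged sketch of a Spark batch job (for example on Dataproc) reading from and writing back to BigQuery. It assumes the spark-bigquery connector is installed on the cluster; the project, dataset and staging bucket names are invented for illustration.

```python
# Minimal sketch: Spark on GCP reading/writing BigQuery via the spark-bigquery connector.
# Project, dataset and bucket names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("gcp-bq-sketch").getOrCreate()

trips = (spark.read.format("bigquery")
         .option("table", "my-project.analytics.trips")
         .load())

daily = (trips.groupBy(F.to_date("pickup_ts").alias("day"))
              .agg(F.count("*").alias("trip_count")))

(daily.write.format("bigquery")
      .option("table", "my-project.analytics.daily_trip_counts")
      .option("temporaryGcsBucket", "my-staging-bucket")   # staging bucket required for writes
      .mode("overwrite")
      .save())
```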
Your Impact :
- Data Ingestion, Integration and Transformation
- Data Storage and Computation Frameworks, Performance Optimizations
- Analytics & Visualizations
- Infrastructure & Cloud Computing
- Data Management Platforms
- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time
- Build functionality for data analytics, search and aggregation
🚀 Exciting Opportunity: Data Engineer Position in Gurugram 🌐
Hello
We are actively seeking a talented and experienced Data Engineer to join our dynamic team at Reality Motivational Venture in Gurugram (Gurgaon). If you're passionate about data, thrive in a collaborative environment, and possess the skills we're looking for, we want to hear from you!
Position: Data Engineer
Location: Gurugram (Gurgaon)
Experience: 5+ years
Key Skills:
- Python
- Spark, Pyspark
- Data Governance
- Cloud (AWS/Azure/GCP)
Main Responsibilities:
- Define and set up analytics environments for "Big Data" applications in collaboration with domain experts.
- Implement ETL processes for telemetry-based and stationary test data (a brief sketch follows this list).
- Support in defining data governance, including data lifecycle management.
- Develop large-scale data processing engines and real-time search and analytics based on time series data.
- Ensure technical, methodological, and quality aspects.
- Support CI/CD processes.
- Foster know-how development and transfer, continuous improvement of leading technologies within Data Engineering.
- Collaborate with solution architects on the development of complex on-premise, hybrid, and cloud solution architectures.
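As a rough illustration of the telemetry ETL mentioned above, the sketch below parses raw events and aggregates sensor readings into fixed time windows with Spark. The paths, column names and 5-minute window size are assumptions, not project specifics.

```python
# Minimal sketch: windowed aggregation of telemetry time-series data.
# Paths and columns are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("telemetry-etl-sketch").getOrCreate()

telemetry = (spark.read.json("s3a://raw/telemetry/")            # or an ADLS/GCS path
             .withColumn("event_time", F.to_timestamp("event_time")))

windowed = (telemetry
            .groupBy("vehicle_id", F.window("event_time", "5 minutes"))
            .agg(F.avg("speed_kmh").alias("avg_speed"),
                 F.max("engine_temp_c").alias("max_engine_temp")))

windowed.write.mode("append").parquet("s3a://curated/telemetry_5min/")
```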
Qualification Requirements:
- BSc, MSc, MEng, or PhD in Computer Science, Informatics/Telematics, Mathematics/Statistics, or a comparable engineering degree.
- Proficiency in Python and the PyData stack (Pandas/Numpy).
- Experience in high-level programming languages (C#/C++/Java).
- Familiarity with scalable processing environments like Dask (or Spark).
- Proficient in Linux and scripting languages (Bash Scripts).
- Experience in containerization and orchestration of containerized services (Kubernetes).
- Education in database technologies (SQL/OLAP and NoSQL).
- Interest in Big Data storage technologies (Elastic, ClickHouse).
- Familiarity with Cloud technologies (Azure, AWS, GCP).
- Fluent English communication skills (speaking and writing).
- Ability to work constructively with a global team.
- Willingness to travel for business trips during development projects.
Preferable:
- Working knowledge of vehicle architectures, communication, and components.
- Experience in additional programming languages (C#/C++/Java, R, Scala, MATLAB).
- Experience in time-series processing.
How to Apply:
Interested candidates, please share your updated CV/resume with me.
Thank you for considering this exciting opportunity.
About Us:
6sense is a Predictive Intelligence Engine that is reimagining how B2B companies do
sales and marketing. It works with big data at scale, advanced machine learning and
predictive modelling to find buyers and predict what they will purchase, when and
how much.
6sense helps B2B marketing and sales organizations fully understand the complex ABM
buyer journey. By combining intent signals from every channel with the industry’s most
advanced AI predictive capabilities, it is finally possible to predict account demand and
optimize demand generation in an ABM world. Equipped with the power of AI and the
6sense Demand PlatformTM, marketing and sales professionals can uncover, prioritize,
and engage buyers to drive more revenue.
6sense is seeking a Staff Software Engineer (Data) to become part of a team designing, developing, and deploying its customer-centric applications.
We’ve more than doubled our revenue in the past five years and completed our Series
E funding of $200M last year, giving us a stable foundation for growth.
Responsibilities:
1. Own critical datasets and data pipelines for product & business, and work
towards direct business goals of increased data coverage, data match rates, data
quality, data freshness
2. Create more value from various datasets with creative solutions, unlock more value from existing data, and help build a data moat for the company
3. Design, develop, test, deploy and maintain optimal data pipelines, and assemble large, complex data sets that meet functional and non-functional business requirements
4. Improve our current data pipelines, i.e. improve their performance and SLAs, remove redundancies, and establish a way to test before vs. after rollout
5. Identify, design, and implement process improvements in data flow across multiple stages, in collaboration with multiple cross-functional teams, e.g. automating manual processes, optimising data delivery, hand-off processes, etc.
6. Work with cross-functional stakeholders including the Product, Data Analytics, and Customer Support teams to enable data access and related goals
7. Build for security, privacy, scalability, reliability and compliance
8. Mentor and coach other team members on scalable and extensible solutions
design, and best coding standards
9. Help build a team and cultivate innovation by driving cross-collaboration and
execution of projects across multiple teams
Requirements:
8-10+ years of overall work experience as a Data Engineer
Excellent analytical and problem-solving skills
Strong experience with Big Data technologies like Apache Spark. Experience with Hadoop, Hive, and Presto would be a plus
Strong experience in writing complex, optimized SQL queries across large data
sets. Experience with optimizing queries and underlying storage
Experience with Python/ Scala
Experience with Apache Airflow or other orchestration tools (a minimal DAG sketch follows this list)
Experience with writing Hive / Presto UDFs in Java
Experience working on AWS cloud platform and services.
Experience with Key Value stores or NoSQL databases would be a plus.
Comfortable with Unix / Linux command line
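For the orchestration requirement above, here is a minimal, hedged Airflow DAG sketch that submits a nightly Spark job followed by a quality check. The DAG id, commands, paths and schedule are placeholders, and the operator choice (BashOperator vs. a Spark provider operator) would depend on the actual deployment.

```python
# Minimal sketch of an Airflow 2.x DAG orchestrating a nightly Spark job.
# Commands, paths and schedule are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="nightly_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    transform = BashOperator(
        task_id="spark_transform",
        bash_command=(
            "spark-submit --master yarn "
            "s3://jobs/transform_events.py --date {{ ds }}"   # {{ ds }} = logical run date
        ),
    )

    quality_check = BashOperator(
        task_id="row_count_check",
        bash_command="python /opt/checks/row_count.py --date {{ ds }}",
    )

    transform >> quality_check   # run the check only after the transform succeeds
```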
Interpersonal Skills:
You can work independently as well as part of a team.
You take ownership of projects and drive them to conclusion.
You’re a good communicator and are capable of not just doing the work, but also
teaching others and explaining the “why” behind complicated technical
decisions.
You aren’t afraid to roll up your sleeves: This role will evolve over time, and we’ll
want you to evolve with it
Lead Data Engineer
Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.
Job responsibilities
· You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems
· You will partner with teammates to create complex data processing pipelines in order to solve our clients' most ambitious challenges
· You will collaborate with Data Scientists in order to design scalable implementations of their models
· You will pair to write clean and iterative code based on TDD
· Leverage various continuous delivery practices to deploy, support and operate data pipelines
· Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available
· Develop and operate modern data architecture approaches to meet key business objectives and provide end-to-end data solutions
· Create data models and speak to the tradeoffs of different modeling approaches
· On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product
· Seamlessly incorporate data quality into your day-to-day work as well as into the delivery process
· Assure effective collaboration between Thoughtworks' and the client's teams, encouraging open communication and advocating for shared outcomes
Job qualifications
Technical skills
· You are equally happy coding and leading a team to implement a solution
· You have a track record of innovation and expertise in Data Engineering
· You're passionate about craftsmanship and have applied your expertise across a range of industries and organizations
· You have a deep understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop
· You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, NoSQL databases (Hbase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
· Hands-on experience with MapR, Cloudera, Hortonworks and/or cloud-based Hadoop distributions (AWS EMR, Azure HDInsight, Qubole, etc.)
· You are comfortable taking data-driven approaches and applying data security strategy to solve business problems
· You're genuinely excited about data infrastructure and operations with a familiarity working in cloud environments
· Working with data excites you: you have created Big data architecture, you can build and operate data pipelines, and maintain data storage, all within distributed systems
Professional skills
· Advocate your data engineering expertise to the broader tech community outside of Thoughtworks, speaking at conferences and acting as a mentor for more junior-level data engineers
· You're resilient and flexible in ambiguous situations and enjoy solving problems from technical and business perspectives
· An interest in coaching others, sharing your experience and knowledge with teammates
· You enjoy influencing others and always advocate for technical excellence while being open to change when needed
TOP 3 SKILLS
Python (Language)
Spark Framework
Spark Streaming
Docker / Jenkins / Spinnaker
AWS
Hive Queries
The candidate should be a good coder.
Preferred: Airflow
Must-have experience:
Python
Spark framework and streaming
Exposure to the machine learning lifecycle is mandatory.
Project:
This is a search-domain project. For any search activity happening on the website, this team creates the model – a sorting/scoring model for each search – which is built by the data scientists. The team works mostly on the streaming side of data, so the candidate would work extensively on Spark Streaming, and there will be a lot of work in Machine Learning.
INTERVIEW INFORMATION
3-4 rounds.
1st round: data engineering batch-processing experience.
2nd round: data engineering streaming experience.
3rd round: ML lifecycle (this can be a techno-functional round based on earlier feedback); otherwise a 4th, functional round will be held if required.
Roles and Responsibilities:
- Design, develop, and maintain the end-to-end MLOps infrastructure from the ground up, leveraging open-source systems across the entire MLOps landscape.
- Create pipelines for data ingestion and data transformation; build, test, and deploy machine learning models; and monitor and maintain the performance of these models in production (a brief tracking sketch follows this list).
- Managing the MLOps stack, including version control systems, continuous integration and deployment tools, containerization, orchestration, and monitoring systems.
- Ensure that the MLOps stack is scalable, reliable, and secure.
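As a small, hedged example of the model lifecycle work described above, the sketch below logs parameters, metrics and a model artifact with MLflow (one of the E2E MLOps tools listed under Primary Skills). The experiment name, model and data are toy placeholders; ClearML or Kubeflow would fill the same role with different APIs.

```python
# Minimal sketch of experiment tracking with MLflow on a toy regression model.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=500, n_features=10, noise=0.2, random_state=42)

mlflow.set_experiment("demand-forecast-sketch")   # hypothetical experiment name
with mlflow.start_run():
    model = Ridge(alpha=0.5).fit(X, y)
    mse = mean_squared_error(y, model.predict(X))
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("train_mse", mse)
    mlflow.sklearn.log_model(model, "model")      # model artifact stored with the run
```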
Skills Required:
- 3-6 years of MLOps experience
- Preferably worked in the startup ecosystem
Primary Skills:
- Experience with E2E MLOps systems like ClearML, Kubeflow, MLFlow etc.
- Technical expertise in MLOps: Should have a deep understanding of the MLOps landscape and be able to leverage open-source systems to build scalable, reliable, and secure MLOps infrastructure.
- Programming skills: Proficient in at least one programming language, such as Python, and have experience with data science libraries, such as TensorFlow, PyTorch, or Scikit-learn.
- DevOps experience: Should have experience with DevOps tools and practices, such as Git, Docker, Kubernetes, and Jenkins.
Secondary Skills:
- Version Control Systems (VCS) tools like Git and Subversion
- Containerization technologies like Docker and Kubernetes
- Cloud Platforms like AWS, Azure, and Google Cloud Platform
- Data Preparation and Management tools like Apache Spark, Apache Hadoop, and SQL databases like PostgreSQL and MySQL
- Machine Learning Frameworks like TensorFlow, PyTorch, and Scikit-learn
- Monitoring and Logging tools like Prometheus, Grafana, and Elasticsearch
- Continuous Integration and Continuous Deployment (CI/CD) tools like Jenkins, GitLab CI, and CircleCI
- Explainability and interpretability tools like LIME and SHAP
- Creating and managing ETL/ELT pipelines based on requirements
- Build Power BI dashboards and manage the datasets needed.
- Work with stakeholders to identify data structures needed for the future and perform any transformations, including aggregations.
- Build data cubes for real-time visualisation needs and CXO dashboards (a brief sketch follows the skills list below).
Required Tech Skills
- Microsoft PowerBI & DAX
- Python, Pandas, PyArrow, Jupyter Notebooks, Apache Spark
- Azure Synapse, Azure Databricks, Azure HDInsight, Azure Data Factory
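For illustration only, here is a hedged PySpark sketch of pre-aggregating a "data cube" that a Power BI or CXO dashboard could read from a lakehouse table. The storage path, dimensions, measures and target table name are assumptions.

```python
# Minimal sketch: build a pre-aggregated cube table for dashboard consumption.
# Path, columns and target table are placeholders; assumes the "gold" database exists.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cube-sketch").getOrCreate()

sales = spark.read.parquet("abfss://lake@account.dfs.core.windows.net/sales/")

cube = (sales.cube("region", "product_category", "channel")   # all grouping-set combinations
             .agg(F.sum("revenue").alias("total_revenue"),
                  F.countDistinct("order_id").alias("orders")))

cube.write.mode("overwrite").saveAsTable("gold.sales_cube")
```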
Have you streamed a program on Disney+, watched your favorite binge-worthy series on Peacock or cheered your favorite team on during the World Cup from one of the 20 top streaming platforms around the globe? If the answer is yes, you’ve already benefitted from Conviva technology, helping the world’s leading streaming publishers deliver exceptional streaming experiences and grow their businesses.
Conviva is the only global streaming analytics platform for big data that collects, standardizes, and puts trillions of cross-screen, streaming data points in context, in real time. The Conviva platform provides comprehensive, continuous, census-level measurement through real-time, server side sessionization at unprecedented scale. If this sounds important, it is! We measure a global footprint of more than 500 million unique viewers in 180 countries watching 220 billion streams per year across 3 billion applications streaming on devices. With Conviva, customers get a unique level of actionability and scale from continuous streaming measurement insights and benchmarking across every stream, every screen, every second.
What you get to do in this role:
Work on extremely high-scale Rust web services or backend systems.
Design and develop solutions for highly scalable web and backend systems.
Proactively identify and solve performance issues.
Maintain a high bar on code quality and unit testing.
What you bring to the role:
5+ years of hands-on software development experience.
At least 2 years of Rust development experience.
Knowledge of Cargo crates for Kafka, Redis, etc.
Strong CS fundamentals, including system design, data structures and algorithms.
Expertise in backend and web services development.
Good analytical and troubleshooting skills.
What will help you stand out:
Experience working with large scale web services and applications.
Exposure to Golang, Scala or Java
Exposure to Big data systems like Kafka, Spark, Hadoop etc.
Underpinning the Conviva platform is a rich history of innovation. More than 60 patents represent award-winning technologies and standards, including first-of-its kind-innovations like time-state analytics and AI-automated data modeling, that surfaces actionable insights. By understanding real-world human experiences and having the ability to act within seconds of observation, our customers can solve business-critical issues and focus on growing their business ahead of the competition. Examples of the brands Conviva has helped fuel streaming growth for include: DAZN, Disney+, HBO, Hulu, NBCUniversal, Paramount+, Peacock, Sky, Sling TV, Univision and Warner Bros Discovery.
Privately held, Conviva is headquartered in Silicon Valley, California with offices and people around the globe. For more information, visit us at www.conviva.com. Join us to help extend our leadership position in big data streaming analytics to new audiences and markets!
Have you streamed a program on Disney+, watched your favorite binge-worthy series on Peacock or cheered your favorite team on during the World Cup from one of the 20 top streaming platforms around the globe? If the answer is yes, you’ve already benefitted from Conviva technology, helping the world’s leading streaming publishers deliver exceptional streaming experiences and grow their businesses.
Conviva is the only global streaming analytics platform for big data that collects, standardizes, and puts trillions of cross-screen, streaming data points in context, in real time. The Conviva platform provides comprehensive, continuous, census-level measurement through real-time, server side sessionization at unprecedented scale. If this sounds important, it is! We measure a global footprint of more than 500 million unique viewers in 180 countries watching 220 billion streams per year across 3 billion applications streaming on devices. With Conviva, customers get a unique level of actionability and scale from continuous streaming measurement insights and benchmarking across every stream, every screen, every second.
As Conviva is expanding, we are building products providing deep insights into end user experience for our customers.
Platform and TLB Team
The vision for the TLB team is to build data processing software that works on terabytes of streaming data in real time. Engineer the next-gen Spark-like system for in-memory computation of large time-series datasets – both the Spark-like backend infrastructure and a library-based programming model. Build horizontally and vertically scalable systems that analyse trillions of events per day within sub-second latencies. Utilize the latest and greatest big data technologies to build solutions for use cases across multiple verticals. Lead technology innovation and advancement that will have big business impact for years to come. Be part of a worldwide team building software using the latest technologies and the best software development tools and processes.
What You’ll Do
This is an individual contributor position. Expectations will be on the below lines:
- Design, build and maintain the stream processing, and time-series analysis system which is at the heart of Conviva's products
- Responsible for the architecture of the Conviva platform
- Build features, enhancements, new services, and bug fixing in Scala and Java on a Jenkins-based pipeline to be deployed as Docker containers on Kubernetes
- Own the entire lifecycle of your microservice including early specs, design, technology choice, development, unit-testing, integration-testing, documentation, deployment, troubleshooting, enhancements etc.
- Lead a team to develop a feature or parts of the product
- Adhere to the Agile model of software development to plan, estimate, and ship per business priority
What you need to succeed
- 9+ years of work experience in software development of data processing products.
- Engineering degree in software or equivalent from a premier institute.
- Excellent knowledge of fundamentals of Computer Science like algorithms and data structures. Hands-on with functional programming and know-how of its concepts
- Excellent programming and debugging skills on the JVM. Proficient in writing code in Scala/Java/Rust/Haskell/Erlang that is reliable, maintainable, secure, and performant
- Experience with big data technologies like Spark, Flink, Kafka, Druid, HDFS, etc.
- Deep understanding of distributed systems concepts and scalability challenges including multi-threading, concurrency, sharding, partitioning, etc.
- Experience/knowledge of Akka/Lagom framework and/or stream processing technologies like RxJava or Project Reactor will be a big plus. Knowledge of design patterns like event-streaming, CQRS and DDD to build large microservice architectures will be a big plus
- Excellent communication skills. Willingness to work under pressure. Hunger to learn and succeed. Comfortable with ambiguity. Comfortable with complexity
Underpinning the Conviva platform is a rich history of innovation. More than 60 patents represent award-winning technologies and standards, including first-of-its kind-innovations like time-state analytics and AI-automated data modeling, that surfaces actionable insights. By understanding real-world human experiences and having the ability to act within seconds of observation, our customers can solve business-critical issues and focus on growing their businesses ahead of the competition. Examples of the brands Conviva has helped fuel streaming growth for include DAZN, Disney+, HBO, Hulu, NBCUniversal, Paramount+, Peacock, Sky, Sling TV, Univision, and Warner Bros Discovery.
Privately held, Conviva is headquartered in Silicon Valley, California with offices and people around the globe. For more information, visit us at www.conviva.com. Join us to help extend our leadership position in big data streaming analytics to new audiences and markets!
Job Title: Data Engineer
Job Summary: As a Data Engineer, you will be responsible for designing, building, and maintaining the infrastructure and tools necessary for data collection, storage, processing, and analysis. You will work closely with data scientists and analysts to ensure that data is available, accessible, and in a format that can be easily consumed for business insights.
Responsibilities:
- Design, build, and maintain data pipelines to collect, store, and process data from various sources.
- Create and manage data warehousing and data lake solutions.
- Develop and maintain data processing and data integration tools.
- Collaborate with data scientists and analysts to design and implement data models and algorithms for data analysis.
- Optimize and scale existing data infrastructure to ensure it meets the needs of the business.
- Ensure data quality and integrity across all data sources (a brief validation sketch follows this list).
- Develop and implement best practices for data governance, security, and privacy.
- Monitor data pipeline performance and errors, and troubleshoot issues as needed.
- Stay up-to-date with emerging data technologies and best practices.
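As a small, hedged example of the data-quality and monitoring responsibilities above, the sketch below gates a load on null-key and row-count checks in PySpark. The paths, column names and thresholds are invented for illustration; a real pipeline would read the previous count from a metadata store.

```python
# Minimal sketch: a data-quality gate before promoting data to the warehouse zone.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-sketch").getOrCreate()

incoming = spark.read.parquet("s3a://staging/customers/")
previous_count = 1_000_000                      # placeholder; normally from a metadata store

null_keys = incoming.filter(F.col("customer_id").isNull()).count()
row_count = incoming.count()

if null_keys > 0:
    raise ValueError(f"{null_keys} rows have a NULL customer_id")
if row_count < 0.9 * previous_count:
    raise ValueError(f"Row count {row_count} dropped more than 10% vs the previous load")

incoming.write.mode("overwrite").parquet("s3a://warehouse/customers/")
```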
Requirements:
Bachelor's degree in Computer Science, Information Systems, or a related field.
Experience with ETL tools like Matillion, SSIS, Informatica
Experience with SQL and relational databases such as SQL Server, MySQL, PostgreSQL, or Oracle.
Experience in writing complex SQL queries
Strong programming skills in languages such as Python, Java, or Scala.
Experience with data modeling, data warehousing, and data integration.
Strong problem-solving skills and ability to work independently.
Excellent communication and collaboration skills.
Familiarity with big data technologies such as Hadoop, Spark, or Kafka.
Familiarity with data warehouse/Data lake technologies like Snowflake or Databricks
Familiarity with cloud computing platforms such as AWS, Azure, or GCP.
Familiarity with Reporting tools
Teamwork/ growth contribution
- Helping the team in taking the Interviews and identifying right candidates
- Adhering to timelines
- On-time status communication and upfront communication of any risks
- Teach, train, and share knowledge with peers.
- Good Communication skills
- Proven abilities to take initiative and be innovative
- Analytical mind with a problem-solving aptitude
Good to have :
Master's degree in Computer Science, Information Systems, or a related field.
Experience with NoSQL databases such as MongoDB or Cassandra.
Familiarity with data visualization and business intelligence tools such as Tableau or Power BI.
Knowledge of machine learning and statistical modeling techniques.
If you are passionate about data and want to work with a dynamic team of data scientists and analysts, we encourage you to apply for this position.
About Kloud9:
Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.
Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. E-commerce in any industry is constrained by the significant finances spent on physical data infrastructure.
At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.
Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.
We are a cloud vendor that is both platform and technology independent. Our vendor independence not only provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions that best meet our clients' requirements.
What we are looking for:
● 3+ years’ experience developing Data & Analytic solutions
● Experience building data lake solutions leveraging one or more of the following: AWS EMR, S3, Hive & Spark
● Experience with relational SQL
● Experience with scripting languages such as Shell, Python
● Experience with source control tools such as GitHub and related dev process
● Experience with workflow scheduling tools such as Airflow
● In-depth knowledge of scalable cloud
● Has a passion for data solutions
● Strong understanding of data structures and algorithms
● Strong understanding of solution and technical design
● Has a strong problem-solving and analytical mindset
● Experience working with Agile Teams.
● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders
● Able to quickly pick up new programming languages, technologies, and frameworks
● Bachelor’s Degree in computer science
Why Explore a Career at Kloud9:
With job opportunities in prime locations of the US, London, Poland and Bengaluru, we help build your career path in cutting-edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.
Data Engineer- Senior
Cubera is a data company revolutionizing big data analytics and Adtech through data share value principles wherein the users entrust their data to us. We refine the art of understanding, processing, extracting, and evaluating the data that is entrusted to us. We are a gateway for brands to increase their lead efficiency as the world moves towards web3.
What are you going to do?
Design & Develop high performance and scalable solutions that meet the needs of our customers.
Closely work with the Product Management, Architects and cross functional teams.
Build and deploy large-scale systems in Java/Python.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Create data tools for analytics and data scientist team members that assist them in building and optimizing their algorithms.
Follow best practices that can be adopted in the Big Data stack.
Use your engineering experience and technical skills to drive the features and mentor the engineers.
What are we looking for ( Competencies) :
Bachelor’s degree in computer science, computer engineering, or related technical discipline.
Overall 5 to 8 years of programming experience in Java or Python, including object-oriented design.
Data handling frameworks: Should have a working knowledge of one or more data handling frameworks like Hive, Spark, Storm, Flink, Beam, Airflow, NiFi, etc.
Data Infrastructure: Should have experience in building, deploying and maintaining applications on popular cloud infrastructure like AWS, GCP, etc.
Data Store: Must have expertise in one of the general-purpose NoSQL data stores like Elasticsearch, MongoDB, Redis, Redshift, etc.
Strong sense of ownership, focus on quality, responsiveness, efficiency, and innovation.
Ability to work with distributed teams in a collaborative and productive manner.
Benefits:
Competitive Salary Packages and benefits.
Collaborative, lively and an upbeat work environment with young professionals.
Job Category: Development
Job Type: Full Time
Job Location: Bangalore
The thrill of working at a start-up that is starting to scale massively is something else. Simpl (FinTech startup of the year - 2020) was formed in 2015 by Nitya Sharma, an investment banker from Wall Street and Chaitra Chidanand, a tech executive from the Valley, when they teamed up with a very clear mission - to make money simple so that people can live well and do amazing things. Simpl is the payment platform for the mobile-first world, and we’re backed by some of the best names in fintech globally (folks who have invested in Visa, Square and Transferwise), and
have Joe Saunders, ex-Chairman and CEO of Visa, as a board member.
Everyone at Simpl is an internal entrepreneur who is given a lot of bandwidth and resources to create the next breakthrough towards the long term vision of “making money Simpl”. Our first product is a payment platform that lets people buy instantly, anywhere online, and pay later. In
the background, Simpl uses big data for credit underwriting, risk and fraud modelling, all without any paperwork, and enables Banks and Non-Bank Financial Companies to access a whole new consumer market.
In place of traditional forms of identification and authentication, Simpl integrates deeply into merchant apps via SDKs and APIs. This allows for more sophisticated forms of authentication that take full advantage of smartphone data and processing power
Skillset:
Workflow manager/scheduler like Airflow, Luigi, Oozie
Good handle on Python
ETL Experience
Batch processing frameworks like Spark, MR/PIG
File formats: Parquet, JSON, XML, Thrift, Avro, Protobuf (see the sketch after this list)
Rule engine (drools - business rule management system)
Distributed file systems / object stores like HDFS, NFS, AWS S3 and equivalent
Built/configured dashboards
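To illustrate the file-format handling listed above, here is a minimal sketch that lands raw JSON and persists it as Parquet and (optionally) Avro with Spark. The paths are placeholders, and the Avro write assumes the spark-avro package is on the classpath.

```python
# Minimal sketch: convert raw JSON into columnar/binary formats with Spark.
# Paths are placeholders; Avro needs org.apache.spark:spark-avro on the classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-sketch").getOrCreate()

raw = spark.read.json("s3a://landing/payments/2024-06-01/")

raw.write.mode("overwrite").parquet("s3a://lake/payments/parquet/")

# Optional Avro output, only if the spark-avro package is available.
raw.write.mode("overwrite").format("avro").save("s3a://lake/payments/avro/")
```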
Nice to have:
Data platform experience, e.g. building data lakes, working with near-real-time applications/frameworks like Storm, Flink, Spark.
AWS
File encoding types: Thrift, Avro, Protobuf, Parquet, JSON, XML
Hive, HBase
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.
Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high quality and agile decision-making in our supply chain and logistics
workflows.
You will define the way we collect and operationalize data (structured / unstructured), and build production pipelines for our machine learning models and for our (RT, NRT, batch) reporting & dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use
your experience with modern cloud and data frameworks to build products (with storage and serving
systems)
that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making,
insights, anomaly detection and prediction.
What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correction and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Come up with logical and physical database design across platforms (MPP,
MR, Hive/PIG) which are optimal physical designs for different use cases (structured/semi
structured). Envision & implement the optimal data modelling, physical design,
performance optimization technique/approach required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.
Qualifications & Experience relevant for the role
• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across all the SDLC process
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud ( AWS).
• Characteristics of a forward thinker and self-starter that flourishes with new challenges
and adapts quickly to learning new knowledge
• Ability to work with cross-functional teams of consulting professionals across multiple projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.
• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project
objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from
requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower
planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by
developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team
members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team
Must have:
• Minimum 08 years of experience (hands-on as well as leadership) in software / data engineering
across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and
Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge on Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT,
DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills
Good to have:
• Knowledge on PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL
Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel
Work Environment
• Customer Office (Mumbai) / Remote Work
Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
- 5+ years of experience in software development.
- At least 2 years of relevant work experience on large scale Data applications
- Good attitude, strong problem-solving abilities, analytical skills, ability to take ownership as appropriate
- Should be able to do coding, debugging, performance tuning, and deploying the apps to Prod.
- Should have good working experience with the Hadoop ecosystem (HDFS, Hive, Yarn, file formats like Avro/Parquet)
- Kafka
- J2EE Frameworks (Spring/Hibernate/REST)
- Spark Streaming or any other streaming technology.
- Java programming language is mandatory.
- Good to have experience with Java
- Ability to work on the sprint stories to completion along with Unit test case coverage.
- Experience working in Agile Methodology
- Excellent communication and coordination skills
- Knowledge of (and preferably hands-on experience with) UNIX environments and different continuous integration tools.
- Must be able to integrate quickly into the team and work independently towards team goals
- Take the complete responsibility of the sprint stories’ execution
- Be accountable for the delivery of the tasks in the defined timelines with good quality
- Follow the processes for project execution and delivery.
- Follow agile methodology
- Work with the team lead closely and contribute to the smooth delivery of the project.
- Understand/define the architecture and discuss the pros-cons of the same with the team
- Involve in the brainstorming sessions and suggest improvements in the architecture/design.
- Work with other team leads to get the architecture/design reviewed.
- Work with the clients and counterparts (in US) of the project.
- Keep all the stakeholders updated about the project/task status/risks/issues if there are any.
About Slintel (a 6sense company) :
Slintel, a 6sense company, the leader in capturing technographics-powered buying intent, helps companies uncover the 3% of active buyers in their target market. Slintel evaluates over 100 billion data points and analyzes factors such as buyer journeys, technology adoption patterns, and other digital footprints to deliver market & sales intelligence.
Slintel's customers have access to the buying patterns and contact information of more than 17 million companies and 250 million decision makers across the world.
Slintel is a fast growing B2B SaaS company in the sales and marketing tech space. We are funded by top tier VCs, and going after a billion dollar opportunity. At Slintel, we are building a sales development automation platform that can significantly improve outcomes for sales teams, while reducing the number of hours spent on research and outreach.
We are a big data company and perform deep analysis on technology buying patterns and buyer pain points to understand where buyers are in their journey. Over 100 billion data points are analyzed every week to derive recommendations on where companies should focus their marketing and sales efforts. Third-party intent signals are then clubbed with first-party data from CRMs to derive meaningful recommendations on whom to target on any given day.
6sense is headquartered in San Francisco, CA and has 8 office locations across 4 countries.
6sense, an account engagement platform, secured $200 million in a Series E funding round, bringing its total valuation to $5.2 billion 10 months after its $125 million Series D round. The investment was co-led by Blue Owl and MSD Partners, among other new and existing investors.
LinkedIn (Slintel): https://www.linkedin.com/company/slintel/
Industry : Software Development
Company size : 51-200 employees (189 on LinkedIn)
Headquarters : Mountain View, California
Founded : 2016
Specialties : Technographics, lead intelligence, Sales Intelligence, Company Data, and Lead Data.
Website (Slintel): https://www.slintel.com/slintel
LinkedIn (6sense): https://www.linkedin.com/company/6sense/
Industry : Software Development
Company size : 501-1,000 employees (937 on LinkedIn)
Headquarters : San Francisco, California
Founded : 2013
Specialties : Predictive intelligence, Predictive marketing, B2B marketing, and Predictive sales
Website (6sense): https://6sense.com/
Acquisition News :
https://inc42.com/buzz/us-based-based-6sense-acquires-b2b-buyer-intelligence-startup-slintel/
Funding Details & News :
Slintel funding: https://www.crunchbase.com/organization/slintel
6sense funding: https://www.crunchbase.com/organization/6sense
https://www.nasdaq.com/articles/ai-software-firm-6sense-valued-at-%245.2-bln-after-softbank-joins-funding-round
https://www.bloomberg.com/news/articles/2022-01-20/6sense-reaches-5-2-billion-value-with-softbank-joining-round
https://xipometer.com/en/company/6sense
Slintel & 6sense Customers :
https://www.featuredcustomers.com/vendor/slintel/customers
https://www.featuredcustomers.com/vendor/6sense/customers
About the job
Responsibilities
- Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse
- Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs
- Assemble large, complex data sets from third-party vendors to meet business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology
- Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems
Requirements
- 3+ years of experience in a Data Engineer role
- Proficiency in Linux
- Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena
- Must have experience with Python/ Scala
- Must have experience with Big Data technologies like Apache Spark
- Must have experience with Apache Airflow
- Experience with data pipeline and ETL tools like AWS Glue
- Experience working with AWS cloud services (EC2, S3, RDS, Redshift) and other data solutions, e.g. Databricks, Snowflake
Desired Skills and Experience
Python, SQL, Scala, Spark, ETL
• Java 1.8+ or OpenJDK 11+
• Core Spring Framework
• Spring Boot
• Spring REST services
• Any RDBMS
• Gradle or Maven
• JPA / Hibernate
• Git / Bitbucket
• Apache Spark
What you'll do:
Design and development of scalable applications.
Collaborate with tech leads to get maximum understanding of underlying infrastructure.
Contribute to continual improvement by suggesting improvements to the software system.
Ensure high scalability and performance
You will advocate for good, clean, well documented and performing code; follow standards and best practices.
We'd love for you to have:
Education: Bachelor/Master Degree in Computer Science
Experience: 1-3 years of relevant experience in BI/Big-Data with hands-on coding experience
Mandatory Skills
Strong in problem-solving
Good exposure to Big Data technologies, Hive, Hadoop, Impala, Hbase, Kafka, Spark
Strong experience of Data Engineering
Able to comprehend challenges related to Database and Data Warehousing technologies and ability to understand complex design, system architecture
Experience with the software development lifecycle, design, develop, review, debug, document, and deliver (especially in a multi-location organization)
Working knowledge of Java, python
Desired Skills
Experience with reporting tools like Tableau, QlikView
Awareness of CI-CD pipeline
Inclination to work on cloud platform ex:- AWS
Crisp communication skills with team members, Business owners.
Be able to work in a challenging, dynamic environment and meet tight deadlines
Job Description: Data Scientist
At Propellor.ai, we derive insights that allow our clients to make scientific decisions. We believe in demanding more from the fields of Mathematics, Computer Science, and Business Logic. Combine these and we show our clients a 360-degree view of their business. In this role, the Data Scientist will be expected to work on Procurement problems along with a team-based across the globe.
We are a Remote-First Company.
Read more about us here: https://www.propellor.ai/consulting
What will help you be successful in this role
- Articulate
- High Energy
- Passion to learn
- High sense of ownership
- Ability to work in a fast-paced and deadline-driven environment
- Loves technology
- Highly skilled at Data Interpretation
- Problem solver
- Ability to narrate the story to the business stakeholders
- Generate insights and the ability to turn them into actions and decisions
Skills to work in a challenging, complex project environment
- Need you to be naturally curious and have a passion for understanding consumer behavior
- A high level of motivation, passion, and high sense of ownership
- Excellent communication skills needed to manage an incredibly diverse slate of work, clients, and team personalities
- Flexibility to work on multiple projects and deadline-driven fast-paced environment
- Ability to work in ambiguity and manage the chaos
Key Responsibilities
- Analyze data to unlock insights: Ability to identify relevant insights and actions from data. Use regression, cluster analysis, time series, etc. to explore relationships and trends in response to stakeholder questions and business challenges (a brief sketch follows this list).
- Bring in experience for AI and ML: Bring in Industry experience and apply the same to build efficient and optimal Machine Learning solutions.
- Exploratory Data Analysis (EDA) and Generate Insights: Analyse internal and external datasets using analytical techniques, tools, and visualization methods. Ensure pre-processing/cleansing of data and evaluate data points across the enterprise landscape and/or external data points that can be leveraged in machine learning models to generate insights.
- DS and ML Model Identification and Training: Identify, test, and train machine learning models that need to be leveraged for business use cases. Evaluate models based on interpretability, performance, and accuracy as required. Experiment and identify features from datasets that will help influence model outputs. Determine which models will need to be deployed and which data points need to be fed into them, and aid in the deployment and maintenance of models.
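As a brief, hedged illustration of the regression and cluster analysis mentioned in the first responsibility, the sketch below fits a linear model and a k-means segmentation on toy data. The dataset, features and hyper-parameters are placeholders rather than anything client-specific.

```python
# Minimal sketch: quick regression fit and k-means segmentation on synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
spend = rng.uniform(10, 100, size=(200, 1))           # e.g. marketing spend (placeholder)
revenue = 3.2 * spend[:, 0] + rng.normal(0, 5, 200)   # noisy linear response

reg = LinearRegression().fit(spend, revenue)
print("estimated uplift per unit spend:", reg.coef_[0])

# Simple segmentation on two behavioural features (also synthetic).
features = rng.normal(size=(500, 2))
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
print("segment sizes:", np.bincount(segments))
```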
Technical Skills
An enthusiastic individual with the following skills. Please do not hesitate to apply if you do not match all of them. We are open to promising candidates who are passionate about their work, fast learners and are team players.
- Strong experience with machine learning and AI including regression, forecasting, time series, cluster analysis, classification, Image recognition, NLP, Text Analytics and Computer Vision.
- Strong experience with advanced analytics tools for Object-oriented/object function scripting using languages such as Python, or similar.
- Strong experience with popular database programming languages including SQL.
- Strong experience in Spark/Pyspark
- Experience in working in Databricks
What company benefits do you get when you join us?
- Permanent Work from Home Opportunity
- Opportunity to work with Business Decision Makers and an internationally based team
- The work environment that offers limitless learning
- A culture void of any bureaucracy, hierarchy
- A culture of being open, direct, and with mutual respect
- A fun, high-caliber team that trusts you and provides the support and mentorship to help you grow
- The opportunity to work on high-impact business problems that are already defining the future of Marketing and improving real lives
To know more about how we work: https://bit.ly/3Oy6WlE
Whom will you work with?
You will closely work with other Senior Data Scientists and Data Engineers.
Immediate to 15-day Joiners will be preferred.
We are looking for an exceptionally talented Lead Data Engineer who has exposure to implementing AWS services to build data pipelines, API integrations and data warehouse design. A candidate with both hands-on and leadership capabilities will be ideal for this position.
Qualification: At least a bachelor's degree in Science, Engineering, or Applied Mathematics; a master's degree is preferred.
Job Responsibilities:
• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team
• Have minimum 3 years of AWS Cloud experience.
• Well versed in languages such as Python, PySpark, SQL, Node.js, etc.
• Has extensive experience in the Spark ecosystem and has worked on both real-time and batch processing
• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step Functions, Airflow, RDS, Aurora, etc. (see the sketch after this list)
• Experience with modern Database systems such as Redshift, Presto, Hive etc.
• Worked on building data lakes in the past on S3 or Apache Hudi
• Solid understanding of Data Warehousing Concepts
• Good to have experience on tools such as Kafka or Kinesis
• Good to have AWS Developer Associate or Solutions Architect Associate Certification
• Have experience in managing a team
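To make the AWS Glue and orchestration experience above concrete, here is a hedged boto3 sketch that starts a Glue job run and polls its state. The job name, arguments and region are placeholders, and in practice this step would usually sit behind Airflow or Step Functions rather than a polling loop.

```python
# Minimal sketch: trigger an AWS Glue job and wait for it to finish using boto3.
# Job name, arguments and region are placeholders.
import time
import boto3

glue = boto3.client("glue", region_name="ap-south-1")

run = glue.start_job_run(
    JobName="curate-orders",                     # hypothetical Glue job
    Arguments={"--run_date": "2024-06-01"},
)

while True:
    status = glue.get_job_run(JobName="curate-orders", RunId=run["JobRunId"])
    state = status["JobRun"]["JobRunState"]
    if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
        print("Glue job finished with state:", state)
        break
    time.sleep(30)                               # poll every 30 seconds
```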