Apache Spark Jobs in Pune

6+ Apache Spark Jobs in Pune | Apache Spark Job openings in Pune

Apply to 6+ Apache Spark Jobs in Pune on CutShort.io. Explore the latest Apache Spark Job opportunities across top companies like Google, Amazon & Adobe.

Solution/Technical Architect (Databricks)

at Quintica

Posted by Nitin D

Remote, Bengaluru (Bangalore), Pune, Chennai, Nagpur

5 - 15 yrs

₹20L - ₹30L / yr

databricks

PySpark

Apache Spark

CI/CD

Data engineering

Technical Architect (Databricks)

10+ Years Data Engineering Experience with expertise in Databricks
3+ years of consulting experience
Completed Data Engineering Professional certification & required classes
Minimum 2-3 projects delivered with hands-on experience in Databricks
Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
Experience in Spark and/or Hadoop, Flink, Presto, other popular big data engines
Familiarity with Databricks multi-hop pipeline architecture

Sr. Data Engineer (Databricks)

5+ Years Data Engineering Experience with expertise in Databricks
Completed Data Engineering Associate certification & required classes
Minimum 1 project delivered with hands-on experience in development on Databricks
Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
SQL delivery experience, and familiarity with Bigquery, Synapse or Redshift
Proficient in Python, knowledge of additional databricks programming languages (Scala)

Technical Architect (Databricks)

10+ Years Data Engineering Experience with expertise in Databricks
3+ years of consulting experience
Completed Data Engineering Professional certification & required classes
Minimum 2-3 projects delivered with hands-on experience in Databricks
Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
Experience in Spark and/or Hadoop, Flink, Presto, other popular big data engines
Familiarity with Databricks multi-hop pipeline architecture

Sr. Data Engineer (Databricks)

5+ Years Data Engineering Experience with expertise in Databricks
Completed Data Engineering Associate certification & required classes
Minimum 1 project delivered with hands-on experience in development on Databricks
Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
SQL delivery experience, and familiarity with Bigquery, Synapse or Redshift
Proficient in Python, knowledge of additional databricks programming languages (Scala)

Big Data Engineer-Release

at Rigel Networks Pvt Ltd

Posted by Minakshi Soni

Pune

5 - 9 yrs

₹8L - ₹15L / yr

Big data Engineer

Software deployment

Release Management

Software release life cycle

Release engineering

+6 more

Dear Candidate,

We are urgently looking for a Release- Big data Engineer For Pune Location.

Experience : 5-8 yrs

Location : Pune

Skills: Big data Engineer , Release Engineer ,DevOps, Aws/Azure/GCP Cloud exp. ,

JD:

Oversee the end-to-end release lifecycle, from planning to post-production monitoring. Coordinate with cross-functional teams (DBA, BizOps, DevOps, DNS).
Partner with development teams to resolve technical challenges in deployment and automation test runs
Work with shared services DBA teams for schema-based multi-tenancy designs and smooth migrations.
Drive automation for batch deployments and DR exercises. YAML based micro service deployment using shell/Python/Go
Provide oversight for Big Data toolsets for deployment (e.g., Spark, Hive, HBase) in private cloud and public cloud CDP environments
Ensure high-quality releases with a focus on stability and long-term performance.
Able to run the automation batch scripts and debug the deployment and functional aspects/ work with dev leads to resolve the release cycle issues.

Regards,

Minakshi Soni

Executive- Talent Acquisition

Rigel Networks

Dear Candidate,

We are urgently looking for a Release- Big data Engineer For Pune Location.

Experience : 5-8 yrs

Location : Pune

Skills: Big data Engineer , Release Engineer ,DevOps, Aws/Azure/GCP Cloud exp. ,

JD:

Oversee the end-to-end release lifecycle, from planning to post-production monitoring. Coordinate with cross-functional teams (DBA, BizOps, DevOps, DNS).
Partner with development teams to resolve technical challenges in deployment and automation test runs
Work with shared services DBA teams for schema-based multi-tenancy designs and smooth migrations.
Drive automation for batch deployments and DR exercises. YAML based micro service deployment using shell/Python/Go
Provide oversight for Big Data toolsets for deployment (e.g., Spark, Hive, HBase) in private cloud and public cloud CDP environments
Ensure high-quality releases with a focus on stability and long-term performance.
Able to run the automation batch scripts and debug the deployment and functional aspects/ work with dev leads to resolve the release cycle issues.

Regards,

Minakshi Soni

Executive- Talent Acquisition

Rigel Networks

Senior Data Engineer (L2)

at Publicis Sapient

10 recruiters

Posted by Mohit Singh

Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida

5 - 11 yrs

₹20L - ₹36L / yr

PySpark

Data engineering

Big Data

Hadoop

Spark

+7 more

Publicis Sapient Overview:

The Senior Associate People Senior Associate L1 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution

Job Summary:

As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution

The role requires a hands-on technologist who has strong programming background like Java / Scala / Python, should have experience in Data Ingestion, Integration and data Wrangling, Computation, Analytics pipelines and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge on at least one of AWS, GCP, Azure cloud platforms.

Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 5+ years of IT experience with 3+ years in Data related technologies

2.Minimum 2.5 years of experience in Big Data technologies and working exposure in at least one cloud platform on related data services (AWS / Azure / GCP)

3.Hands-on experience with the Hadoop stack – HDFS, sqoop, kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, hive, oozie, airflow and other components required in building end to end data pipeline.

4.Strong experience in at least of the programming language Java, Scala, Python. Java preferable

5.Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc

6.Well-versed and working knowledge with data platform related services on at least 1 cloud platform, IAM and data security

Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge on distributed messaging frameworks like ActiveMQ / RabbiMQ / Solace, search & indexing and Micro services architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Cloud data specialty and other related Big data technology certifications

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Publicis Sapient Overview:

Job Summary:

Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 5+ years of IT experience with 3+ years in Data related technologies

2.Minimum 2.5 years of experience in Big Data technologies and working exposure in at least one cloud platform on related data services (AWS / Azure / GCP)

4.Strong experience in at least of the programming language Java, Scala, Python. Java preferable

5.Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc

6.Well-versed and working knowledge with data platform related services on at least 1 cloud platform, IAM and data security

Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge on distributed messaging frameworks like ActiveMQ / RabbiMQ / Solace, search & indexing and Micro services architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Cloud data specialty and other related Big data technology certifications

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Lead Data Engineer

at xpressbees

Posted by Alfiya Khan

Pune, Bengaluru (Bangalore)

6 - 8 yrs

₹15L - ₹25L / yr

Big Data

Data Warehouse (DWH)

Data modeling

Apache Spark

Data integration

+10 more

Company Profile
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.

Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high quality and agile decision-making in our supply chain and logistics
workflows.
You will define the way we collect and operationalize data (structured / unstructured), and
build production pipelines for our machine learning models, and (RT, NRT, Batch) reporting &
dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use
your experience with modern cloud and data frameworks to build products (with storage and serving
systems)
that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making,
insights, anomaly detection and prediction.

What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correction and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Come up with logical and physical database design across platforms (MPP,
MR, Hive/PIG) which are optimal physical designs for different use cases (structured/semi
structured). Envision & implement the optimal data modelling, physical design,
performance optimization technique/approach required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.

Qualifications & Experience relevant for the role

• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across all the SDLC process
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud ( AWS).

• Characteristics of a forward thinker and self-starter that flourishes with new challenges
and adapts quickly to learning new knowledge
• Ability to work with a cross functional teams of consulting professionals across multiple
projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.

Technical Project Manager

at Celebal Technologies

2 recruiters

Posted by Payal Hasnani

Jaipur, Noida, Gurugram, Delhi, Ghaziabad, Faridabad, Pune, Mumbai

5 - 15 yrs

₹7L - ₹25L / yr

PySpark

Data engineering

Big Data

Hadoop

Spark

+4 more

Job Responsibilities:

• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project
objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from
requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower
planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by
developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team
members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team

Must have:
• Minimum 08 years of experience (hands-on as well as leadership) in software / data engineering
across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and
Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge on Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT,
DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills

Good to have:
• Knowledge on PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL
Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel

Work Environment
• Customer Office (Mumbai) / Remote Work

Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science

Big Data Spark Lead

at DataMetica

1 video

7 recruiters

Posted by Sumangali Desai

Pune, Hyderabad

7 - 12 yrs

₹7L - ₹20L / yr

Apache Spark

Big Data

Spark

Scala

Hadoop

+3 more

We at Datametica Solutions Private Limited are looking for Big Data Spark Lead who have a passion for cloud with knowledge of different on-premise and cloud Data implementation in the field of Big Data and Analytics including and not limiting to Teradata, Netezza, Exadata, Oracle, Cloudera, Hortonworks and alike.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.

Job Description
Experience : 7+ years
Location : Pune / Hyderabad
Skills :

Drive and participate in requirements gathering workshops, estimation discussions, design meetings and status review meetings
Participate and contribute in Solution Design and Solution Architecture for implementing Big Data Projects on-premise and on cloud
Technical Hands on experience in design, coding, development and managing Large Hadoop implementation
Proficient in SQL, Hive, PIG, Spark SQL, Shell Scripting, Kafka, Flume, Scoop with large Big Data and Data Warehousing projects with either Java, Python or Scala based Hadoop programming background
Proficient with various development methodologies like waterfall, agile/scrum and iterative
Good Interpersonal skills and excellent communication skills for US and UK based clients

About Us!
A global Leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.

We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.

Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.

We have our own products!
Eagle – Data warehouse Assessment & Migration Planning Product
Raven – Automated Workload Conversion Product
Pelican - Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.

Why join us!
Datametica is a place to innovate, bring new ideas to live and learn new things. We believe in building a culture of innovation, growth and belonging. Our people and their dedication over these years are the key factors in achieving our success.

Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy

Check out more about us on our website below!
www.datametica.com

Drive and participate in requirements gathering workshops, estimation discussions, design meetings and status review meetings
Participate and contribute in Solution Design and Solution Architecture for implementing Big Data Projects on-premise and on cloud
Technical Hands on experience in design, coding, development and managing Large Hadoop implementation
Proficient in SQL, Hive, PIG, Spark SQL, Shell Scripting, Kafka, Flume, Scoop with large Big Data and Data Warehousing projects with either Java, Python or Scala based Hadoop programming background
Proficient with various development methodologies like waterfall, agile/scrum and iterative
Good Interpersonal skills and excellent communication skills for US and UK based clients

Get to hear about interesting companies hiring right now

Follow Cutshort

Why apply via Cutshort?

Connect with actual hiring teams and get their fast response. No spam.

Find more jobs

Get to hear about interesting companies hiring right now

Follow Cutshort