
11+ MapReduce Jobs in Pune | MapReduce Job openings in Pune

Apply to 11+ MapReduce Jobs in Pune on CutShort.io. Explore the latest MapReduce Job opportunities across top companies like Google, Amazon & Adobe.

DataMetica

at DataMetica

1 video
7 recruiters
Sumangali Desai
Posted by Sumangali Desai
Pune, Hyderabad
7 - 12 yrs
₹7L - ₹20L / yr
Apache Spark
Big Data
Spark
Scala
Hadoop
+3 more
We at Datametica Solutions Private Limited are looking for a Big Data Spark Lead who has a passion for cloud, with knowledge of different on-premise and cloud data implementations in the field of Big Data and Analytics, including but not limited to Teradata, Netezza, Exadata, Oracle, Cloudera, Hortonworks and the like.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.

Job Description
Experience : 7+ years
Location : Pune / Hyderabad
Skills :
  • Drive and participate in requirements gathering workshops, estimation discussions, design meetings and status review meetings
  • Participate and contribute in Solution Design and Solution Architecture for implementing Big Data Projects on-premise and on cloud
  • Hands-on technical experience in designing, coding, developing and managing large Hadoop implementations
  • Proficient in SQL, Hive, Pig, Spark SQL, Shell Scripting, Kafka, Flume and Sqoop on large Big Data and Data Warehousing projects, with a Java, Python or Scala based Hadoop programming background
  • Proficient with various development methodologies like waterfall, agile/scrum and iterative
  • Good Interpersonal skills and excellent communication skills for US and UK based clients

About Us!
A global Leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.

We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.

Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.


We have our own products!
Eagle – Data Warehouse Assessment & Migration Planning Product
Raven – Automated Workload Conversion Product
Pelican – Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.

Why join us!
Datametica is a place to innovate, bring new ideas to life and learn new things. We believe in building a culture of innovation, growth and belonging. Our people and their dedication over the years are the key factors in our success.

Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy

Check out more about us on our website below!
www.datametica.com
Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As Senior Associate L1 in Data Engineering, you will do technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration and wrangling, computation and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP or Azure cloud platforms is preferred.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 3.5+ years of IT experience with 1.5+ years in data-related technologies

2. Minimum 1.5 years of experience in Big Data technologies

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

4. Strong experience in at least one of the programming languages Java, Scala, Python. Java preferred

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc.


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Working knowledge of data platform related services on at least one cloud platform, IAM and data security

7. Cloud data specialty and other related Big Data technology certifications



Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Pune
9 - 14 yrs
₹20L - ₹40L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+3 more
Job Id: SG0601

Hi,

Enterprise Minds is looking for a Data Architect for the Pune location.

Req Skills:
Python, PySpark, Hadoop, Java, Scala
Persistent Systems

at Persistent Systems

1 video
1 recruiter
Agency job
via Milestone Hr Consultancy by Haina khan
Pune, Bengaluru (Bangalore), Hyderabad, Nagpur
4 - 9 yrs
₹4L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+3 more
Greetings,

We have urgent requirements for Big Data Developer profiles at a reputed MNC.

Location: Pune/Bangalore/Hyderabad/Nagpur
Experience: 4-9 yrs

Skills: PySpark, AWS
or Spark, Scala, AWS
or Python, AWS
A2Tech Consultants

at A2Tech Consultants

3 recruiters
Dhaval B
Posted by Dhaval B
Pune
4 - 12 yrs
₹6L - ₹15L / yr
Data engineering
Data Engineer
ETL
Spark
Apache Kafka
+5 more
We are looking for a smart candidate with:
  • Strong Python Coding skills and OOP skills
  • Should have worked on Big Data product Architecture
  • Should have worked with any one of the SQL-based databases like MySQL, PostgreSQL and any one of the NoSQL-based databases such as Cassandra, Elasticsearch etc.
  • Hands on experience on frameworks like Spark RDD, DataFrame, Dataset
  • Experience on development of ETL for data product
  • Candidate should have working knowledge of performance optimization, optimal resource utilization, parallelism and tuning of Spark jobs
  • Working knowledge on file formats: CSV, JSON, XML, PARQUET, ORC, AVRO
  • Good to have working knowledge with any one of the Analytical Databases like Druid, MongoDB, Apache Hive etc.
  • Experience to handle real-time data feeds (good to have working knowledge on Apache Kafka or similar tool)
Key Skills:
  • Python and Scala (Optional), Spark / PySpark, Parallel programming
Fast paced Startup

Fast paced Startup

Agency job
via Kavayah People Consulting by Kavita Singh
Pune
3 - 6 yrs
₹15L - ₹22L / yr
Big Data
Data engineering
Hadoop
Spark
Apache Hive
+6 more

Years of Exp: 3-6+ Years
Skills: Scala, Python, Hive, Airflow, Spark

Languages: Java, Python, Shell Scripting

GCP: BigTable, DataProc,  BigQuery, GCS, Pubsub

OR
AWS: Athena, Glue, EMR, S3, Redshift

MongoDB, MySQL, Kafka

Platforms: Cloudera / Hortonworks
AdTech domain experience is a plus.
Job Type - Full Time 

Health Care MNC

Health Care MNC

Agency job
via Kavayah People Consulting by Kavita Singh
Pune
12 - 24 yrs
₹35L - ₹60L / yr
Data Science
Python
C++
Java
Amazon Web Services (AWS)
+1 more
The Director for Data Science will support building of AI products in Agile fashion that
empower healthcare payers, providers and members to quickly process medical data to
make informed decisions and reduce health care costs. You will be focusing on research,
development, strategy, operations, people management, and being a thought leader for
team members based out of India. You should have professional healthcare experience
using both structured and unstructured data to build applications. These applications
include but are not limited to machine learning, artificial intelligence, optical character
recognition, natural language processing, and integrating processes into the overall AI
pipeline to mine healthcare and medical information with high recall and other relevant
metrics. The results will be used dually for real-time operational processes with both
automated and human-based decision making as well as contribute to reducing
healthcare administrative costs. We work with all major cloud and big data vendors
offerings including (Azure, AWS, Google, IBM, etc.) to achieve our goals in healthcare and
support 
The Director, Data Science will have the opportunity to build a team, shape team culture
and operating norms as a result of the fast-paced nature of a new, high-growth
organization.

• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps and product delivery
• Experience building stakeholder trust and confidence in deployed models especially via application of the algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• Provide mentoring to data scientists and machine learning engineers as well as career development
• Meet project related team members for individual specific needs on a regular basis related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership

The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience including but not limited to usage of claims data
• Delivered multiple data science and machine learning projects over 8+ years with values exceeding $10 Million or more and has worked on platform members exceeding 10 million lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real time streaming applications, DevOps, and product delivery
• Knows how to solve and launch real artificial intelligence and data science related problems and products along with managing and coordinating the
business process change, IT / cloud operations, meeting production level code standards
• Ownerships of key workflows part of data science life cycle like data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models especially via application of algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research, and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• 3+ years of experience directly managing five (5) or more senior-level data scientists and machine learning engineers with advanced degrees, including directly making staffing decisions

• Very strong understanding of mathematical concepts including but not limited to linear algebra, advanced calculus, partial differential equations, and
statistics including Bayesian approaches at master’s degree level and above
• 6+ years of programming experience in C++ or Java or Scala and data science programming languages like Python and R including strong understanding of
concepts like data structures, algorithms, compression techniques, high performance computing, distributed computing, and various computer architecture
• Very strong understanding and experience with traditional data science approaches like sampling techniques, feature engineering, classification, and
regressions, SVM, trees, model evaluations with several projects over 3+ years
• Very strong understanding and experience in Natural Language Processing,
reasoning, and understanding, information retrieval, text mining, search, with
3+ years of hands on experience
• Experience with developing and deploying several products in production with
experience in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the
following cloud vendors like AWS, Azure, and Google
• Three plus (3+) years of hands-on experience with the MapR / Cloudera / Databricks
Big Data platforms with Spark, Hive, Kafka etc.
• Three plus (3+) years of experience with high-performance computing like
Dask, CUDA distributed GPU, TPU etc.
• Presented at major conferences and/or published materials
Maveric Systems

at Maveric Systems

3 recruiters
Rashmi Poovaiah
Posted by Rashmi Poovaiah
Bengaluru (Bangalore), Chennai, Pune
4 - 10 yrs
₹8L - ₹15L / yr
Big Data
Hadoop
Spark
Apache Kafka
HiveQL
+2 more

Role Summary/Purpose:

We are looking for Developers/Senior Developers to be part of building an advanced analytical platform that leverages Big Data technologies and transforms legacy systems. The role offers an exciting, fast-paced, constantly changing and challenging work environment, and will play an important role in resolving and influencing high-level decisions.

 

Requirements:

  • The candidate must be a self-starter who can work under general guidelines in a fast-paced environment.
  • Overall minimum of 4 to 8 years of software development experience and 2 years of Data Warehousing domain knowledge
  • Must have 3 years of hands-on working knowledge of Big Data technologies such as Hadoop, Hive, HBase, Spark, Kafka, Spark Streaming, Scala etc.
  • Excellent knowledge in SQL & Linux Shell scripting
  • Bachelors/Master’s/Engineering Degree from a well-reputed university.
  • Strong communication, Interpersonal, Learning and organizing skills matched with the ability to manage stress, Time, and People effectively
  • Proven experience in co-ordination of many dependencies and multiple demanding stakeholders in a complex, large-scale deployment environment
  • Ability to manage a diverse and challenging stakeholder community
  • Diverse knowledge and experience of working on Agile Deliveries and Scrum teams.

 

Responsibilities

  • Should work as a senior developer/individual contributor depending on the situation
  • Should be part of SCRUM discussions and gather requirements
  • Adhere to SCRUM timeline and deliver accordingly
  • Participate in a team environment for the design, development and implementation
  • Should take L3 activities on need basis
  • Prepare Unit/SIT/UAT test cases and log the results
  • Co-ordinate SIT and UAT testing. Take feedback and provide necessary remediation/recommendations in time.
  • Quality delivery and automation should be a top priority
  • Co-ordinate change and deployment in time
  • Should create healthy harmony within the team
  • Own interaction points with members of the core team (e.g. BA team, testing and business teams) and any other relevant stakeholders
A Product development Organisation

A Product development Organisation

Agency job
via Millions Advisory by Vasuki N
Pune
5 - 8 yrs
₹10L - ₹17L / yr
Python
Big Data
Amazon Web Services (AWS)
Windows Azure
Google Cloud Platform (GCP)
+3 more
  • Must have 5-8 years of experience in handling data
  • Must have the ability to interpret large amounts of data and to multi-task
  • Must have strong knowledge of and experience with programming (Python), Linux/Bash scripting, databases(SQL, etc)
  • Must have strong analytical and critical thinking to resolve business problems using data and tech
  • Must have domain familiarity with and interest in cloud technologies (GCP / Microsoft Azure / Amazon AWS), open-source technologies and enterprise technologies
  • Must have the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy.
  • Must have good communication skills
  • Working knowledge/exposure to ElasticSearch, PostgreSQL, Athena, PrestoDB, Jupyter Notebook
Intentbase

at Intentbase

1 video
1 recruiter
Nischal Vohra
Posted by Nischal Vohra
Pune
2 - 5 yrs
₹5L - ₹10L / yr
Pandas
Numpy
Bash
Structured Query Language
Python
+2 more
We are an early-stage startup working in the space of analytics, big data, machine learning, data visualization on multiple platforms and SaaS. We have offices in Palo Alto and WTC, Kharadi, Pune, and have some marquee names as our customers.

We are looking for a really good Python programmer who MUST have scientific programming experience (Python, etc.). Hands-on experience with numpy and the Python scientific stack is a must. Demonstrated ability to track and work with 100s-1000s of files and GB-TB of data. Exposure to ML and data mining algorithms. Need to be comfortable working in a Unix environment and with SQL.

You will be required to do the following:
  • Use command line tools to perform data conversion and analysis
  • Support other team members in retrieving and archiving experimental results
  • Quickly write scripts to automate routine analysis tasks
  • Create insightful, simple graphics to represent complex trends
  • Explore/design/invent new tools and design patterns to solve complex big data problems

Experience working on a long-term, lab-based project (academic experience acceptable)
Saama Technologies

at Saama Technologies

6 recruiters
Sandeep Chaudhary
Posted by Sandeep Chaudhary
Pune
2 - 5 yrs
₹1L - ₹18L / yr
Hadoop
Spark
Apache Hive
Apache Flume
Java
+5 more
Description
  • Deep experience and understanding of Apache Hadoop and surrounding technologies required; experience with Spark, Impala, Hive, Flume, Parquet and MapReduce.
  • Strong understanding of development languages including Java, Python, Scala and Shell Scripting
  • Expertise in Apache Spark 2.x framework principles and usage.
  • Should be proficient in developing Spark batch and streaming jobs in Python, Scala or Java.
  • Should have proven experience in performance tuning of Spark applications from both application code and configuration perspectives.
  • Should be proficient in Kafka and its integration with Spark.
  • Should be proficient in Spark SQL and data warehousing techniques using Hive.
  • Should be very proficient in Unix shell scripting and in operating on Linux.
  • Should have knowledge of any cloud-based infrastructure.
  • Good experience in tuning Spark applications and performance improvements.
  • Strong understanding of data profiling concepts and ability to operationalize analyses into design and development activities
  • Experience with best practices of software development: version control systems, automated builds, etc.
  • Experienced in and able to lead the following phases of the Software Development Life Cycle on any project (feasibility planning, analysis, development, integration, test and implementation)
  • Capable of working within the team or as an individual
  • Experience in creating technical documentation
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.