Data Engineer (PySpark)

at MNC

Agency job
Bengaluru (Bangalore)
3 - 7 yrs
₹8L - ₹16L / yr
Full time
Skills
PySpark
Python
Spark
Roles and Responsibilities:

• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance tuning with respect to executor sizing and other environmental parameters, code optimization, partition tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.

Must-Have Skills:

• Good experience in PySpark, including DataFrame core functions and Spark SQL.
• Good experience with SQL databases; able to write queries of fair complexity.
• Excellent experience in Big Data programming for data transformation and aggregation.
• Good at ETL architecture, business rules processing, and data extraction from the data lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills

Similar jobs

Head of AI/ML

at Tvarit

Founded 2019  •  Products & Services  •  20-100 employees  •  Raised funding
Python
Deep Learning
Artificial Intelligence (AI)
Machine Learning (ML)
Bayesian
Mumbai
2 - 10 yrs
₹20L - ₹30L / yr

We are looking for a Head of AI/ML to lead our data science research team and build innovative ML-powered applications. As a Head of AI/ML at Tvarit, you will have the opportunity to grow as a researcher and to steer the company’s overall artificial intelligence efforts, creating a more formal path for ML experimentation, development, and product realization. By collaborating with other leaders in engineering, product and design, you will align product needs with new areas of research and innovation that support ML efforts across the company, both in the near-term and long-term.

Key work responsibilities

  • Provide leadership to a team of 5+ data scientists and engineers within our center of excellence for AI & ML. 

  • Identify and translate business needs into clearly scoped data science projects and take a hands-on approach to steer solution design and implementation.

  • Help build world-class products to secure the company’s unique position in the market.

  • Create a vision and a strategic roadmap of AI & ML within the company including the strategy to secure patents for the company.

  • Plan and prioritize research activities across the data team with a results-focused approach

  • Build a team of ML and data science researchers through effective hiring and mentoring with a focus on cultivating talent and instilling data-driven values in a fast-paced environment.

  • Collaborate with engineering, product, design, and business leaders to ensure that work is aligned with the company’s mission and goals

  • Carry out AI research with world-renowned professors from Tvarit's partner universities such as IITs, TU Darmstadt, TU Munich, and Stanford

Your background

  • Ph.D. or MSc in a quantitative field such as Statistics, Mathematics, Operations Research, Econometrics, Computer Science, or Engineering

  • Successful track record building Machine Learning / Statistical Models. 5+ years experience preferred

  • Extensive experience managing and developing a team of data scientists and machine learning engineers. 2+ years preferred

  • Proven track record writing production-grade ML applications in Python. Proficient using SQL.

  • Deep understanding of advanced ML concepts (e.g. Deep Learning, Bayesian statistics).

  • Experience with agile methodologies such as Scrum & Kanban and proficiency in using project management tools. Jira preferred.

Preferred Skills

  • Experience in fast-growing data-driven startups.

  • Experience with ideation and innovation of technology at scale, successfully producing intelligent software products and features within strict timelines.

  • Experience with production-ready systems for Machine Learning (e.g. Feature Storage).

  • A desire to keep up with the field by attending or publishing at relevant conferences (ICML, NeurIPS, ACL, EMNLP, KDD, AAAI, ICLR, CVPR etc.)

Job posted by
Soumya Sahadevan

Data Analyst

at GroupM

Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
Data Analytics
Google Analytics
Adobe Analytics
Pivot table
SQL
Python
Big Data
Bengaluru (Bangalore), Gurugram, Mumbai
4 - 6 yrs
₹12L - ₹16L / yr
Data & Technology - This function is an analytics, technology, and consulting group supporting the buying & campaign delivery teams. We combine Adtech and Martech platform strategy with data science & data engineering expertise, helping our clients make advertising work better for people. We are currently looking for an Analytics Manager to join the GroupM Services team.

Description

Overview


This role is a fantastic opportunity for personal and professional growth and to contribute to a high-performance team, focused on continuous learning, rigorous best practice and achieving high levels of customer service. The role requires a top-class candidate with excellent numeracy and proven analytics problem-solving skills to join our high energy, entrepreneurial team.

 

Reporting of the role

This role reports to the Analytics Director.

3 best things about the job:

  1. Be a member of a high performing team focused on technology, data, partners and platforms, a key strategic growth area for GroupM and WPP.
  2. Work in an environment that promotes freedom, flexibility, empowerment, and diverse working styles to solve real business problems.
  3. The opportunity to learn & collaborate with a wide range of stakeholders across all GroupM agencies & business units.

Measures of success –

In three months:

  • Gain an in-depth understanding of the media landscape, be trained on the various media buying platforms, specifically the data & analytics databases and tools, and understand how the GMS business operates
  • Lead and roll out various analytics and attribution frameworks and best practices for campaign measurement
  • Develop proficiency in clean room analytics such as ADH, Infosum, Liveramp etc.
  • Develop relationships and earn trust with your own team

 

In six months:

  • Working with the campaign delivery teams to deliver high value, in-depth analytics, and attribution including client site analytics, channel analytics, automated where possible. Part of this will be to ensure that prior to the campaign all tracking and assets are in place as required by the briefing, then monitoring throughout the campaign that data is being collected.
  • Help develop standard and where possible automated advanced clean room analytics solutions that can be scaled across all agencies.
  • Perform active stakeholder management to continue to evolve these analytics solutions as per the priority requirements. 

In twelve months:            

  • Work with the APAC GMS teams to ensure the local and regional data analytics solutions are aligned and local needs are strongly represented at the regional / global level
  • Develop proficiency in measurement frameworks in a post-cookie era, leading experiments for measuring campaign delivery, brand health and marketing effectiveness / ROI.
  • Be an expert in data and lead bespoke insight analytics work as the demand and function continues to grow – i.e. answering complex business problems posed by our clients, providing thought leadership in defining measurement strategies, etc.

 

Responsibilities of the role:

  • Provide digital campaign analytics – including campaign delivery, measurement, and attribution
  • Client site analytics – e.g., Google Analytics, Adobe Analytics
  • Client channel analytics – e.g., social listening, ecommerce – shopalyst, pre-post purchase analytics, pricing benchmarks
  • Create omni(digital)-channel measurement strategies for performance reporting
  • Deploy data-driven attribution models to support campaign optimisation
  • Develop and roll out frameworks around various attribution models
  • Create a leading analytics solution suite leveraging media / neutral data clean rooms
  • Foster a community of data analytics practitioners for knowledge sharing and growing expertise

What you will need:

  • Minimum 4–5 years’ experience working in an analytical role
  • Prior experience within a digital media role is highly desirable, particularly search, social and programmatic
  • A degree in a quantitative field (e.g. economics, computer science, mathematics, statistics, engineering, physics, etc.)
  • Proficiency in Excel (including but not limited to VLOOKUPs, arrays, pivot tables, conditional and nested formulas, VBA/macros)
  • Experience with SQL / BigQuery / GMP tech stack / clean rooms such as ADH
  • Hands-on experience with BI / visual analytics tools like PowerBI or Tableau
  • Knowledge or hands-on experience on analytics platforms like Google Analytics, Data Studio, Adobe Analytics, MMP such as Firebase, Appsflyer, Kochava etc.
  • Evidence of technical comfort and good understanding of internet functionality desirable
  • Analytical pedigree - evidence of having approached problems from a mathematical perspective and working through to a solution in a logical way
  • Proactive and results-oriented
  • A positive, can-do attitude with a thirst to continually learn new things
  • An ability to work independently and collaboratively with a wide range of teams
  • Excellent communication skills, both written and oral
  • An interest in media, advertising and marketing

 

More about GroupM

GroupM - GroupM leads and shapes media markets by delivering performance enhancing media products and services, powered by data and technology. Our global network agencies and businesses enable our people to work collaboratively across borders with the best in class, providing them the opportunity to accelerate their progress and development. We are not limited by teams or geographies; our scale and diverse range of clients lets us be more adventurous with our business and talent. We give our talent the space, support and tools to innovate and grow.

Discover more about GroupM at www.groupm.com
Follow @GroupMAPAC on Twitter
Follow GroupM on LinkedIn - https://www.linkedin.com/company/groupm


2020 brought opportunities for brands to innovate because of which we saw an evolving media stack. The growth of digital is set to soar high because of changing consumer habits. With approximately 500 million smartphone users, low-priced data plans, 45 to 50 million e-commerce shoppers, approximately 60 OTT offerings and a young population, India is a mobile-first internet market. It is also one of the top 10 ad spend markets in the world and is set to climb the ranks. Global big tech corporations have made considerable investments in top e-commerce/retail ventures and Indian start-ups, blurring the lines between social media, e-commerce and mobile payments, resulting in disruption on an unimaginable scale.

At GroupM India, there’s never a dull moment between juggling client requests, managing vendor partners and having fun with your team. We believe in tackling challenges head-on and getting things done.

GroupM is an equal opportunity employer. We view everyone as an individual and we understand that inclusion is more than just diversity – it’s about belonging. We celebrate the fact that everyone is unique and that’s what makes us so good at what we do. We pride ourselves on being a company that embraces difference and truly represents the global clients we work with.

 
Job posted by
Surabhi Deo

Senior Data Quality Engineer

at Quicken Inc

Founded 1982  •  Product  •  100-500 employees  •  Profitable
ETL
Informatica
Data Warehouse (DWH)
Python
ETL QA
Big Data
Bengaluru (Bangalore)
5 - 8 yrs
₹20L - ₹30L / yr
  • Graduate+ in Mathematics, Statistics, Computer Science, Economics, Business, Engineering or equivalent work experience.
  • Total experience of 5+ years with at least 2 years in managing data quality for high scale data platforms.
  • Good knowledge of SQL querying.
  • Strong skill in analysing data and uncovering patterns using SQL or Python.
  • Excellent understanding of data warehouse/big data concepts such as data extraction, data transformation, and data loading (the ETL process).
  • Strong background in automation and building automated testing frameworks for data ingestion and transformation jobs.
  • Experience in big data technologies a big plus.
  • Experience in machine learning, especially in data quality applications a big plus.
  • Experience in building data quality automation frameworks a big plus.
  • Strong experience working with an Agile development team with rapid iterations. 
  • Very strong verbal and written communication, and presentation skills.
  • Ability to quickly understand business rules.
  • Ability to work well with others in a geographically distributed team.
  • Keen observation skills to analyse data, highly detail oriented.
  • Excellent judgment, critical-thinking, and decision-making skills; can balance attention to detail with swift execution.
  • Able to identify stakeholders, build relationships, and influence others to get work done.
  • Self-directed and self-motivated individual who takes complete ownership of the product and its outcome.
Job posted by
Shreelakshmi M

Data Engineer

at RedSeer Consulting

Python
PySpark
SQL
pandas
Cloud Computing
Microsoft Windows Azure
Big Data
Bengaluru (Bangalore)
0 - 2 yrs
₹10L - ₹15L / yr

BRIEF DESCRIPTION:

At least 1 year of Python, Spark, SQL, and data engineering experience

Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, Redshift/Snowflake

Relevant Experience: Legacy ETL job Migration to AWS Glue / Python & Spark combination

 

ROLE SCOPE:

Reverse engineer the existing/legacy ETL jobs

Create the workflow diagrams and review the logic diagrams with Tech Leads

Write equivalent logic in Python & Spark

Unit test the Glue jobs and certify the data loads before passing to system testing

Follow the best practices, enable appropriate audit & control mechanism

Analytically skillful, identify the root causes quickly and efficiently debug issues

Take ownership of the deliverables and support the deployments

 

REQUIREMENTS:

Create data pipelines for data integration into cloud stacks, e.g. Azure Synapse

Code data processing jobs in Azure Synapse Analytics, Python, and Spark

Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.

Should be able to process .json, .parquet and .avro files

 

PREFERRED BACKGROUND:

Tier1/2 candidates from IIT/NIT/IIITs

However, relevant experience and a learning attitude take precedence

Job posted by
Raunak Swarnkar

Data Engineer - Python, Apache, Spark

at Spica Systems

Founded 2019  •  Products & Services  •  20-100 employees  •  Raised funding
Python
Apache Spark
Kolkata
3 - 5 yrs
₹7L - ₹12L / yr
We are a Silicon Valley-based start-up established in 2019, recognized as experts in building products and providing R&D and software development services in a wide range of leading-edge technologies such as LTE, 5G, cloud services (public: AWS, Azure, GCP; private: OpenStack) and Kubernetes. We have a highly scalable and secure 5G Packet Core Network, orchestrated by an ML-powered Kubernetes platform, which can be deployed in various multi-cloud modes along with a test tool. Headquartered in San Jose, California, we have our R&D centre in Sector V, Salt Lake, Kolkata.
 

Requirements:

  • Overall 3 to 5 years of experience in designing and implementing complex large scale Software.
  • Strong Python skills are a must.
  • Experience in Apache Spark, Scala, Java and Delta Lake
  • Experience in designing and implementing templated ETL/ELT data pipelines
  • Expert-level experience in data pipeline orchestration using Apache Airflow for large-scale production deployment
  • Experience in visualizing data from various tasks in the data pipeline using Apache Zeppelin/Plotly or any other visualization library.
  • Log management and log monitoring using ELK/Grafana
  • GitHub integration

 

Technology Stack: Apache Spark, Apache Airflow, Python, AWS, EC2, S3, Kubernetes, ELK, Grafana, Apache Arrow, Java

Job posted by
Priyanka Bhattacharya

Big Data Spark Lead

at Datametica Solutions Private Limited

Founded 2013  •  Products & Services  •  100-1000 employees  •  Profitable
Apache Spark
Big Data
Spark
Scala
Hadoop
MapReduce
Java
Apache Hive
Pune, Hyderabad
7 - 12 yrs
₹7L - ₹20L / yr
We at Datametica Solutions Private Limited are looking for a Big Data Spark Lead who has a passion for cloud, with knowledge of different on-premise and cloud data implementations in the field of Big Data and Analytics, including but not limited to Teradata, Netezza, Exadata, Oracle, Cloudera, Hortonworks and the like.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.

Job Description
Experience : 7+ years
Location : Pune / Hyderabad
Skills :
  • Drive and participate in requirements gathering workshops, estimation discussions, design meetings and status review meetings
  • Participate and contribute in Solution Design and Solution Architecture for implementing Big Data Projects on-premise and on cloud
  • Hands-on technical experience in designing, coding, developing and managing large Hadoop implementations
  • Proficient in SQL, Hive, Pig, Spark SQL, shell scripting, Kafka, Flume and Sqoop on large Big Data and data warehousing projects, with a Java, Python or Scala based Hadoop programming background
  • Proficient with various development methodologies like waterfall, agile/scrum and iterative
  • Good Interpersonal skills and excellent communication skills for US and UK based clients

About Us!
A global Leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.

We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.

Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.


We have our own products!
Eagle – Data Warehouse Assessment & Migration Planning Product
Raven – Automated Workload Conversion Product
Pelican – Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.

Why join us!
Datametica is a place to innovate, bring new ideas to life and learn new things. We believe in building a culture of innovation, growth and belonging. Our people and their dedication over these years are the key factors in achieving our success.

Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy

Check out more about us on our website below!
www.datametica.com
Job posted by
Sumangali Desai

Sr. Data Scientist

at www.claimgenius

Founded 2017  •  Product  •  100-500 employees  •  Raised funding
Data Science
Deep Learning
Python
Image Processing
CNN
Convolution neural network
OpenCV
Nagpur, Hyderabad
3 - 10 yrs
₹5L - ₹25L / yr

Responsibilities: 
 

  • The Machine & Deep Machine Learning Software Engineer (Expertise in Computer Vision) will be an early member of a growing team with responsibilities for designing and developing highly scalable machine learning solutions that impact many areas of our business. 
  • The individual in this role will help in the design and development of Neural Network (especially Convolution Neural Networks) & ML solutions based on our reference architecture which is underpinned by big data & cloud technology, micro-service architecture and high performing compute infrastructure. 
  • Typical daily activities include contributing to all phases of algorithm development, including ideation, prototyping, design, development, and production implementation.


Required Skills: 
 

  • An ideal candidate will have a background in software engineering and data science with expertise in machine learning algorithms, statistical analysis tools, and distributed systems. 
  • Experience in building machine learning applications, and broad knowledge of machine learning APIs, tools, and open-source libraries 
  • Strong coding skills and fundamentals in data structures, predictive modeling, and big data concepts 
  • Experience in designing full stack ML solutions in a distributed computing environment 
  • Experience working with Python, TensorFlow, Keras, scikit-learn, pandas, NumPy, Azure, AWS GPU
  • Excellent communication skills with multiple levels of the organization 
  • Image CNN, Image processing, MRCNN, FRCNN experience is a must.
Job posted by
KalyaniMuley

Hiring for Data Analyst - Assistant Manager - Chennai

at LatentView Analytics

Founded 2006  •  Products & Services  •  100-1000 employees  •  Profitable
Data Structures
Business Development
Data Analytics
Regression Testing
Machine Learning (ML)
R Programming
SQL server
MySQL
Python
Bengaluru (Bangalore), Chennai
9 - 14 yrs
₹9L - ₹14L / yr
Required Skill Set:

  • 5+ years of hands-on experience in delivering results-driven analytics solutions with proven business value
  • Great consulting and quantitative skills and a detail-oriented approach, with proven expertise in developing solutions using SQL, R, Python or similar tools
  • A background in Statistics / Econometrics / Applied Math / Operations Research would be considered a plus
  • Exposure to working with globally dispersed teams based out of India or other offshore locations

Role Description / Responsibilities:

  • Be the face of LatentView in the client's organization and help define analytics-driven consulting solutions to business problems
  • Translate business problems into analytic solution requirements and work with the LatentView team to develop high-quality solutions
  • Communicate effectively with the client / offshore team to manage client expectations and ensure timeliness and quality of insights
  • Develop expertise in the client's business and help translate that into increasingly high value-added advisory solutions for the client
  • Oversee project delivery to ensure the team meets the quality, productivity and SLA objectives
  • Grow the account in terms of revenue and the size of the team

You should apply if you want to:

  • Change the world with math and models: At the core, we believe that analytics can help drive business transformation and lasting competitive advantage. We work with a heavy mix of algorithms, analysis, large databases and ROI to positively transform many a client's business performance
  • Make a direct impact on business: Your contribution to delivering results-driven solutions can potentially lead to millions of dollars of additional revenue or profit for our clients
  • Thrive in a fast-paced environment: You work in small teams, in an entrepreneurial environment, and a meritorious culture that values speed, growth, diversity and contribution
  • Work with great people: Our selection process ensures that we hire only the very best, while more than 50% of our analysts and 90% of our managers are alumni of prestigious global institutions
Job posted by
Kannikanti madhuri

Big Data Engineer

at Crisp Analytics

Founded 2015  •  Products & Services  •  20-100 employees  •  Profitable
Spark
Apache Kafka
Hadoop
Pig
HDFS
Noida, NCR (Delhi | Gurgaon | Noida)
3 - 7 yrs
₹5L - ₹12L / yr
Together we will create wonderful solutions which deliver value for us and our customers.
Job posted by
Sneha Pandey

Enthusiastic Cloud-ML Engineers with a keen sense of curiosity

at Talent Sculpt

Founded 2012  •  Products & Services  •  100-1000 employees  •  Raised funding
Java
Python
Spark
Hadoop
MongoDB
Scala
Natural Language Processing (NLP)
Machine Learning (ML)
Bengaluru (Bangalore)
3 - 12 yrs
₹3L - ₹25L / yr
We are a start-up in India seeking excellence in everything we do, with unwavering curiosity and enthusiasm. We build a simplified, new-age, AI-driven Big Data analytics platform for global enterprises and solve their biggest business challenges. Our engineers develop fresh, intuitive solutions that keep the user at the center of everything. As a Cloud-ML Engineer, you will design and implement ML solutions for customer use cases and solve complex technical customer challenges.

Expectations and Tasks

  • Total of 7+ years of experience, with a minimum of 2 years in Hadoop technologies like HDFS, Hive, MapReduce
  • Experience working with recommendation engines, data pipelines, or distributed machine learning, and experience with data analytics and data visualization techniques and software
  • Experience with core Data Science techniques such as regression, classification or clustering, and experience with deep learning frameworks
  • Experience in NLP, R and Python
  • Experience in performance tuning and optimization techniques to process big data from heterogeneous sources
  • Ability to communicate clearly and concisely across technology and business teams
  • Excellent problem-solving and technical troubleshooting skills
  • Ability to handle multiple projects and prioritize tasks in a rapidly changing environment

Technical Skills

Core Java, Multithreading, Collections, OOPS, Python, R, Apache Spark, MapReduce, Hive, HDFS, Hadoop, MongoDB, Scala

We are a retained search firm employed by our client, a technology start-up in Bangalore. Interested candidates can share their resumes with me - [email protected]. I will respond to you within 24 hours. Online assessments and pre-employment screening are part of the selection process.
Job posted by
Blitzkrieg HR Consulting