PySpark Jobs in Bangalore (Bengaluru)

50+ PySpark Jobs in Bangalore (Bengaluru) | PySpark Job openings in Bangalore (Bengaluru)

Apply to 50+ PySpark Jobs in Bangalore (Bengaluru) on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

Publicis Sapient

10 recruiters
Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration and wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.


Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 5+ years of IT experience, with 3+ years in data-related technologies

2. Minimum 2.5 years of experience in Big Data technologies and working exposure to related data services on at least one cloud platform (AWS / Azure / GCP)

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines

4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery, etc.

6. Well-versed in, and working knowledge of, data platform related services on at least one cloud platform, IAM and data security


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Cloud data specialty and other related Big Data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


Publicis Sapient

10 recruiters
Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As Senior Associate L1 in Data Engineering, you will do technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration and wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is preferred.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 3.5+ years of IT experience, with 1.5+ years in data-related technologies

2. Minimum 1.5 years of experience in Big Data technologies

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery, etc.


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Working knowledge of data platform related services on at least one cloud platform, IAM and data security

7. Cloud data specialty and other related Big Data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Arting Digital
Posted by Pragati Bhardwaj
Bengaluru (Bangalore)
10 - 16 yrs
₹10L - ₹15L / yr
databricks
Data modeling
SQL
Python
AWS Lambda
+2 more

Title: Lead Data Engineer

Experience: 10+ years

Budget: 32-36 LPA

Location: Bangalore

Mode of Work: Work from office

Primary Skills: Databricks, Spark, PySpark, SQL, Python, AWS

Qualification: Any Engineering degree


Roles and Responsibilities:


• 8 - 10+ years’ experience in developing scalable Big Data applications or solutions on distributed platforms.

• Able to partner with others in solving complex problems by taking a broad perspective to identify innovative solutions.

• Strong skills building positive relationships across Product and Engineering.

• Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.

• Able to quickly pick up new programming languages, technologies, and frameworks.

• Experience working in Agile and Scrum development processes.

• Experience working in a fast-paced, results-oriented environment.

• Experience in Amazon Web Services (AWS), mainly S3, Managed Airflow, EMR/EC2, IAM, etc.

• Experience working with Data Warehousing tools, including SQL databases, Presto, and Snowflake.

• Experience architecting data products in Streaming, Serverless and Microservices architectures and platforms.

• Experience working with Data platforms, including EMR, Airflow, Databricks (Data Engineering & Delta Lake components, and Lakehouse Medallion architecture), etc.

• Experience creating/configuring Jenkins pipelines for a smooth CI/CD process for Managed Spark jobs, building Docker images, etc.

• Experience working with distributed technology tools, including Spark, Python, Scala.

• Working knowledge of Data warehousing, Data modelling, Governance and Data Architecture.

• Working knowledge of Reporting & Analytical tools such as Tableau, QuickSight, etc.

• Demonstrated experience in learning new technologies and skills.

• Bachelor’s degree in Computer Science, Information Systems, Business, or other relevant subject area.

Quinnox

2 recruiters
Posted by MidhunKumar T
Bengaluru (Bangalore), Mumbai
10 - 15 yrs
₹30L - ₹35L / yr
ADF
azure data lake services
SQL Azure
azure synapse
Spark
+4 more

Mandatory Skills: Azure Data Lake Storage, Azure SQL databases, Azure Synapse, Databricks (PySpark/Spark), Python, SQL, Azure Data Factory.


Good to have: Power BI, Azure IaaS services, Azure DevOps, Microsoft Fabric


• Very strong understanding of ETL and ELT

• Very strong understanding of Lakehouse architecture

• Very strong knowledge of PySpark and Spark architecture

• Good knowledge of Azure Data Lake architecture and access controls

• Good knowledge of Microsoft Fabric architecture

• Good knowledge of Azure SQL databases

• Good knowledge of T-SQL

• Good knowledge of the CI/CD process using Azure DevOps

• Power BI

Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
"DATA STREAMING"

Data Engineering : Senior Engineer / Manager


As Senior Engineer / Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.


Must Have skills :


1. GCP


2. Spark streaming : Live data streaming experience is desired.


3. Any one coding language: Java / Python / Scala



Skills & Experience :


- Minimum 5 years of overall experience, with at least 4 years of relevant experience in Big Data technologies


- Hands-on experience with the Hadoop stack - HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.


- Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred


- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery, etc.


- Well-versed in, and working knowledge of, data platform related services on GCP


- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position


Your Impact :


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation

A fast growing Big Data company
Noida, Bengaluru (Bangalore), Chennai, Hyderabad
6 - 8 yrs
₹10L - ₹15L / yr
AWS Glue
SQL
Python
PySpark
Data engineering
+6 more

AWS Glue Developer 

Work Experience: 6 to 8 Years

Work Location:  Noida, Bangalore, Chennai & Hyderabad

Must Have Skills: AWS Glue, DMS, SQL, Python, PySpark, Data Integrations and Data Ops

Job Reference ID: BT/F21/IND


Job Description:

Design, build and configure applications to meet business process and application requirements.


Responsibilities:

7 years of work experience with ETL, Data Modelling, and Data Architecture. Proficient in ETL optimization, designing, coding, and tuning big data processes using PySpark. Extensive experience building data platforms on AWS using core AWS services (Step Functions, EMR, Lambda, Glue, Athena, Redshift, Postgres, RDS, etc.) and designing/developing data engineering solutions. Orchestration using Airflow.


Technical Experience:

Hands-on experience developing a data platform and its components: data lake, cloud data warehouse, APIs, and batch and streaming data pipelines. Experience building data pipelines and applications to stream and process large datasets at low latencies.


➢ Enhancements, new development, defect resolution and production support of Big data ETL development using AWS native services.

➢ Create data pipeline architecture by designing and implementing data ingestion solutions.

➢ Integrate data sets using AWS services such as Glue, Lambda functions/ Airflow.

➢ Design and optimize data models on AWS Cloud using AWS data stores such as Redshift, RDS, S3, Athena.

➢ Author ETL processes using Python and PySpark.

➢ Build Redshift Spectrum direct transformations and data modelling using data in S3.

➢ ETL process monitoring using CloudWatch events.

➢ You will be working in collaboration with other teams. Good communication is a must.

➢ Must have experience in using AWS services API, AWS CLI and SDK


Professional Attributes:

➢ Experience operating very large data warehouses or data lakes.

➢ Expert-level skills in writing and optimizing SQL.

➢ Extensive, real-world experience designing technology components for enterprise solutions and defining solution architectures and reference architectures with a focus on cloud technology.

➢ Must have 6+ years of big data ETL experience using Python, S3, Lambda, DynamoDB, Athena, and Glue in an AWS environment.

➢ Expertise in S3, RDS, Redshift, Kinesis, EC2 clusters highly desired.


Qualification:

➢ Degree in Computer Science, Computer Engineering or equivalent.


Salary: Commensurate with experience and demonstrated competence

hopscotch
Bengaluru (Bangalore)
5 - 8 yrs
₹6L - ₹15L / yr
Python
Amazon Redshift
Amazon Web Services (AWS)
PySpark
Data engineering
+3 more

About the role:

Hopscotch is looking for a passionate Data Engineer to join our team. You will work closely with other teams such as data analytics, marketing, data science and individual product teams to specify, validate, prototype, scale, and deploy data pipeline features and data architecture.


Here’s what will be expected out of you:

➢ Ability to work with a fast-paced startup mindset. Should be able to manage all aspects of data extraction, transfer, and load activities.

➢ Develop data pipelines that make data available across platforms.

➢ Should be comfortable in executing ETL (Extract, Transform and Load) processes which include data ingestion, data cleaning and curation into a data warehouse, database, or data platform.

➢ Work on various aspects of the AI/ML ecosystem – data modeling, data and ML pipelines.

➢ Work closely with DevOps and the senior architect to come up with scalable system and model architectures for enabling real-time and batch services.


What we want:

➢ 5+ years of experience as a data engineer or data scientist with a focus on data engineering and ETL jobs.

➢ Well versed with the concept of Data warehousing, Data Modelling and/or Data Analysis.

➢ Experience using and building pipelines and performing ETL with industry-standard best practices on Redshift (2+ years).

➢ Ability to troubleshoot and solve performance issues with data ingestion, data processing & query execution on Redshift.

➢ Good understanding of orchestration tools like Airflow.

➢ Strong Python and SQL coding skills.

➢ Strong experience with distributed systems like Spark.

➢ Experience with AWS Data and ML technologies (AWS Glue, MWAA, Data Pipeline, EMR, Athena, Redshift, Lambda, etc.).

➢ Solid hands-on experience with various data extraction techniques like CDC or time/batch-based extraction and the related tools (Debezium, AWS DMS, Kafka Connect, etc.) for near-real-time and batch data extraction.


Note :

Experience with product-based or e-commerce companies is an added advantage.

Epik Solutions
Posted by Sakshi Sarraf
Bengaluru (Bangalore), Noida
4 - 13 yrs
₹7L - ₹18L / yr
Python
SQL
databricks
Scala
Spark
+2 more

Job Description:


As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:


Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.


Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.


Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.


Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.


Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.


Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.


Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.


Skills and Qualifications:


Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.

Proficiency in designing and developing data pipelines and ETL processes.

Solid understanding of data modeling concepts and database design principles.

Familiarity with data integration and orchestration using Azure Data Factory.

Knowledge of data quality management and data governance practices.

Experience with performance tuning and optimization of data pipelines.

Strong problem-solving and troubleshooting skills related to data engineering.

Excellent collaboration and communication skills to work effectively in cross-functional teams.

Understanding of cloud computing principles and experience with Azure services.


Kloud9 Technologies
Bengaluru (Bangalore)
3 - 6 yrs
₹5L - ₹20L / yr
Amazon Web Services (AWS)
Amazon EMR
EMR
Spark
PySpark
+9 more

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to the retail industry, giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers has been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not only gives us a unique perspective on the cloud market but also ensures that we deliver the available cloud solutions that best meet our clients' requirements.


What we are looking for:

● 3+ years’ experience developing Data & Analytic solutions

● Experience building data lake solutions leveraging one or more of the following: AWS, EMR, S3, Hive, and Spark

● Experience with relational SQL

● Experience with scripting languages such as Shell, Python

● Experience with source control tools such as GitHub and related dev process

● Experience with workflow scheduling tools such as Airflow

● In-depth knowledge of scalable cloud

● Has a passion for data solutions

● Strong understanding of data structures and algorithms

● Strong understanding of solution and technical design

● Has a strong problem-solving and analytical mindset

● Experience working with Agile Teams.

● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders

● Able to quickly pick up new programming languages, technologies, and frameworks

● Bachelor’s Degree in computer science


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations in the US, London, Poland and Bengaluru, we help build your career path in cutting-edge technologies such as AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with its creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.

Kloud9 Technologies
Bengaluru (Bangalore)
4 - 7 yrs
₹10L - ₹30L / yr
Google Cloud Platform (GCP)
PySpark
Python
Scala

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to the retail industry, giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers has been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not only gives us a unique perspective on the cloud market but also ensures that we deliver the available cloud solutions that best meet our clients' requirements.


● Overall 8+ years of experience in web application development.

● 5+ years of development experience with Java 8, Spring Boot, microservices and middleware.

● 3+ years of designing middleware using the Node.js platform.

● Good to have: 2+ years of experience using Node.js along with the AWS Serverless platform.

● Good experience with JavaScript / TypeScript, event loops, ExpressJS, GraphQL, SQL databases (MySQL), NoSQL databases (MongoDB) and YAML templates.

● Good experience with test-driven development (TDD) and automated unit testing.

● Good experience with exposing and consuming REST APIs on the Java 8 / Spring Boot platform and with Swagger API contracts.

● Good experience in building Node.js middleware performing transformations, routing, aggregation, orchestration and authentication (JWT/OAuth).

● Experience supporting and working with cross-functional teams in a dynamic environment.

● Experience working in Agile Scrum methodology.

● Very good problem-solving skills.

● Very good learner with a passion for technology.

● Excellent verbal and written communication skills in English.

● Ability to communicate effectively with team members and business stakeholders.


Secondary Skill Requirements:

 

● Experience working with any of Loopback, NestJS, Hapi.JS, Sails.JS, Passport.JS


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations in the US, London, Poland and Bengaluru, we help build your career path in cutting-edge technologies such as AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with its creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.

Tata Digital Pvt Ltd
Agency job
via Seven N Half by Priya Singh
Bengaluru (Bangalore)
8 - 13 yrs
₹10L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

 

Data Engineer

- Highly skilled and proficient in Azure Data Engineering tech stacks (ADF, Databricks)

- Should be well experienced in the design and development of Big Data integration platforms (Kafka, Hadoop).

- Highly skilled and experienced in building medium to complex data integration pipelines for data at rest and streaming data using Spark.

- Strong knowledge of R/Python.

- Advanced proficiency in solution design and implementation using Azure Data Lake, SQL and NoSQL databases.

- Strong in Data Warehousing concepts

- Expertise in SQL, SQL tuning, Data Management (Data Security), schema design, Python and ETL processes

- Highly motivated self-starter and quick learner

- Must have good knowledge of data modelling and an understanding of data analytics

- Exposure to statistical procedures, experiments and Machine Learning techniques is an added advantage.

- Experience in leading a small team of 6-7 Data Engineers.

- Excellent written and verbal communication skills

 

Bengaluru (Bangalore), Gurugram
2 - 8 yrs
₹10L - ₹35L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
Python
+11 more
Greetings!!

We are looking for a Machine Learning engineer for one of our premium clients.
Experience: 2-9 years
Location: Gurgaon/Bangalore
Tech Stack:

Python, PySpark, the Python Scientific Stack; MLflow, Grafana, Prometheus for machine learning pipeline management and monitoring; SQL, Airflow, Databricks, our own open-source data pipelining framework called Kedro, Dask/RAPIDS; Django, GraphQL and ReactJS for horizontal product development; container technologies such as Docker and Kubernetes, CircleCI/Jenkins for CI/CD, cloud solutions such as AWS, GCP, and Azure as well as Terraform and CloudFormation for deployment
Aureus Tech Systems

3 recruiters
Posted by Naveen Yelleti
Kolkata, Hyderabad, Chennai, Bengaluru (Bangalore), Bhubaneswar, Visakhapatnam, Vijayawada, Trichur, Thiruvananthapuram, Mysore, Delhi, Noida, Gurugram, Nagpur
1 - 7 yrs
₹4L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Skills and requirements

  • Experience analyzing complex and varied data in a commercial or academic setting.
  • Desire to solve new and complex problems every day.
  • Excellent ability to communicate scientific results to both technical and non-technical team members.


Desirable

  • A degree in a numerically focused discipline such as Maths, Physics, Chemistry, Engineering or Biological Sciences.
  • Hands-on experience with Python, PySpark, SQL
  • Hands-on experience building end-to-end data pipelines.
  • Hands-on experience with Azure Data Factory, Azure Databricks, Data Lake is an added advantage
  • Hands-on experience in building data pipelines.
  • Experience with Bigdata Tools, Hadoop, Hive, Sqoop, Spark, SparkSQL
  • Experience with SQL or NoSQL databases for the purposes of data retrieval and management.
  • Experience in data warehousing and business intelligence tools, techniques and technology, as well as experience in diving deep on data analysis or technical issues to come up with effective solutions.
  • BS degree in math, statistics, computer science or equivalent technical field.
  • Experience in data mining structured and unstructured data (SQL, ETL, data warehouse, Machine Learning etc.) in a business environment with large-scale, complex data sets.
  • Proven ability to look at solutions in unconventional ways. Sees opportunities to innovate and can lead the way.
  • Willing to learn and work on Data Science, ML, AI.
EnterpriseMinds

2 recruiters
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 7.5 yrs
₹10L - ₹25L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Spark
Software deployment
+1 more
Job ID: ZS0701

Hi,

We are hiring a Data Scientist for Bangalore.

Required Skills:

  • NLP
  • ML programming
  • Spark
  • Model Deployment
  • Experience processing unstructured data and building NLP models
  • Experience with big data tools such as PySpark
  • Pipeline orchestration using Airflow and model deployment experience is preferred
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 6 yrs
Best in industry
Python
PySpark
Data Science
Job ID: ZS070

Hi,

EnterpriseMinds is looking for a Data Scientist.

Strong in Python and PySpark.

Prefer immediate joiners
RedSeer Consulting

at RedSeer Consulting

2 recruiters
Raunak Swarnkar
Posted by Raunak Swarnkar
Bengaluru (Bangalore)
0 - 2 yrs
₹10L - ₹15L / yr
Python
PySpark
SQL
pandas
Cloud Computing
+2 more

BRIEF DESCRIPTION:

At least 1 year of Python, Spark, SQL, data engineering experience

Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, RedShift/Snowflake

Relevant Experience: Legacy ETL job Migration to AWS Glue / Python & Spark combination

 

ROLE SCOPE:

Reverse engineer the existing/legacy ETL jobs

Create the workflow diagrams and review the logic diagrams with Tech Leads

Write equivalent logic in Python & Spark

Unit test the Glue jobs and certify the data loads before passing to system testing

Follow the best practices, enable appropriate audit & control mechanism

Analytically skillful, identify the root causes quickly and efficiently debug issues

Take ownership of the deliverables and support the deployments

 

REQUIREMENTS:

Create data pipelines for data integration into cloud stacks, e.g. Azure Synapse

Code data processing jobs in Azure Synapse Analytics, Python, and Spark

Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.

Should be able to process .json, .parquet and .avro files

 

PREFERRED BACKGROUND:

Tier1/2 candidates from IIT/NIT/IIITs

However, relevant experience and a learning attitude take precedence

Top 3 Fintech Startup
Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹16L - ₹24L / yr
SQL
Amazon Web Services (AWS)
Spark
PySpark
Apache Hive

We are looking for an exceptionally talented Lead Data Engineer with exposure to implementing AWS services for building data pipelines, API integration and data warehouse design. A candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor’s degree in Science, Engineering or Applied Mathematics. Master’s degree preferred.

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS, etc.

• Has extensive experience in the Spark ecosystem and has worked on both real-time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step functions, Airflow, RDS, Aurora etc.

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team

Fragma Data Systems

at Fragma Data Systems

8 recruiters

Vamsikrishna G
Posted by Vamsikrishna G
Bengaluru (Bangalore)
2 - 10 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+1 more
Job Description:

Must Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
Indium Software

at Indium Software

16 recruiters
Karunya P
Posted by Karunya P
Bengaluru (Bangalore), Hyderabad
1 - 9 yrs
₹1L - ₹15L / yr
SQL
Python
Hadoop
HiveQL
Spark
+1 more

Responsibilities:

 

* 3+ years of Data Engineering Experience - Design, develop, deliver and maintain data infrastructures.

SQL Specialist – Strong knowledge and Seasoned experience with SQL Queries

Languages: Python

* Good communicator, shows initiative, works well with stakeholders.

* Experience working closely with Data Analysts, providing the data they need and guiding them on issues.

* Solid ETL experience and Hadoop/Hive/Pyspark/Presto/ SparkSQL

* Solid communication and articulation skills

* Able to handle stakeholders independently with minimal intervention from the reporting manager.

* Develop strategies to solve problems in logical yet creative ways.

* Create custom reports and presentations accompanied by strong data visualization and storytelling

 

We would be excited if you have:

 

* Excellent communication and interpersonal skills

* Ability to meet deadlines and manage project delivery

* Excellent report-writing and presentation skills

* Critical thinking and problem-solving capabilities

Top 3 Fintech Startup
Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹20L - ₹30L / yr
Amazon Web Services (AWS)
PySpark
SQL
Apache Spark
Python

We are looking for an exceptionally talented Lead Data Engineer with exposure to implementing AWS services for building data pipelines, API integration and data warehouse design. A candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor’s degree in Science, Engineering or Applied Mathematics. Master’s degree preferred.

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS, etc.

• Has extensive experience in the Spark ecosystem and has worked on both real-time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step functions, Airflow, RDS, Aurora etc.

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team

Pune, Bengaluru (Bangalore), Hyderabad
4 - 9 yrs
₹8L - ₹27L / yr
Python
PySpark
Amazon Web Services (AWS)
Spark
Scala
Greetings..

We have an urgent requirement for a Data Engineer/Sr. Data Engineer at a reputed MNC.

Exp: 4-9yrs

Location: Pune/Bangalore/Hyderabad

Skills: We need candidates with either Python and AWS, PySpark and AWS, or Spark and Scala
Persistent Systems

at Persistent Systems

1 video
1 recruiter
Agency job
via Milestone Hr Consultancy by Haina khan
Pune, Bengaluru (Bangalore), Hyderabad, Nagpur
4 - 9 yrs
₹4L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+3 more
Greetings..

We have an urgent requirement for Big Data Developer profiles at our reputed MNC.

Location: Pune/Bangalore/Hyderabad/Nagpur
Experience: 4-9yrs

Skills: PySpark, AWS
or Spark, Scala, AWS
or Python, AWS
Top 3 Fintech Startup
Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
4 - 7 yrs
₹11L - ₹17L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Computer Vision
Python
+6 more
Responsible for leading a team of analysts to build and deploy predictive models that infuse core business functions with deep analytical insights. The Senior Data Scientist will also work closely with the Kinara management team to investigate strategically important business questions.

Lead a team through the entire analytical and machine learning model life cycle:

• Define the problem statement
• Build and clean datasets
• Exploratory data analysis
• Feature engineering
• Apply ML algorithms and assess the performance
• Code for deployment
• Code testing and troubleshooting
• Communicate analysis to stakeholders
• Manage Data Analysts and Data Scientists
Gurugram, Pune, Bengaluru (Bangalore), Delhi, Noida, Ghaziabad, Faridabad
2 - 9 yrs
₹8L - ₹20L / yr
Python
Hadoop
Big Data
Spark
Data engineering
+3 more

Key Responsibilities : ( Data Developer Python, Spark)

Exp : 2 to 9 Yrs 

Development of data platforms, integration frameworks, processes, and code.

Develop and deliver APIs in Python or Scala for Business Intelligence applications build using a range of web languages

Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.

Elaborate stories in a collaborative agile environment (SCRUM or Kanban)

Familiarity with cloud platforms like GCP, AWS or Azure.

Experience with large data volumes.

Familiarity with writing rest-based services.

Experience with distributed processing and systems

Experience with Hadoop / Spark toolsets

Experience with relational database management systems (RDBMS)

Experience with Data Flow development

Knowledge of Agile and associated development techniques including:

Bengaluru (Bangalore), UK
5 - 10 yrs
₹15L - ₹25L / yr
Data Visualization
PowerBI
ADF
Business Intelligence (BI)
PySpark
+11 more

Power BI Developer

Senior visualization engineer with 5 years’ experience in Power BI to develop and deliver solutions that enable delivery of information to audiences in support of key business processes. In addition, hands-on experience with Azure data services like ADF and Databricks is a must.

Ensure code and design quality through execution of test plans and assist in development of standards & guidelines working closely with internal and external design, business, and technical counterparts.

Candidates should have worked in agile development environments.

Desired Competencies:

  • Should have minimum of 3 years project experience using Power BI on Azure stack.
  • Should have good understanding and working knowledge of Data Warehouse and Data Modelling.
  • Good hands-on experience of Power BI
  • Hands-on experience with T-SQL / DAX / MDX / SSIS
  • Data Warehousing on SQL Server (preferably 2016)
  • Experience in Azure Data Services – ADF, DataBricks & PySpark
  • Manage own workload with minimum supervision.
  • Take responsibility for projects or issues assigned to them
  • Be personable, flexible and a team player
  • Good written and verbal communications
  • Have a strong personality and be able to work directly with users
Greenway Health

at Greenway Health

2 recruiters
Agency job
via Vipsa Talent Solutions by Prashma S R
Bengaluru (Bangalore)
6 - 8 yrs
₹8L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+5 more
6-8 years of experience in data engineering
Spark
Hadoop
Big Data
Data engineering
PySpark
Python
AWS Lambda
SQL
hadoop
kafka
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Harpreet kour
Posted by Harpreet kour
Bengaluru (Bangalore)
1 - 6 yrs
₹10L - ₹15L / yr
Data engineering
Big Data
PySpark
SQL
Python
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
Virtusa

at Virtusa

2 recruiters
Agency job
via Response Informatics by Anupama Lavanya Uppala
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr
PySpark
Python
  • Minimum 1 year of relevant experience in PySpark (mandatory)
  • Hands-on experience in developing, testing, deploying, maintaining and improving data integration pipelines in an AWS cloud environment is an added plus
  • Ability to play a lead role and independently manage a PySpark development team of 3-5 members
  • EMR, Python and PySpark are mandatory.
  • Knowledge and awareness of working with AWS Cloud technologies like Apache Spark, Glue, Kafka, Kinesis, and Lambda in S3, Redshift, RDS
UAE Client
Agency job
via Fragma Data Systems by Harpreet kour
Dubai, Bengaluru (Bangalore)
4 - 8 yrs
₹6L - ₹16L / yr
Data engineering
Data Engineer
Big Data
Big Data Engineer
Apache Spark
+3 more
• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance tuning with respect to executor sizing and other environmental parameters, code optimization, partition tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.

Must Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad
0 - 1 yrs
₹3L - ₹3.5L / yr
SQL
Data engineering
Data Engineer
Python
Big Data
+1 more
Strong Programmer with expertise in Python and SQL
 
● Hands-on Work experience in SQL/PLSQL
● Expertise in at least one popular Python framework (like Django, Flask or Pyramid)
● Knowledge of object-relational mapping (ORM)
● Familiarity with front-end technologies (like JavaScript and HTML5)
● Willingness to learn & upgrade to big data and cloud technologies like PySpark, Azure, etc.
● Team spirit
● Good problem-solving skills
● Write effective, scalable code
Hammoq

at Hammoq

1 recruiter
Nikitha Muthuswamy
Posted by Nikitha Muthuswamy
Remote, Indore, Ujjain, Hyderabad, Bengaluru (Bangalore)
5 - 8 yrs
₹5L - ₹15L / yr
pandas
NumPy
Data engineering
Data Engineer
Apache Spark
+6 more
  • Does analytics to extract insights from raw historical data of the organization. 
  • Generates usable training dataset for any/all MV projects with the help of Annotators, if needed.
  • Analyses user trends, and identifies their biggest bottlenecks in Hammoq Workflow.
  • Tests the short/long term impact of productized MV models on those trends.
  • Skills - NumPy, pandas, Apache Spark, PySpark, ETL (mandatory).
Infogain
Agency job
via Technogen India PvtLtd by RAHUL BATTA
Bengaluru (Bangalore), Pune, Noida, NCR (Delhi | Gurgaon | Noida)
7 - 10 yrs
₹20L - ₹25L / yr
Data engineering
Python
SQL
Spark
PySpark
+10 more
  1. Sr. Data Engineer:

 Core Skills – Data Engineering, Big Data, Pyspark, Spark SQL and Python

Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred

Major accountabilities:

  • Responsible for Data Engineering, Foundry Data Pipeline Creation, Foundry Analysis & Reporting, Slate Application development, re-usable code development & management and Integrating Internal or External System with Foundry for data ingestion with high quality.
  • Have a good understanding of the Foundry Platform landscape and its capabilities
  • Performs data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
  • Defines company data assets (data models), Pyspark, spark SQL, jobs to populate data models.
  • Designs data integrations and data quality framework.
  • Design & Implement integration with Internal, External Systems, F1 AWS platform using Foundry Data Connector or Magritte Agent
  • Collaboration with data scientists, data analyst and technology teams to document and leverage their understanding of the Foundry integration with different data sources - Actively participate in agile work practices
  • Coordinating with the Quality Engineer to ensure that all quality controls, naming conventions & best practices have been followed

Desired Candidate Profile :

  • Strong data engineering background
  • Experience with Clinical Data Model is preferred
  • Experience in
    • SQL Server, Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
    • Java and Groovy for our back-end applications and data integration tools
    • Python for data processing and analysis
    • Cloud infrastructure based on AWS EC2 and S3
  • 7+ years IT experience, 2+ years’ experience in Palantir Foundry Platform, 4+ years’ experience in Big Data platform
  • 5+ years of Python and Pyspark development experience
  • Strong troubleshooting and problem solving skills
  • BTech or master's degree in computer science or a related technical field
  • Experience designing, building, and maintaining big data pipelines systems
  • Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
  • Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
  • Hands-on in programming languages primarily Python, R, Java, Unix shell scripts
  • Hands-on experience in AWS / Azure cloud platform and stack
  • Strong in API based architecture and concept, able to do quick PoC using API integration and development
  • Knowledge of machine learning and AI
  • Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.

 Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision

MNC

at MNC

Agency job
via Fragma Data Systems by Harpreet kour
Bengaluru (Bangalore)
2 - 4 yrs
₹10L - ₹15L / yr
PySpark
SQL
• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance tuning with respect to executor sizing and other environmental parameters, code optimization, partition tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore)
3.5 - 8 yrs
₹5L - ₹18L / yr
PySpark
Data engineering
Data Warehouse (DWH)
SQL
Spark
+1 more
Must-Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
 
 
Technology Skills (Good to Have):
  • Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
  • Experience in migrating on-premise data warehouses to data platforms on AZURE cloud. 
  • Designing and implementing data engineering, ingestion, and transformation functions
  • Azure Synapse or Azure SQL data warehouse
  • Spark on Azure (available in HDInsight and Databricks)
 
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/Stream sets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
RentoMojo

at RentoMojo

1 video
5 recruiters
Anand Pandey
Posted by Anand Pandey
Bengaluru (Bangalore)
1 - 2 yrs
₹5L - ₹7L / yr
Business Analysis
Windows Azure
PySpark
SQL
Data Warehouse (DWH)
+4 more
RESPONSIBILITIES & OWNERSHIP: THINGS THE ROLE CAN'T MISS
  • Setting KPIs, monitoring key trends, and helping stakeholders by generating insights from the data delivered.
  • Understanding user behaviour and performing root-cause analysis of changes in data trends across different verticals.
  • Get answers to business questions, identify areas of improvement, and identify opportunities for growth.
  • Work on ad-hoc requests for data and analysis.
  • Work with cross-functional teams as and when required to automate reports and create informative dashboards based on problem statements.

WHO COULD BE A GREAT FIT:
Functional Experience
  • 1-2 years of experience working in Analytics as a Business or Data Analyst.
  • Analytical mind with a problem-solving aptitude.
  • Familiarity with Microsoft Azure and AWS, PySpark, Python, Databricks and Metabase; understanding of APIs, data warehouses, ETL, etc.
  • Proficient in writing Complex Queries in SQL.
  • Experience in Performing hands-on analysis on data and across multiple datasets and databases primarily using Excel, Google Sheets and R.
  • Ability to work across cross-functional teams proactively.
Curl Analytics
Agency job
via wrackle by Naveen Taalanki
Bengaluru (Bangalore)
5 - 10 yrs
₹15L - ₹30L / yr
ETL
Big Data
Data engineering
Apache Kafka
PySpark
+11 more
What you will do
  • Bring in industry best practices around creating and maintaining robust data pipelines for complex data projects with/without AI component
    • programmatically ingesting data from several static and real-time sources (incl. web scraping)
    • rendering results through dynamic interfaces incl. web / mobile / dashboard with the ability to log usage and granular user feedback
    • performance tuning and optimal implementation of complex Python scripts (using SPARK), SQL (using stored procedures, HIVE), and NoSQL queries in a production environment
  • Industrialize ML / DL solutions and deploy and manage production services; proactively handle data issues arising on live apps
  • Perform ETL on large and complex datasets for AI applications - work closely with data scientists on performance optimization of large-scale ML/DL model training
  • Build data tools to facilitate fast data cleaning and statistical analysis
  • Ensure data architecture is secure and compliant
  • Resolve issues escalated from Business and Functional areas on data quality, accuracy, and availability
  • Work closely with APAC CDO and coordinate with a fully decentralized team across different locations in APAC and global HQ (Paris).

You should be

  • Expert in structured and unstructured data in traditional and big data environments – Oracle / SQL Server, MongoDB, Hive / Pig, BigQuery, and Spark
  • Have excellent knowledge of Python programming both in traditional and distributed models (PySpark)
  • Expert in shell scripting and writing schedulers
  • Hands-on experience with Cloud - deploying complex data solutions in hybrid cloud / on-premise environment both for data extraction/storage and computation
  • Hands-on experience in deploying production apps using large volumes of data with state-of-the-art technologies like Docker, Kubernetes, and Kafka
  • Strong knowledge of data security best practices
  • 5+ years experience in a data engineering role
  • Science / Engineering graduate from a Tier-1 university in the country
  • And most importantly, you must be a passionate coder who really cares about building apps that can help people do things better, smarter, and faster even when they sleep
Bengaluru (Bangalore)
5 - 6 yrs
₹10L - ₹12L / yr
Mlops
Kubernetes
Docker
Ansible
PySpark
+3 more
  • Automate and maintain ML and Data pipelines at scale
  • Collaborate with Data Scientists and Data Engineers on feature development teams to containerize and build out deployment pipelines for new modules
  • Maintain and expand our on-prem deployments with spark clusters
  • Design, build and optimize applications containerization and orchestration with Docker and Kubernetes and AWS or Azure
Skills:
  • 5 years of IT experience in data-driven or AI technology products
  • Understanding of ML Model Deployment and Lifecycle
  • Extensive experience in Apache Airflow for MLOps workflow automation
  • Experience in building and automating data pipelines
  • Experience in working on Spark Cluster architecture
  • Extensive experience with Unix/Linux environments
  • Experience with standard concepts and technologies used in CI/CD build, deployment pipelines using Jenkins
  • Strong experience in Python and PySpark and building required automation (using standard technologies such as Docker, Jenkins, and Ansible).
  • Experience with Kubernetes or Docker Swarm
  • Working technical knowledge of current systems software, protocols, and standards, including firewalls, Active Directory, etc.
  • Basic knowledge of Multi-tier architectures: load balancers, caching, web servers, application servers, and databases.
  • Experience with various virtualization technologies and multi-tenant, private and hybrid cloud environments.
  • Hands-on software and hardware troubleshooting experience.
  • Experience documenting and maintaining configuration and process information.
  • Basic Knowledge of machine learning frameworks: Tensorflow, Caffe/Caffe2, Pytorch
Marktine

at Marktine

1 recruiter
Vishal Sharma
Posted by Vishal Sharma
Remote, Bengaluru (Bangalore)
3 - 6 yrs
₹10L - ₹20L / yr
Big Data
Spark
PySpark
Data engineering
Data Warehouse (DWH)
+5 more

Azure – Data Engineer

  • At least 2 years hands on experience working with an Agile data engineering team working on big data pipelines using Azure in a commercial environment.
  • Dealing with senior stakeholders/leadership
  • Understanding of Azure data security and encryption best practices. [ADFS/ACLs]

Databricks – experience writing in and using Databricks, using Python to transform and manipulate data.

Data Factory – experience using data factory in an enterprise solution to build data pipelines. Experience calling rest APIs.

Synapse/data warehouse – experience using synapse/data warehouse to present data securely and to build & manage data models.

Microsoft SQL server – We’d expect the candidate to have come from a SQL/Data background and progressed into Azure

PowerBI – Experience with this is preferred

Additionally

  • Experience using GIT as a source control system
  • Understanding of DevOps concepts and application
  • Understanding of Azure Cloud costs/management and running platforms efficiently
MNC

at MNC

Agency job
via Fragma Data Systems by Priyanka U
Remote, Bengaluru (Bangalore)
5 - 8 yrs
₹12L - ₹20L / yr
PySpark
SQL
Data Warehouse (DWH)
ETL
SQL Developer with Relevant experience of 7 Yrs with Strong Communication Skills.
 
Key responsibilities:
 
  • Creating, designing and developing data models
  • Prepare plans for all ETL (Extract/Transformation/Load) procedures and architectures
  • Validating results and creating business reports
  • Monitoring and tuning data loads and queries
  • Develop and prepare a schedule for a new data warehouse
  • Analyze large databases and recommend appropriate optimization for the same
  • Administer all requirements and design various functional specifications for data
  • Provide support to the Software Development Life cycle
  • Prepare various code designs and ensure efficient implementation of the same
  • Evaluate all codes and ensure the quality of all project deliverables
  • Monitor data warehouse work and provide subject matter expertise
  • Hands-on BI practices, data structures, data modeling, SQL skills
  • Minimum 1 year experience in Pyspark
MNC

at MNC

Agency job
via Fragma Data Systems by Priyanka U
Remote, Bengaluru (Bangalore)
2 - 6 yrs
₹6L - ₹15L / yr
Spark
Apache Kafka
PySpark
Internet of Things (IOT)
Real time media streaming

JD for IOT DE:

 

The role requires experience in Azure core technologies – IoT Hub/ Event Hub, Stream Analytics, IoT Central, Azure Data Lake Storage, Azure Cosmos, Azure Data Factory, Azure SQL Database, Azure HDInsight / Databricks, SQL data warehouse.

 

You Have:

  • Minimum 2 years of software development experience
  • Minimum 2 years of experience in IoT/streaming data pipelines solution development
  • Bachelor's and/or Master’s degree in computer science
  • Strong Consulting skills in data management including data governance, data quality, security, data integration, processing, and provisioning
  • Delivered data management projects with real-time/near real-time data insights delivery on Azure Cloud
  • Translated complex analytical requirements into the technical design including data models, ETLs, and Dashboards / Reports
  • Experience deploying dashboards and self-service analytics solutions on both relational and non-relational databases
  • Experience with different computing paradigms in databases such as In-Memory, Distributed, Massively Parallel Processing
  • Successfully delivered large scale IOT data management initiatives covering Plan, Design, Build and Deploy phases leveraging different delivery methodologies including Agile
  • Experience in handling telemetry data with Spark Streaming, Kafka, Flink, Scala, Pyspark, Spark SQL.
  • Hands-on experience with containers and Docker
  • Exposure to streaming protocols like MQTT and AMQP
  • Knowledge of OT network protocols like OPC UA, CAN Bus, and similar protocols
  • Strong knowledge of continuous integration, static code analysis, and test-driven development
  • Experience in delivering projects in a highly collaborative delivery model with teams at onsite and offshore
  • Must have excellent analytical and problem-solving skills
  • Delivered change management initiatives focused on driving data platforms adoption across the enterprise
  • Strong verbal and written communications skills are a must, as well as the ability to work effectively across internal and external organizations
     

Roles & Responsibilities
 

You Will:

  • Translate functional requirements into technical design
  • Interact with clients and internal stakeholders to understand the data and platform requirements in detail and determine core Azure services needed to fulfill the technical design
  • Design, Develop and Deliver data integration interfaces in ADF and Azure Databricks
  • Design, Develop and Deliver data provisioning interfaces to fulfill consumption needs
  • Deliver data models on Azure platform, it could be on Azure Cosmos, SQL DW / Synapse, or SQL
  • Advise clients on ML Engineering and deploying ML Ops at Scale on AKS
  • Automate core activities to minimize the delivery lead times and improve the overall quality
  • Optimize platform cost by selecting the right platform services and architecting the solution in a cost-effective manner
  • Deploy Azure DevOps and CI CD processes
  • Deploy logging and monitoring across the different integration points for critical alerts

 

Marktine

at Marktine

1 recruiter
Vishal Sharma
Posted by Vishal Sharma
Remote, Bengaluru (Bangalore)
3 - 10 yrs
₹5L - ₹15L / yr
Big Data
ETL
PySpark
SSIS
Microsoft Windows Azure
+4 more

Must Have Skills:

- Solid Knowledge on DWH, ETL and Big Data Concepts

- Excellent SQL Skills (With knowledge of SQL Analytics Functions)

- Working Experience on any ETL tool i.e. SSIS / Informatica

- Working Experience on any Azure or AWS Big Data Tools.

- Experience on Implementing Data Jobs (Batch / Real time Streaming)

- Excellent written and verbal communication skills in English, Self-motivated with strong sense of ownership and Ready to learn new tools and technologies

Preferred Skills:

- Experience with PySpark / Spark SQL

- AWS Data Tools (AWS Glue, AWS Athena)

- Azure Data Tools (Azure Databricks, Azure Data Factory)

Other Skills:

- Knowledge of Azure Blob, Azure File Storage, AWS S3, Elasticsearch / Redis Search

- Knowledge of the domain/function (across pricing, promotions and assortment)

- Implementation experience with a Schema and Data Validator framework (Python / Java / SQL)

- Knowledge of DQS and MDM

Key Responsibilities:

- Independently work on ETL / DWH / Big data Projects

- Gather and process raw data at scale.

- Design and develop data applications using selected tools and frameworks as required and requested.

- Read, extract, transform, stage and load data to selected tools and frameworks as required and requested.

- Perform tasks such as writing scripts, web scraping, calling APIs, writing SQL queries, etc.

- Work closely with the engineering team to integrate your work into our production systems.

- Process unstructured data into a form suitable for analysis.

- Analyse processed data.

- Support business decisions with ad hoc analysis as needed.

- Monitor data performance and modify infrastructure as needed.

Responsibility: Smart Resource, having excellent communication skills

 

 
Marktine

at Marktine

1 recruiter
Vishal Sharma
Posted by Vishal Sharma
Remote, Bengaluru (Bangalore)
3 - 7 yrs
₹5L - ₹10L / yr
Data Warehouse (DWH)
Spark
Data engineering
Python
PySpark
+5 more

Basic Qualifications

- Need to have a working knowledge of AWS Redshift.

- Minimum 1 year of designing and implementing a fully operational production-grade large-scale data solution on Snowflake Data Warehouse.

- 3 years of hands-on experience with building productized data ingestion and processing pipelines using Spark, Scala, Python

- 2 years of hands-on experience designing and implementing production-grade data warehousing solutions

- Expertise and excellent understanding of Snowflake Internals and integration of Snowflake with other data processing and reporting technologies

- Excellent presentation and communication skills, both written and verbal

- Ability to problem-solve and architect in an environment with unclear requirements

Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore)
1 - 5 yrs
₹5L - ₹15L / yr
Spark
PySpark
Big Data
Python
SQL
+1 more
Must-Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience in SQL DBs; able to write queries of fair complexity
• Excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture: business rules processing and data extraction from the Data Lake into data streams for business consumption
• Good customer communication
• Good analytical skills
 
 
Technology Skills (Good to Have):
  • Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse / Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs / IoT Hub
  • Experience in migrating on-premises data warehouses to data platforms on the Azure cloud
  • Designing and implementing data engineering, ingestion, and transformation functions
  • Azure Synapse or Azure SQL Data Warehouse
  • Spark on Azure, as available in HDInsight and Databricks
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad
3 - 9 yrs
₹8L - ₹20L / yr
PySpark
Data engineering
Data Engineer
Windows Azure
ADF
+2 more
Must-Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience in SQL DBs; able to write queries of fair complexity
• Excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture: business rules processing and data extraction from the Data Lake into data streams for business consumption
• Good customer communication
• Good analytical skills
 
 
Technology Skills (Good to Have):
  • Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse / Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs / IoT Hub
  • Experience in migrating on-premises data warehouses to data platforms on the Azure cloud
  • Designing and implementing data engineering, ingestion, and transformation functions
  • Azure Synapse or Azure SQL Data Warehouse
  • Spark on Azure, as available in HDInsight and Databricks
 
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity / StreamSets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
A Global IT Service company
Bengaluru (Bangalore)
5 - 8 yrs
₹20L - ₹30L / yr
Data engineering
Data Bricks
data engineer
PySpark
ETL
+3 more
  • Insurance P&C and Specialty domain experience a plus
  • Experience in a cloud-based architecture preferred, such as Databricks, Azure Data Lake, Azure Data Factory, etc.
  • Strong understanding of ETL fundamentals and solutions. Should be proficient in writing advanced / complex SQL, expertise in performance tuning and optimization of SQL queries required.
  • Strong experience in Python/PySpark and Spark SQL
  • Experience in troubleshooting data issues, analyzing end to end data pipelines, and working with various teams in resolving issues and solving complex problems.
  • Strong experience developing Spark applications using PySpark and SQL for data extraction, transformation, and aggregation from multiple formats for analyzing & transforming the data to uncover insights and actionable intelligence for internal and external use
BDI Plus Lab

at BDI Plus Lab

2 recruiters
Silita S
Posted by Silita S
Bengaluru (Bangalore)
3 - 7 yrs
₹5L - ₹12L / yr
Big Data
Hadoop
Java
Python
PySpark
+1 more

Roles and responsibilities:

 

  1. Responsible for development and maintenance of applications with technologies involving Enterprise Java and distributed technologies.
  2. Experience in Hadoop, Kafka, Spark, Elasticsearch, SQL, Kibana, and Python, plus experience with machine learning and analytics.
  3. Collaborate with developers, product managers, business analysts and business users in conceptualizing, estimating and developing new software applications and enhancements.
  4. Collaborate with QA team to define test cases, metrics, and resolve questions about test results.
  5. Assist in the design and implementation process for new products, research and create POC for possible solutions.
  6. Develop components based on business and/or application requirements
  7. Create unit tests in accordance with team policies & procedures
  8. Advise, and mentor team members in specialized technical areas as well as fulfill administrative duties as defined by support process
  9. Work with cross-functional teams during crisis to address and resolve complex incidents and problems in addition to assessment, analysis, and resolution of cross-functional issues. 
MNC
Remote, Bengaluru (Bangalore)
4 - 12 yrs
₹15L - ₹30L / yr
PySpark
Pyspark Developer
Scala
DevOps

EXP-Developer-4 to 12 years

 

 

Must have low-level design and development skills. Should be able to design a solution for given use cases.

  • Agile delivery: must be able to show design and code on a daily basis
  • Must be an experienced PySpark developer with Scala coding skills. Primary skill is PySpark
  • Must have experience in designing job orchestration, sequence, metadata design, Audit trail, dynamic parameter passing and error/exception handling
  • Good experience with unit, integration and UAT support
  • Able to design and code reusable components and functions
  • Should be able to review design and code and provide review comments with justification
  • Zeal to learn and adopt new tools/technologies
  • Good to have experience with DevOps and CI/CD
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune
8 - 15 yrs
₹16L - ₹28L / yr
PySpark
SQL Azure
azure synapse
Windows Azure
Azure Data Engineer
+3 more
Technology Skills:
  • Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse / Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs / IoT Hub
  • Experience in migrating on-premises data warehouses to data platforms on the Azure cloud
  • Designing and implementing data engineering, ingestion, and transformation functions
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity / StreamSets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
Antuit

at Antuit

1 recruiter
Purnendu Shakunt
Posted by Purnendu Shakunt
Bengaluru (Bangalore)
4 - 7 yrs
₹15L - ₹20L / yr
Data Science
Machine Learning (ML)
Artificial Intelligence (AI)
Python
Algorithms
+4 more

About antuit.ai

 

Antuit.ai is the leader in AI-powered SaaS solutions for Demand Forecasting & Planning, Merchandising and Pricing. We have the industry’s first solution portfolio – powered by Artificial Intelligence and Machine Learning – that can help you digitally transform your Forecasting, Assortment, Pricing, and Personalization solutions. World-class retailers and consumer goods manufacturers leverage antuit.ai solutions, at scale, to drive outsized business results globally with higher sales, margin and sell-through.

 

Antuit.ai’s executives, comprised of industry leaders from McKinsey, Accenture, IBM, and SAS, and our team of Ph.Ds., data scientists, technologists, and domain experts, are passionate about delivering real value to our clients. Antuit.ai is funded by Goldman Sachs and Zodius Capital.

 

The Role:

 

Antuit is looking for a Data / Sr. Data Scientist who has the knowledge and experience in developing machine learning algorithms, particularly in supply chain and forecasting domain with data science toolkits like Python.

 

In this role, you will design the approach, develop and test machine learning algorithms, and implement the solution. The candidate should have excellent communication skills and be results-driven, with a customer-centric approach to problem solving. Experience working in the demand forecasting or supply chain domain is a plus. This job also requires the ability to operate in a multi-geographic delivery environment and a good understanding of cross-cultural sensitivities.

 

Responsibilities:

 

Responsibilities include, but are not limited to, the following:

 

  • Design, build, test, and implement predictive Machine Learning models.
  • Collaborate with client to align business requirements with data science systems and process solutions that ensure client’s overall objectives are met.
  • Create meaningful presentations and analysis that tell a “story” focused on insights, to communicate the results/ideas to key decision makers.
  • Collaborate cross-functionally with domain experts to identify gaps and structural problems.
  • Contribute to standard business processes and practices as part of a community of practice.
  • Be the subject matter expert across multiple work streams and clients.
  • Mentor and coach team members.
  • Set a clear vision for the team members and working cohesively to attain it.

 

Qualifications and Skills:

 

Requirements

  • Experience / Education:
    • Master’s or Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, Statistics, Applied Mathematics or other related 
  • 5+ years’ experience working in applied machine learning or relevant research experience for recent Ph.D. graduates.
  • Highly technical:
  • Skilled in machine learning, problem-solving, pattern recognition and predictive modeling with expertise in PySpark and Python.
  • Understanding of data structures and data modeling.
  • Effective communication and presentation skills
  • Able to collaborate closely and effectively with teams.
  • Experience in time series forecasting is preferred.
  • Experience working in start-up type environment preferred.
  • Experience in CPG and/or Retail preferred.
  • Strong management track record.
  • Strong inter-personal skills and leadership qualities.

 

Information Security Responsibilities

  • Understand and adhere to Information Security policies, guidelines and procedures, and practice them to protect organizational data and information systems.
  • Take part in Information Security training and act accordingly while handling information.
  • Report all suspected security and policy breaches to the Infosec team or the appropriate authority (CISO).

 

EEOC

 

Antuit.ai is an at-will, equal opportunity employer.  We consider applicants for all positions without regard to race, color, religion, national origin or ancestry, gender identity, sex, age (40+), marital status, disability, veteran status, or any other legally protected status under local, state, or federal law.
MNC

at MNC

Agency job
via Fragma Data Systems by Priyanka U
Bengaluru (Bangalore)
3 - 7 yrs
₹8L - ₹16L / yr
PySpark
Python
Spark
Roles and Responsibilities:

• Responsible for developing and maintaining applications with PySpark.
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance tuning with respect to executor sizing and other environmental parameters, code optimization, partition tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.

Must-Have Skills:

• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience in SQL DBs; able to write queries of fair complexity
• Excellent experience in Big Data programming for data transformation and aggregations
• Good at ETL architecture: business rules processing and data extraction from the Data Lake into data streams for business consumption
• Good customer communication
• Good analytical skills
Get to hear about interesting companies hiring right now
Follow Cutshort on LinkedIn
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.
Find more jobs