Cutshort logo
PySpark Jobs in Mumbai

23+ PySpark Jobs in Mumbai | PySpark Job openings in Mumbai

Apply to 23+ PySpark Jobs in Mumbai on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

icon
venanalytics

at venanalytics

2 candid answers
Rincy jain
Posted by Rincy jain
Mumbai
4 - 5 yrs
₹10L - ₹15L / yr
skill iconPython
Google Cloud Platform (GCP)
Big Data
DevOps
CI/CD
+3 more

About the Role


We are looking for a highly motivated Project Manager with a strong background in cloud technologies, big data ecosystems, and software development lifecycles to lead cross-functional teams in delivering high-impact projects. The ideal candidate will combine excellent project management skills with technical acumen in GCP, DevOps, and Python-based applications.


Key Responsibilities

  • Lead end-to-end project planning, execution, and delivery, ensuring alignment with business goals and timelines.
  • Create and maintain project documentation including detailed timelines, sprint boards, risk logs, and weekly status reports.
  • Facilitate Agile ceremonies: daily stand-ups, sprint planning, retrospectives, and backlog grooming.
  • Actively manage risks, scope changes, resource allocation, and project dependencies to ensure delivery without disruptions.
  • Ensure compliance with QA processes and security/compliance standards throughout the SDLC.
  • Collaborate with stakeholders and senior leadership to communicate progress, blockers, and key milestones.
  • Provide mentorship and support to cross-functional team members to drive continuous improvement and team performance.
  • Coordinate with clients and act as a key point of contact for requirement gathering, updates, and escalations.


Required Skills & Experience


Cloud & DevOps

  • Proficient in Google Cloud Platform (GCP) services: Compute, Storage, Networking, IAM.
  • Hands-on experience with cloud deployments and infrastructure as code.
  • Strong working knowledge of CI/CD pipelines, Docker, Kubernetes, and Terraform (or similar tools).

Big Data & Data Engineering

  • Experience with large-scale data processing using tools like PySpark, Hadoop, Hive, HDFS, and Spark Streaming (preferred).
  • Proven experience in managing and optimizing big data pipelines and ensuring high performance.

Programming & Frameworks

  • Strong proficiency in Python with experience in Django (REST APIs, ORM, deployment workflows).
  • Familiarity with Git and version control best practices.
  • Basic knowledge of Linux administration and shell scripting.


Nice to Have

  • Knowledge or prior experience in the Media & Advertising domain.
  • Experience in client-facing roles and handling stakeholder communications.
  • Proven ability to manage technical teams (5–6 members).


Why Join Us?

  • Work on cutting-edge cloud and data engineering projects
  • Collaborate with a talented, fast-paced team
  • Flexible work setup and culture of ownership
  • Continuous learning and upskilling environment
  • Inclusive health benefits included







Read more
Wissen Technology

at Wissen Technology

4 recruiters
Praffull Shinde
Posted by Praffull Shinde
Pune, Mumbai, Bengaluru (Bangalore)
4 - 8 yrs
₹14L - ₹26L / yr
skill iconPython
PySpark
skill iconDjango
skill iconFlask
RESTful APIs
+3 more

Job title - Python developer

Exp – 4 to 6 years

Location – Pune/Mum/B’lore

 

PFB JD

Requirements:

  • Proven experience as a Python Developer
  • Strong knowledge of core Python and Pyspark concepts
  • Experience with web frameworks such as Django or Flask
  • Good exposure to any cloud platform (GCP Preferred)
  • CI/CD exposure required
  • Solid understanding of RESTful APIs and how to build them
  • Experience working with databases like Oracle DB and MySQL
  • Ability to write efficient SQL queries and optimize database performance
  • Strong problem-solving skills and attention to detail
  • Strong SQL programing (stored procedure, functions)
  • Excellent communication and interpersonal skill

Roles and Responsibilities

  • Design, develop, and maintain data pipelines and ETL processes using pyspark
  • Work closely with data scientists and analysts to provide them with clean, structured data.
  • Optimize data storage and retrieval for performance and scalability.
  • Collaborate with cross-functional teams to gather data requirements.
  • Ensure data quality and integrity through data validation and cleansing processes.
  • Monitor and troubleshoot data-related issues to ensure data pipeline reliability.
  • Stay up to date with industry best practices and emerging technologies in data engineering.


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Rutuja Patil
Posted by Rutuja Patil
Mumbai
4 - 10 yrs
Best in industry
skill iconJava
J2EE
Hibernate (Java)
skill iconSpring Boot
Spring MVC
+2 more

Company Name – Wissen Technology

Group of companies in India – Wissen Technology & Wissen Infotech

Work Location - Senior Backend Developer – Java (with Python Exposure)- Mumbai


Experience - 4 to 10 years


Kindly revert over mail if you are interested.


Java Developer – Job Description


We are seeking a Senior Backend Developer with strong expertise in Java (Spring Boot) and working knowledge of Python. In this role, Java will be your primary development language, with Python used for scripting, automation, or selected service modules. You’ll be part of a collaborative backend team building scalable and high-performance systems.


Key Responsibilities


  • Design and develop robust backend services and APIs primarily using Java (Spring Boot)
  • Contribute to Python-based components where needed for automation, scripting, or lightweight services
  • Build, integrate, and optimize RESTful APIs and microservices
  • Work with relational and NoSQL databases
  • Write unit and integration tests (JUnit, PyTest)
  • Collaborate closely with DevOps, QA, and product teams
  • Participate in architecture reviews and design discussions
  • Help maintain code quality, organization, and automation


Required Skills & Qualifications

  • 4 to 10 years of hands-on Java development experience
  • Strong experience with Spring Boot, JPA/Hibernate, and REST APIs
  • At least 1–2 years of hands-on experience with Python (e.g., for scripting, automation, or small services)
  • Familiarity with Python frameworks like Flask or FastAPI is a plus
  • Experience with SQL/NoSQL databases (e.g., PostgreSQL, MongoDB)
  • Good understanding of OOPdesign patterns, and software engineering best practices
  • Familiarity with DockerGit, and CI/CD pipelines


Read more
Tata Consultancy Services
Agency job
via Risk Resources LLP hyd by Jhansi Padiy
AnyWhareIndia, Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Pune, Hyderabad, Indore, Kolkata
5 - 11 yrs
₹6L - ₹30L / yr
Snowflake
skill iconPython
PySpark
SQL

Role descriptions / Expectations from the Role

·        6-7 years of IT development experience with min 3+ years hands-on experience in Snowflake

·        Strong experience in building/designing the data warehouse or data lake, and data mart end-to-end implementation experience focusing on large enterprise scale and Snowflake implementations on any of the hyper scalers.

·        Strong experience with building productionized data ingestion and data pipelines in Snowflake

·        Good knowledge of Snowflake's architecture, features likie  Zero-Copy Cloning, Time Travel, and performance tuning capabilities

·        Should have good exp on Snowflake RBAC and data security.

·        Strong experience in Snowflake features including new snowflake features.

·        Should have good experience in Python/Pyspark.

·        Should have experience in AWS services (S3, Glue, Lambda, Secrete Manager, DMS) and few Azure services (Blob storage, ADLS, ADF)

·        Should have experience/knowledge in orchestration and scheduling tools experience like Airflow

·        Should have good understanding on ETL or ELT processes and ETL tools.

Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Mumbai, Pune, Chennai, Gurugram
5.6 - 7 yrs
₹10L - ₹28L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
SQL

Job Summary:

As an AWS Data Engineer, you will be responsible for designing, developing, and maintaining scalable, high-performance data pipelines using AWS services. With 6+ years of experience, you’ll collaborate closely with data architects, analysts, and business stakeholders to build reliable, secure, and cost-efficient data infrastructure across the organization.

Key Responsibilities:

  • Design, develop, and manage scalable data pipelines using AWS Glue, Lambda, and other serverless technologies
  • Implement ETL workflows and transformation logic using PySpark and Python on AWS Glue
  • Leverage AWS Redshift for warehousing, performance tuning, and large-scale data queries
  • Work with AWS DMS and RDS for database integration and migration
  • Optimize data flows and system performance for speed and cost-effectiveness
  • Deploy and manage infrastructure using AWS CloudFormation templates
  • Collaborate with cross-functional teams to gather requirements and build robust data solutions
  • Ensure data integrity, quality, and security across all systems and processes

Required Skills & Experience:

  • 6+ years of experience in Data Engineering with strong AWS expertise
  • Proficient in Python and PySpark for data processing and ETL development
  • Hands-on experience with AWS Glue, Lambda, DMS, RDS, and Redshift
  • Strong SQL skills for building complex queries and performing data analysis
  • Familiarity with AWS CloudFormation and infrastructure as code principles
  • Good understanding of serverless architecture and cost-optimized design
  • Ability to write clean, modular, and maintainable code
  • Strong analytical thinking and problem-solving skills


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Vishakha Walunj
Posted by Vishakha Walunj
Bengaluru (Bangalore), Pune, Mumbai
7 - 12 yrs
Best in industry
PySpark
databricks
SQL
skill iconPython

Required Skills:

  • Hands-on experience with Databricks, PySpark
  • Proficiency in SQL, Python, and Spark.
  • Understanding of data warehousing concepts and data modeling.
  • Experience with CI/CD pipelines and version control (e.g., Git).
  • Fundamental knowledge of any cloud services, preferably Azure or GCP.


Good to Have:

  • Bigquery
  • Experience with performance tuning and data governance.


Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Pune, Bengaluru (Bangalore), Gurugram, Chennai, Mumbai
5 - 7 yrs
₹6L - ₹20L / yr
skill iconAmazon Web Services (AWS)
Amazon Redshift
AWS Glue
skill iconPython
PySpark

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Bengaluru (Bangalore), Pune, Mumbai, Chennai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
skill iconPython
PySpark
skill iconAmazon Web Services (AWS)
aws
Amazon Redshift
+1 more

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Mokshada Solanki
Posted by Mokshada Solanki
Bengaluru (Bangalore), Mumbai, Pune, Gurugram
4 - 5 yrs
₹4L - ₹20L / yr
SQL
skill iconAmazon Web Services (AWS)
Migration
PySpark
ETL

Job Summary:

Seeking a seasoned SQL + ETL Developer with 4+ years of experience in managing large-scale datasets and cloud-based data pipelines. The ideal candidate is hands-on with MySQL, PySpark, AWS Glue, and ETL workflows, with proven expertise in AWS migration and performance optimization.


Key Responsibilities:

  • Develop and optimize complex SQL queries and stored procedures to handle large datasets (100+ million records).
  • Build and maintain scalable ETL pipelines using AWS Glue and PySpark.
  • Work on data migration tasks in AWS environments.
  • Monitor and improve database performance; automate key performance indicators and reports.
  • Collaborate with cross-functional teams to support data integration and delivery requirements.
  • Write shell scripts for automation and manage ETL jobs efficiently.


Required Skills:

  • Strong experience with MySQL, complex SQL queries, and stored procedures.
  • Hands-on experience with AWS Glue, PySpark, and ETL processes.
  • Good understanding of AWS ecosystem and migration strategies.
  • Proficiency in shell scripting.
  • Strong communication and collaboration skills.


Nice to Have:

  • Working knowledge of Python.
  • Experience with AWS RDS.



Read more
Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Bengaluru (Bangalore), Pune, Chennai, Mumbai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
SQL
redshift

Profile: AWS Data Engineer

Mode- Hybrid

Experience- 5+7 years

Locations - Bengaluru, Pune, Chennai, Mumbai, Gurugram


Roles and Responsibilities

  • Design and maintain ETL pipelines using AWS Glue and Python/PySpark
  • Optimize SQL queries for Redshift and Athena
  • Develop Lambda functions for serverless data processing
  • Configure AWS DMS for database migration and replication
  • Implement infrastructure as code with CloudFormation
  • Build optimized data models for performance
  • Manage RDS databases and AWS service integrations
  • Troubleshoot and improve data processing efficiency
  • Gather requirements from business stakeholders
  • Implement data quality checks and validation
  • Document data pipelines and architecture
  • Monitor workflows and implement alerting
  • Keep current with AWS services and best practices


Required Technical Expertise:

  • Python/PySpark for data processing
  • AWS Glue for ETL operations
  • Redshift and Athena for data querying
  • AWS Lambda and serverless architecture
  • AWS DMS and RDS management
  • CloudFormation for infrastructure
  • SQL optimization and performance tuning
Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Pune, Mumbai, Bengaluru (Bangalore), Chennai
4 - 7 yrs
₹5L - ₹15L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
Glue semantics
Amazon Redshift
+1 more

Job Overview:

We are seeking an experienced AWS Data Engineer to join our growing data team. The ideal candidate will have hands-on experience with AWS Glue, Redshift, PySpark, and other AWS services to build robust, scalable data pipelines. This role is perfect for someone passionate about data engineering, automation, and cloud-native development.

Key Responsibilities:

  • Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
  • Integrate data from diverse sources and ensure its quality, consistency, and reliability.
  • Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
  • Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
  • Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
  • Automate data validation, transformation, and loading processes to support real-time and batch data processing.
  • Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.

Required Skills:

  • 5 to 7 years of hands-on experience in data engineering roles.
  • Strong proficiency in Python and PySpark for data transformation and scripting.
  • Deep understanding and practical experience with AWS Glue, AWS Redshift, S3, and other AWS data services.
  • Solid understanding of SQL and database optimization techniques.
  • Experience working with large-scale data pipelines and high-volume data environments.
  • Good knowledge of data modeling, warehousing, and performance tuning.

Preferred/Good to Have:

  • Experience with workflow orchestration tools like Airflow or Step Functions.
  • Familiarity with CI/CD for data pipelines.
  • Knowledge of data governance and security best practices on AWS.
Read more
Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Pune, Mumbai, Bengaluru (Bangalore), Gurugram
4 - 6 yrs
₹5L - ₹10L / yr
ETL
SQL
skill iconAmazon Web Services (AWS)
PySpark
KPI

Role - ETL Developer

Work ModeHybrid

Experience- 4+ years

Location - Pune, Gurgaon, Bengaluru, Mumbai

Required Skills - AWS, AWS Glue, Pyspark, ETL, SQL

Required Skills:

  • 4+ years of hands-on experience in MySQL, including SQL queries and procedure development
  • Experience in Pyspark, AWS, AWS Glue
  • Experience in AWS ,Migration
  • Experience with automated scripting and tracking KPIs/metrics for database performance
  • Proficiency in shell scripting and ETL.
  • Strong communication skills and a collaborative team player
  • Knowledge of Python and AWS RDS is a plus


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Hanisha Pralayakaveri
Posted by Hanisha Pralayakaveri
Bengaluru (Bangalore), Mumbai
5 - 9 yrs
Best in industry
skill iconPython
skill iconAmazon Web Services (AWS)
PySpark
Data engineering

Job Description: Data Engineer 

Position Overview:

Role Overview

We are seeking a skilled Python Data Engineer with expertise in designing and implementing data solutions using the AWS cloud platform. The ideal candidate will be responsible for building and maintaining scalable, efficient, and secure data pipelines while leveraging Python and AWS services to enable robust data analytics and decision-making processes.

 

Key Responsibilities

· Design, develop, and optimize data pipelines using Python and AWS services such as Glue, Lambda, S3, EMR, Redshift, Athena, and Kinesis.

· Implement ETL/ELT processes to extract, transform, and load data from various sources into centralized repositories (e.g., data lakes or data warehouses).

· Collaborate with cross-functional teams to understand business requirements and translate them into scalable data solutions.

· Monitor, troubleshoot, and enhance data workflows for performance and cost optimization.

· Ensure data quality and consistency by implementing validation and governance practices.

· Work on data security best practices in compliance with organizational policies and regulations.

· Automate repetitive data engineering tasks using Python scripts and frameworks.

· Leverage CI/CD pipelines for deployment of data workflows on AWS.

Read more
ZeMoSo Technologies

at ZeMoSo Technologies

11 recruiters
Agency job
via TIGI HR Solution Pvt. Ltd. by Vaidehi Sarkar
Mumbai, Bengaluru (Bangalore), Hyderabad, Chennai, Pune
4 - 8 yrs
₹10L - ₹15L / yr
Data engineering
skill iconPython
SQL
Data Warehouse (DWH)
skill iconAmazon Web Services (AWS)
+3 more

Work Mode: Hybrid


Need B.Tech, BE, M.Tech, ME candidates - Mandatory



Must-Have Skills:

● Educational Qualification :- B.Tech, BE, M.Tech, ME in any field.

● Minimum of 3 years of proven experience as a Data Engineer.

● Strong proficiency in Python programming language and SQL.

● Experience in DataBricks and setting up and managing data pipelines, data warehouses/lakes.

● Good comprehension and critical thinking skills.


● Kindly note Salary bracket will vary according to the exp. of the candidate - 

- Experience from 4 yrs to 6 yrs - Salary upto 22 LPA

- Experience from 5 yrs to 8 yrs - Salary upto 30 LPA

- Experience more than 8 yrs - Salary upto 40 LPA

Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Mumbai, Pune, Hyderabad, Indore, Jaipur, Kolkata
4 - 5 yrs
₹2L - ₹18L / yr
skill iconPython
PySpark

We are looking for a skilled and passionate Data Engineers with a strong foundation in Python programming and hands-on experience working with APIs, AWS cloud, and modern development practices. The ideal candidate will have a keen interest in building scalable backend systems and working with big data tools like PySpark.

Key Responsibilities:

  • Write clean, scalable, and efficient Python code.
  • Work with Python frameworks such as PySpark for data processing.
  • Design, develop, update, and maintain APIs (RESTful).
  • Deploy and manage code using GitHub CI/CD pipelines.
  • Collaborate with cross-functional teams to define, design, and ship new features.
  • Work on AWS cloud services for application deployment and infrastructure.
  • Basic database design and interaction with MySQL or DynamoDB.
  • Debugging and troubleshooting application issues and performance bottlenecks.

Required Skills & Qualifications:

  • 4+ years of hands-on experience with Python development.
  • Proficient in Python basics with a strong problem-solving approach.
  • Experience with AWS Cloud services (EC2, Lambda, S3, etc.).
  • Good understanding of API development and integration.
  • Knowledge of GitHub and CI/CD workflows.
  • Experience in working with PySpark or similar big data frameworks.
  • Basic knowledge of MySQL or DynamoDB.
  • Excellent communication skills and a team-oriented mindset.

Nice to Have:

  • Experience in containerization (Docker/Kubernetes).
  • Familiarity with Agile/Scrum methodologies.


Read more
IT Service company

IT Service company

Agency job
via Vinprotoday by Vikas Gaur
Mumbai
4 - 10 yrs
₹8L - ₹30L / yr
Google Cloud Platform (GCP)
Workflow
TensorFlow
Deployment management
PySpark
+1 more

Key Responsibilities:

Design, develop, and optimize scalable data pipelines and ETL processes.

Work with large datasets using GCP services like BigQuery, Dataflow, and Cloud Storage.

Implement real-time data streaming and processing solutions using Pub/Sub and Dataproc.

Collaborate with cross-functional teams to ensure data quality and governance.


Technical Requirements:

4+ years of experience in Data Engineering.

Strong expertise in GCP services like Workflow,tensorflow, Dataproc, and Cloud Storage.

Proficiency in SQL and programming languages such as Python or Java

.Experience in designing and implementing data pipelines

and working with real-time data processing.

Familiarity with CI/CD pipelines and cloud security best practices.

Read more
ProtoGene Consulting Private Limited
Mumbai
3 - 8 yrs
₹7L - ₹18L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more

Data Engineer + Integration engineer + Support specialistExp – 5-8 years

Necessary Skills:· SQL & Python / PySpark

· AWS Services: Glue, Appflow, Redshift

· Data warehousing

· Data modelling

Job Description:· Experience of implementing and delivering data solutions and pipelines on AWS Cloud Platform. Design/ implement, and maintain the data architecture for all AWS data services

· A strong understanding of data modelling, data structures, databases (Redshift), and ETL processes

· Work with stakeholders to identify business needs and requirements for data-related projects

Strong SQL and/or Python or PySpark knowledge

· Creating data models that can be used to extract information from various sources & store it in a usable format

· Optimize data models for performance and efficiency

· Write SQL queries to support data analysis and reporting

· Monitor and troubleshoot data pipelines

· Collaborate with software engineers to design and implement data-driven features

· Perform root cause analysis on data issues

· Maintain documentation of the data architecture and ETL processes

· Identifying opportunities to improve performance by improving database structure or indexing methods

· Maintaining existing applications by updating existing code or adding new features to meet new requirements

· Designing and implementing security measures to protect data from unauthorized access or misuse

· Recommending infrastructure changes to improve capacity or performance

Experience in Process industry

Data Engineer + Integration engineer + Support specialistExp – 3-5 years

Necessary Skills:· SQL & Python / PySpark

· AWS Services: Glue, Appflow, Redshift

· Data warehousing basics

· Data modelling basics

Job Description:· Experience of implementing and delivering data solutions and pipelines on AWS Cloud Platform.

· A strong understanding of data modelling, data structures, databases (Redshift)

Strong SQL and/or Python or PySpark knowledge

· Design and implement ETL processes to load data into the data warehouse

· Creating data models that can be used to extract information from various sources & store it in a usable format

· Optimize data models for performance and efficiency

· Write SQL queries to support data analysis and reporting

· Collaborate with team to design and implement data-driven features

· Monitor and troubleshoot data pipelines

· Perform root cause analysis on data issues

· Maintain documentation of the data architecture and ETL processes

· Maintaining existing applications by updating existing code or adding new features to meet new requirements

· Designing and implementing security measures to protect data from unauthorized access or misuse

· Identifying opportunities to improve performance by improving database structure or indexing methods

· Designing and implementing security measures to protect data from unauthorized access or misuse

· Recommending infrastructure changes to improve capacity or performance


Read more
Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai
3 - 7 yrs
₹8L - ₹15L / yr
AWS Lambda
Amazon S3
Amazon VPC
Amazon EC2
Amazon Redshift
+3 more

Technical Skills:


  • Ability to understand and translate business requirements into design.
  • Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
  • Experience in creating ETL jobs using Python/PySpark.
  • Proficiency in creating AWS Lambda functions for event-based jobs.
  • Knowledge of automating ETL processes using AWS Step Functions.
  • Competence in building data warehouses and loading data into them.


Responsibilities:


  • Understand business requirements and translate them into design.
  • Assess AWS infrastructure needs for development work.
  • Develop ETL jobs using Python/PySpark to meet requirements.
  • Implement AWS Lambda for event-based tasks.
  • Automate ETL processes using AWS Step Functions.
  • Build data warehouses and manage data loading.
  • Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.
Read more
Numantra Technologies

at Numantra Technologies

2 recruiters
Vandana Saxena
Posted by Vandana Saxena
Mumbai, Navi Mumbai
2 - 8 yrs
₹5L - ₹12L / yr
Microsoft Windows Azure
ADF
NumPy
PySpark
Databricks
+1 more
Experience and expertise in using Azure cloud services. Azure certification will be a plus.

- Experience and expertise in Python Development and its different libraries like Pyspark, pandas, NumPy

- Expertise in ADF, Databricks.

- Creating and maintaining data interfaces across a number of different protocols (file, API.).

- Creating and maintaining internal business process solutions to keep our corporate system data in sync and reduce manual processes where appropriate.

- Creating and maintaining monitoring and alerting workflows to improve system transparency.

- Facilitate the development of our Azure cloud infrastructure relative to Data and Application systems.

- Design and lead development of our data infrastructure including data warehouses, data marts, and operational data stores.

- Experience in using Azure services such as ADLS Gen 2, Azure Functions, Azure messaging services, Azure SQL Server, Azure KeyVault, Azure Cognitive services etc.
Read more
Navi Mumbai
3 - 5 yrs
₹7L - ₹18L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more
  • Proficiency in shell scripting
  • Proficiency in automation of tasks
  • Proficiency in Pyspark/Python
  • Proficiency in writing and understanding of sqoop
  • Understanding of CloudEra manager
  • Good understanding of RDBMS
  • Good understanding of Excel

 

Read more
Virtusa

at Virtusa

2 recruiters
Agency job
via Response Informatics by Anupama Lavanya Uppala
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr
PySpark
skill iconPython
  • Minimum 1 years of relevant experience, in PySpark (mandatory)
  • Hands on experience in development, test, deploy, maintain and improving data integration pipeline in AWS cloud environment is added plus 
  • Ability to play lead role and independently manage 3-5 member of Pyspark development team 
  • EMR ,Python and PYspark mandate.
  • Knowledge and awareness working with AWS Cloud technologies like Apache Spark, , Glue, Kafka, Kinesis, and Lambda in S3, Redshift, RDS
Read more
Numantra Technologies

at Numantra Technologies

2 recruiters
nisha mattas
Posted by nisha mattas
Remote, Mumbai, powai
2 - 12 yrs
₹8L - ₹18L / yr
ADF
PySpark
Jupyter Notebook
Big Data
Windows Azure
+3 more
      • Data pre-processing, data transformation, data analysis, and feature engineering
      • Performance optimization of scripts (code) and Productionizing of code (SQL, Pandas, Python or PySpark, etc.)
    • Required skills:
      • Bachelors in - in Computer Science, Data Science, Computer Engineering, IT or equivalent
      • Fluency in Python (Pandas), PySpark, SQL, or similar
      • Azure data factory experience (min 12 months)
      • Able to write efficient code using traditional, OO concepts, modular programming following the SDLC process.
      • Experience in production optimization and end-to-end performance tracing (technical root cause analysis)
      • Ability to work independently with demonstrated experience in project or program management
      • Azure experience ability to translate data scientist code in Python and make it efficient (production) for cloud deployment
 
Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune
8 - 15 yrs
₹16L - ₹28L / yr
PySpark
SQL Azure
azure synapse
Windows Azure
Azure Data Engineer
+3 more
Technology Skills:
  • Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
  • Experience in migrating on-premise data warehouses to data platforms on AZURE cloud. 
  • Designing and implementing data engineering, ingestion, and transformation functions
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/Stream sets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
Read more
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.
Find more jobs
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort