Pyspark jobs

50+ PySpark Jobs in India

Apply to 50+ PySpark Jobs on CutShort.io. Find your next job, effortlessly. Browse PySpark Jobs and apply today!

AWS data engineer

at Tekit Software solution Pvt Ltd

Posted by himanshi Tripathi

Hyderabad, Bengaluru (Bangalore)

8 - 10 yrs

₹15L - ₹27L / yr

Amazon Web Services (AWS)

Python

PySpark

SQL

🔍 Job Description:

We are looking for an experienced and highly skilled Technical Lead to guide the development and enhancement of a large-scale Data Observability solution built on AWS. This platform is pivotal in delivering monitoring, reporting, and actionable insights across the client's data landscape.

The Technical Lead will drive end-to-end feature delivery, mentor junior engineers, and uphold engineering best practices. The position reports to the Programme Technical Lead / Architect and involves close collaboration to align on platform vision, technical priorities, and success KPIs.

🎯 Key Responsibilities:

Lead the design, development, and delivery of features for the data observability solution.
Mentor and guide junior engineers, promoting technical growth and engineering excellence.
Collaborate with the architect to align on platform roadmap, vision, and success metrics.
Ensure high quality, scalability, and performance in data engineering solutions.
Contribute to code reviews, architecture discussions, and operational readiness.

🔧 Primary Must-Have Skills (Non-Negotiable):

5+ years in Data Engineering or Software Engineering roles.
3+ years in a technical team or squad leadership capacity.
Deep expertise in AWS Data Services: Glue, EMR, Kinesis, Lambda, Athena, S3.
Advanced programming experience with PySpark, Python, and SQL.
Proven experience in building scalable, production-grade data pipelines on cloud platforms.

🔍 Job Description:

🎯 Key Responsibilities:

Lead the design, development, and delivery of features for the data observability solution.
Mentor and guide junior engineers, promoting technical growth and engineering excellence.
Collaborate with the architect to align on platform roadmap, vision, and success metrics.
Ensure high quality, scalability, and performance in data engineering solutions.
Contribute to code reviews, architecture discussions, and operational readiness.

🔧 Primary Must-Have Skills (Non-Negotiable):

5+ years in Data Engineering or Software Engineering roles.
3+ years in a technical team or squad leadership capacity.
Deep expertise in AWS Data Services: Glue, EMR, Kinesis, Lambda, Athena, S3.
Advanced programming experience with PySpark, Python, and SQL.
Proven experience in building scalable, production-grade data pipelines on cloud platforms.

Data Engineer – GCP + Spark + DBT

at NeoGenCode Technologies Pvt Ltd

2 candid answers

Posted by Akshay Patil

Bengaluru (Bangalore)

8 - 12 yrs

₹15L - ₹22L / yr

Data engineering

Google Cloud Platform (GCP)

Data Transformation Tool (DBT)

Google Dataform

BigQuery

+6 more

Job Title : Data Engineer – GCP + Spark + DBT

Location : Bengaluru (On-site at Client Location | 3 Days WFO)

Experience : 8 to 12 Years

Level : Associate Architect

Type : Full-time

Job Overview :

We are looking for a seasoned Data Engineer to join the Data Platform Engineering team supporting a Unified Data Platform (UDP). This role requires hands-on expertise in DBT, GCP, BigQuery, and PySpark, with a solid foundation in CI/CD, data pipeline optimization, and agile delivery.

Mandatory Skills : GCP, DBT, Google Dataform, BigQuery, PySpark/Spark SQL, Advanced SQL, CI/CD, Git, Agile Methodologies.

Key Responsibilities :

Design, build, and optimize scalable data pipelines using BigQuery, DBT, and PySpark.
Leverage GCP-native services like Cloud Storage, Pub/Sub, Dataproc, Cloud Functions, and Composer for ETL/ELT workflows.
Implement and maintain CI/CD for data engineering projects with Git-based version control.
Collaborate with cross-functional teams including Infra, Security, and DataOps for reliable, secure, and high-quality data delivery.
Lead code reviews, mentor junior engineers, and enforce best practices in data engineering.
Participate in Agile sprints, backlog grooming, and Jira-based project tracking.

Must-Have Skills :

Strong experience with DBT, Google Dataform, and BigQuery
Hands-on expertise with PySpark/Spark SQL
Proficient in GCP for data engineering workflows
Solid knowledge of SQL optimization, Git, and CI/CD pipelines
Agile team experience and strong problem-solving abilities

Nice-to-Have Skills :

Familiarity with Databricks, Delta Lake, or Kafka
Exposure to data observability and quality frameworks (e.g., Great Expectations, Soda)
Knowledge of MDM patterns, Terraform, or IaC is a plus

Job Title : Data Engineer – GCP + Spark + DBT

Location : Bengaluru (On-site at Client Location | 3 Days WFO)

Experience : 8 to 12 Years

Level : Associate Architect

Type : Full-time

Job Overview :

Mandatory Skills : GCP, DBT, Google Dataform, BigQuery, PySpark/Spark SQL, Advanced SQL, CI/CD, Git, Agile Methodologies.

Key Responsibilities :

Design, build, and optimize scalable data pipelines using BigQuery, DBT, and PySpark.
Leverage GCP-native services like Cloud Storage, Pub/Sub, Dataproc, Cloud Functions, and Composer for ETL/ELT workflows.
Implement and maintain CI/CD for data engineering projects with Git-based version control.
Collaborate with cross-functional teams including Infra, Security, and DataOps for reliable, secure, and high-quality data delivery.
Lead code reviews, mentor junior engineers, and enforce best practices in data engineering.
Participate in Agile sprints, backlog grooming, and Jira-based project tracking.

Must-Have Skills :

Strong experience with DBT, Google Dataform, and BigQuery
Hands-on expertise with PySpark/Spark SQL
Proficient in GCP for data engineering workflows
Solid knowledge of SQL optimization, Git, and CI/CD pipelines
Agile team experience and strong problem-solving abilities

Nice-to-Have Skills :

Familiarity with Databricks, Delta Lake, or Kafka
Exposure to data observability and quality frameworks (e.g., Great Expectations, Soda)
Knowledge of MDM patterns, Terraform, or IaC is a plus

Azure Data Engineer / Lead

Service Based

Agency job

via Vikash Technologies by Rishika Teja

Cochin, Chennai, Coimbatore

5 - 8 yrs

₹10L - ₹16L / yr

Windows Azure

Azure Data Bricks

Azure Data Factory

PySpark

- Proven experience working with : Azure Fabric SQL Server Azure Data Factory Azure Data Lake Cosmos DB.

- Strong hands-on expertise in : Complex SQL queries SQL query efficiency and optimization

- Database design and data modeling.

- Data migration techniques and performance tuning.

- Solid understanding of cloud infrastructure and data integration patterns in Azure.

- Experience working in agile environments with CI/CD practices.

- Proven experience working with : Azure Fabric SQL Server Azure Data Factory Azure Data Lake Cosmos DB.

- Strong hands-on expertise in : Complex SQL queries SQL query efficiency and optimization

- Database design and data modeling.

- Data migration techniques and performance tuning.

- Solid understanding of cloud infrastructure and data integration patterns in Azure.

- Experience working in agile environments with CI/CD practices.

Data Engineer

at Moative

3 candid answers

Posted by Eman Khan

Chennai

3 - 5 yrs

₹10L - ₹25L / yr

Python

PySpark

Scala

Data engineering

ETL

+12 more

About Moative

Moative, an Applied AI company, designs and builds transformation AI solutions for traditional industries in energy, utilities, healthcare & lifesciences, and more. Through Moative Labs, we build AI micro-products and launch AI startups with partners in vertical markets that align with our theses.

Our Past: We have built and sold two companies, one of which was an AI company. Our founders and leaders are Math PhDs, Ivy League University Alumni, Ex-Googlers, and successful entrepreneurs.

Our Team: Our team of 20+ employees consist of data scientists, AI/ML Engineers, and mathematicians from top engineering and research institutes such as IITs, CERN, IISc, UZH, Ph.Ds. Our team includes academicians, IBM Research Fellows, and former founders.

Work you’ll do

As a Data Engineer, you will work on data architecture, large-scale processing systems, and data flow management. You will build and maintain optimal data architecture and data pipelines, assemble large, complex data sets, and ensure that data is readily available to data scientists, analysts, and other users. In close collaboration with ML engineers, data scientists, and domain experts, you’ll deliver robust, production-grade solutions that directly impact business outcomes. Ultimately, you will be responsible for developing and implementing systems that optimize the organization’s data use and data quality.

Responsibilities

Create and maintain optimal data architecture and data pipelines on cloud infrastructure (such as AWS/ Azure/ GCP)
Assemble large, complex data sets that meet functional / non-functional business requirements
Identify, design, and implement internal process improvements
Build the pipeline infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
Support development of analytics that utilize the data pipeline to provide actionable insights into key business metrics
Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs

Who you are

You are a passionate and results-oriented engineer who understands the importance of data architecture and data quality to impact solution development, enhance products, and ultimately improve business applications. You thrive in dynamic environments and are comfortable navigating ambiguity. You possess a strong sense of ownership and are eager to take initiative, advocating for your technical decisions while remaining open to feedback and collaboration.

You have experience in developing and deploying data pipelines to support real-world applications. You have a good understanding of data structures and are excellent at writing clean, efficient code to extract, create and manage large data sets for analytical uses. You have the ability to conduct regular testing and debugging to ensure optimal data pipeline performance. You are excited at the possibility of contributing to intelligent applications that can directly impact business services and make a positive difference to users.

Skills & Requirements

3+ years of hands-on experience as a data engineer, data architect or similar role, with a good understanding of data structures and data engineering.
Solid knowledge of cloud infra and data-related services on AWS (EC2, EMR, RDS, Redshift) and/ or Azure.
Advanced knowledge of SQL, including writing complex queries, stored procedures, views, etc.
Strong experience with data pipeline and workflow management tools (such as Luigi, Airflow).
Experience with common relational SQL, NoSQL and Graph databases.
Strong experience with scripting languages: Python, PySpark, Scala, etc.
Practical experience with basic DevOps concepts: CI/CD, containerization (Docker, Kubernetes), etc
Experience with big data tools (Spark, Kafka, etc) and stream processing.
Excellent communication skills to collaborate with colleagues from both technical and business backgrounds, discuss and convey ideas and findings effectively.
Ability to analyze complex problems, think critically for troubleshooting and develop robust data solutions.
Ability to identify and tackle issues efficiently and proactively, conduct thorough research and collaborate to find long-term, scalable solutions.

Working at Moative

Moative is a young company, but we believe strongly in thinking long-term, while acting with urgency. Our ethos is rooted in innovation, efficiency and high-quality outcomes. We believe the future of work is AI-augmented and boundary less. Here are some of our guiding principles:

Think in decades. Act in hours. As an independent company, our moat is time. While our decisions are for the long-term horizon, our execution will be fast – measured in hours and days, not weeks and months.
Own the canvas. Throw yourself in to build, fix or improve – anything that isn’t done right, irrespective of who did it. Be selfish about improving across the organization – because once the rot sets in, we waste years in surgery and recovery.
Use data or don’t use data. Use data where you ought to but not as a ‘cover-my-back’ political tool. Be capable of making decisions with partial or limited data. Get better at intuition and pattern-matching. Whichever way you go, be mostly right about it.
Avoid work about work. Process creeps on purpose, unless we constantly question it. We are deliberate about committing to rituals that take time away from the actual work. We truly believe that a meeting that could be an email, should be an email and you don’t need a person with the highest title to say that out loud.
High revenue per person. We work backwards from this metric. Our default is to automate instead of hiring. We multi-skill our people to own more outcomes than hiring someone who has less to do. We don’t like squatting and hoarding that comes in the form of hiring for growth. High revenue per person comes from high quality work from everyone. We demand it.

If this role and our work is of interest to you, please apply. We encourage you to apply even if you believe you do not meet all the requirements listed above.

That said, you should demonstrate that you are in the 90th percentile or above. This may mean that you have studied in top-notch institutions, won competitions that are intellectually demanding, built something of your own, or rated as an outstanding performer by your current or previous employers.

The position is based out of Chennai. Our work currently involves significant in-person collaboration and we expect you to work out of our offices in Chennai.

About Moative

Our Past: We have built and sold two companies, one of which was an AI company. Our founders and leaders are Math PhDs, Ivy League University Alumni, Ex-Googlers, and successful entrepreneurs.

Work you’ll do

Responsibilities

Create and maintain optimal data architecture and data pipelines on cloud infrastructure (such as AWS/ Azure/ GCP)
Assemble large, complex data sets that meet functional / non-functional business requirements
Identify, design, and implement internal process improvements
Build the pipeline infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
Support development of analytics that utilize the data pipeline to provide actionable insights into key business metrics
Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs

Who you are

Skills & Requirements

3+ years of hands-on experience as a data engineer, data architect or similar role, with a good understanding of data structures and data engineering.
Solid knowledge of cloud infra and data-related services on AWS (EC2, EMR, RDS, Redshift) and/ or Azure.
Advanced knowledge of SQL, including writing complex queries, stored procedures, views, etc.
Strong experience with data pipeline and workflow management tools (such as Luigi, Airflow).
Experience with common relational SQL, NoSQL and Graph databases.
Strong experience with scripting languages: Python, PySpark, Scala, etc.
Practical experience with basic DevOps concepts: CI/CD, containerization (Docker, Kubernetes), etc
Experience with big data tools (Spark, Kafka, etc) and stream processing.
Excellent communication skills to collaborate with colleagues from both technical and business backgrounds, discuss and convey ideas and findings effectively.
Ability to analyze complex problems, think critically for troubleshooting and develop robust data solutions.
Ability to identify and tackle issues efficiently and proactively, conduct thorough research and collaborate to find long-term, scalable solutions.

Working at Moative

Think in decades. Act in hours. As an independent company, our moat is time. While our decisions are for the long-term horizon, our execution will be fast – measured in hours and days, not weeks and months.
Own the canvas. Throw yourself in to build, fix or improve – anything that isn’t done right, irrespective of who did it. Be selfish about improving across the organization – because once the rot sets in, we waste years in surgery and recovery.
Use data or don’t use data. Use data where you ought to but not as a ‘cover-my-back’ political tool. Be capable of making decisions with partial or limited data. Get better at intuition and pattern-matching. Whichever way you go, be mostly right about it.
Avoid work about work. Process creeps on purpose, unless we constantly question it. We are deliberate about committing to rituals that take time away from the actual work. We truly believe that a meeting that could be an email, should be an email and you don’t need a person with the highest title to say that out loud.
High revenue per person. We work backwards from this metric. Our default is to automate instead of hiring. We multi-skill our people to own more outcomes than hiring someone who has less to do. We don’t like squatting and hoarding that comes in the form of hiring for growth. High revenue per person comes from high quality work from everyone. We demand it.

If this role and our work is of interest to you, please apply. We encourage you to apply even if you believe you do not meet all the requirements listed above.

The position is based out of Chennai. Our work currently involves significant in-person collaboration and we expect you to work out of our offices in Chennai.

Data engineer - Bangalore Location

at QAgile Services

1 recruiter

Posted by Radhika Chotai

Bengaluru (Bangalore)

3 - 8 yrs

₹17L - ₹25L / yr

PySpark

Windows Azure

Microsoft Windows Azure

Amazon Web Services (AWS)

SQL

+3 more

Employment type- Contract basis

Key Responsibilities

Design, develop, and maintain scalable data pipelines using PySpark and distributed computing frameworks.
Implement ETL processes and integrate data from structured and unstructured sources into cloud data warehouses.
Work across Azure or AWS cloud ecosystems to deploy and manage big data workflows.
Optimize performance of SQL queries and develop stored procedures for data transformation and analytics.
Collaborate with Data Scientists, Analysts, and Business teams to ensure reliable data availability and quality.
Maintain documentation and implement best practices for data architecture, governance, and security.

⚙️ Required Skills

Programming: Proficient in PySpark, Python, and SQL.
Cloud Platforms: Hands-on experience with Azure Data Factory, Databricks, or AWS Glue/Redshift.
Data Engineering Tools: Familiarity with Apache Spark, Kafka, Airflow, or similar tools.
Data Warehousing: Strong knowledge of designing and working with data warehouses like Snowflake, BigQuery, Synapse, or Redshift.
Data Modeling: Experience in dimensional modeling, star/snowflake schema, and data lake architecture.
CI/CD & Version Control: Exposure to Git, Terraform, or other DevOps tools is a plus.

🧰 Preferred Qualifications

Bachelor's or Master's in Computer Science, Engineering, or related field.
Certifications in Azure/AWS are highly desirable.
Knowledge of business intelligence tools (Power BI, Tableau) is a bonus.

Employment type- Contract basis

Key Responsibilities

Design, develop, and maintain scalable data pipelines using PySpark and distributed computing frameworks.
Implement ETL processes and integrate data from structured and unstructured sources into cloud data warehouses.
Work across Azure or AWS cloud ecosystems to deploy and manage big data workflows.
Optimize performance of SQL queries and develop stored procedures for data transformation and analytics.
Collaborate with Data Scientists, Analysts, and Business teams to ensure reliable data availability and quality.
Maintain documentation and implement best practices for data architecture, governance, and security.

⚙️ Required Skills

Programming: Proficient in PySpark, Python, and SQL.
Cloud Platforms: Hands-on experience with Azure Data Factory, Databricks, or AWS Glue/Redshift.
Data Engineering Tools: Familiarity with Apache Spark, Kafka, Airflow, or similar tools.
Data Warehousing: Strong knowledge of designing and working with data warehouses like Snowflake, BigQuery, Synapse, or Redshift.
Data Modeling: Experience in dimensional modeling, star/snowflake schema, and data lake architecture.
CI/CD & Version Control: Exposure to Git, Terraform, or other DevOps tools is a plus.

🧰 Preferred Qualifications

Bachelor's or Master's in Computer Science, Engineering, or related field.
Certifications in Azure/AWS are highly desirable.
Knowledge of business intelligence tools (Power BI, Tableau) is a bonus.

GCP Data Engineer - BigQuery

at Tata Consultancy Services

2 recruiters

Agency job

via Risk Resources LLP hyd by susmitha o

Bengaluru (Bangalore), Hyderabad, Kolkata

6 - 10 yrs

₹7L - ₹30L / yr

Google Cloud Platform (GCP)

Big Data

BIGQuery

PySpark

Work with the team in capacity of GCP Data Engineer on day-to-day activities.
Solve problems at hand with utmost clarity and speed.
Work with Data analysts and architects to help them solve any specific issues with tooling/processes.
Design, build and operationalize large scale enterprise data solutions and applications using one or more of GCP data and analytics services in combination with 3rd parties - Python/Java/React.js, AirFlow ETL skills - GCP services (BigQuery, Dataflow, Cloud SQL, Cloud Functions, Data Lake.
Design and build production data pipelines from ingestion to consumption within a big data architecture.
GCP BQ modeling and performance tuning techniques.
RDBMS and No-SQL database experience.
Knowledge on orchestrating workloads on cloud.

Work with the team in capacity of GCP Data Engineer on day-to-day activities.
Solve problems at hand with utmost clarity and speed.
Work with Data analysts and architects to help them solve any specific issues with tooling/processes.
Design, build and operationalize large scale enterprise data solutions and applications using one or more of GCP data and analytics services in combination with 3rd parties - Python/Java/React.js, AirFlow ETL skills - GCP services (BigQuery, Dataflow, Cloud SQL, Cloud Functions, Data Lake.
Design and build production data pipelines from ingestion to consumption within a big data architecture.
GCP BQ modeling and performance tuning techniques.
RDBMS and No-SQL database experience.
Knowledge on orchestrating workloads on cloud.

Azure Data bricks

at Tata Consultancy Services

2 recruiters

Agency job

via Risk Resources LLP hyd by susmitha o

Bengaluru (Bangalore), Pune, Hyderabad, Delhi, Gurugram, Noida, Faridabad, Mumbai, Chennai

5 - 10 yrs

₹7L - ₹30L / yr

azure

databricks

PySpark

Data engineering

Very good understanding of Azure Monitor, Log Analytics, Alerting, Dashboards & Workbooks.

Good working knowledge of Kusto Query Language (KQL) or Python for data analysis and querying.

Good working knowledge of Azure Functions for automation.

Apply Powershell programming skills to develop clean code that is stable, consistent, and well[1]performing.

Knowledge of Terraform, ARM or Bicep for infrastructure as code.

GitHub/Gitlab for devops pipelines.

Experience in monitoring infrastructure, managing escalations, and ensuring timely issue resolution to maintain operational stability and stakeholder satisfaction.

Good-to-Have

Knowledge of Terraform, ARM or Bicep for infrastructure as code

Very good understanding of Azure Monitor, Log Analytics, Alerting, Dashboards & Workbooks.

Good working knowledge of Kusto Query Language (KQL) or Python for data analysis and querying.

Good working knowledge of Azure Functions for automation.

Apply Powershell programming skills to develop clean code that is stable, consistent, and well[1]performing.

Knowledge of Terraform, ARM or Bicep for infrastructure as code.

GitHub/Gitlab for devops pipelines.

Experience in monitoring infrastructure, managing escalations, and ensuring timely issue resolution to maintain operational stability and stakeholder satisfaction.

Good-to-Have

Knowledge of Terraform, ARM or Bicep for infrastructure as code

Data Engineer

A leading software company

Agency job

via BOS consultants by Manka Joshi

Remote only

6 - 9 yrs

₹12L - ₹15L / yr

databricks

PySpark

Large Language Models (LLM)

Vector database

Google Cloud Platform (GCP)

+1 more

1. Solid Databricks & pyspark experience

2. Must have worked in projects dealing with data at terabyte scale

3. Must have knowledge of spark optimization techniques

4. Must have experience setting up job pipelines in Databricks

5. Basic knowledge of gcp and big query is required

6. Understanding LLMs and vector db

1. Solid Databricks & pyspark experience

2. Must have worked in projects dealing with data at terabyte scale

3. Must have knowledge of spark optimization techniques

4. Must have experience setting up job pipelines in Databricks

5. Basic knowledge of gcp and big query is required

6. Understanding LLMs and vector db

DevOps Engineer – Big Data Systems

at venanalytics

2 candid answers

Posted by Rincy jain

Mumbai

3 - 5 yrs

₹10L - ₹15L / yr

Google Cloud Platform (GCP)

DevOps

Python

Big Data

CI/CD

+3 more

About the Role

We are looking for a highly motivated Project Manager with a strong background in cloud technologies, big data ecosystems, and software development lifecycles to lead cross-functional teams in delivering high-impact projects. The ideal candidate will combine excellent project management skills with technical acumen in GCP, DevOps, and Python-based applications.

Key Responsibilities

Lead end-to-end project planning, execution, and delivery, ensuring alignment with business goals and timelines.
Create and maintain project documentation including detailed timelines, sprint boards, risk logs, and weekly status reports.
Facilitate Agile ceremonies: daily stand-ups, sprint planning, retrospectives, and backlog grooming.
Actively manage risks, scope changes, resource allocation, and project dependencies to ensure delivery without disruptions.
Ensure compliance with QA processes and security/compliance standards throughout the SDLC.
Collaborate with stakeholders and senior leadership to communicate progress, blockers, and key milestones.
Provide mentorship and support to cross-functional team members to drive continuous improvement and team performance.
Coordinate with clients and act as a key point of contact for requirement gathering, updates, and escalations.

Required Skills & Experience

Cloud & DevOps

Proficient in Google Cloud Platform (GCP) services: Compute, Storage, Networking, IAM.
Hands-on experience with cloud deployments and infrastructure as code.
Strong working knowledge of CI/CD pipelines, Docker, Kubernetes, and Terraform (or similar tools).

Big Data & Data Engineering

Experience with large-scale data processing using tools like PySpark, Hadoop, Hive, HDFS, and Spark Streaming (preferred).
Proven experience in managing and optimizing big data pipelines and ensuring high performance.

Programming & Frameworks

Strong proficiency in Python with experience in Django (REST APIs, ORM, deployment workflows).
Familiarity with Git and version control best practices.
Basic knowledge of Linux administration and shell scripting.

Nice to Have

Knowledge or prior experience in the Media & Advertising domain.
Experience in client-facing roles and handling stakeholder communications.
Proven ability to manage technical teams (5–6 members).

Why Join Us?

Work on cutting-edge cloud and data engineering projects
Collaborate with a talented, fast-paced team
Flexible work setup and culture of ownership
Continuous learning and upskilling environment
Inclusive health benefits included

About the Role

Key Responsibilities

Lead end-to-end project planning, execution, and delivery, ensuring alignment with business goals and timelines.
Create and maintain project documentation including detailed timelines, sprint boards, risk logs, and weekly status reports.
Facilitate Agile ceremonies: daily stand-ups, sprint planning, retrospectives, and backlog grooming.
Actively manage risks, scope changes, resource allocation, and project dependencies to ensure delivery without disruptions.
Ensure compliance with QA processes and security/compliance standards throughout the SDLC.
Collaborate with stakeholders and senior leadership to communicate progress, blockers, and key milestones.
Provide mentorship and support to cross-functional team members to drive continuous improvement and team performance.
Coordinate with clients and act as a key point of contact for requirement gathering, updates, and escalations.

Required Skills & Experience

Cloud & DevOps

Proficient in Google Cloud Platform (GCP) services: Compute, Storage, Networking, IAM.
Hands-on experience with cloud deployments and infrastructure as code.
Strong working knowledge of CI/CD pipelines, Docker, Kubernetes, and Terraform (or similar tools).

Big Data & Data Engineering

Experience with large-scale data processing using tools like PySpark, Hadoop, Hive, HDFS, and Spark Streaming (preferred).
Proven experience in managing and optimizing big data pipelines and ensuring high performance.

Programming & Frameworks

Strong proficiency in Python with experience in Django (REST APIs, ORM, deployment workflows).
Familiarity with Git and version control best practices.
Basic knowledge of Linux administration and shell scripting.

Nice to Have

Knowledge or prior experience in the Media & Advertising domain.
Experience in client-facing roles and handling stakeholder communications.
Proven ability to manage technical teams (5–6 members).

Why Join Us?

Work on cutting-edge cloud and data engineering projects
Collaborate with a talented, fast-paced team
Flexible work setup and culture of ownership
Continuous learning and upskilling environment
Inclusive health benefits included

Python developer

at Wissen Technology

4 recruiters

Posted by Praffull Shinde

Pune, Mumbai, Bengaluru (Bangalore)

4 - 8 yrs

₹14L - ₹26L / yr

Python

PySpark

Django

Flask

RESTful APIs

+3 more

Job title - Python developer

Exp – 4 to 6 years

Location – Pune/Mum/B’lore

PFB JD

Requirements:

Proven experience as a Python Developer
Strong knowledge of core Python and Pyspark concepts
Experience with web frameworks such as Django or Flask
Good exposure to any cloud platform (GCP Preferred)
CI/CD exposure required
Solid understanding of RESTful APIs and how to build them
Experience working with databases like Oracle DB and MySQL
Ability to write efficient SQL queries and optimize database performance
Strong problem-solving skills and attention to detail
Strong SQL programing (stored procedure, functions)
Excellent communication and interpersonal skill

Roles and Responsibilities

Design, develop, and maintain data pipelines and ETL processes using pyspark
Work closely with data scientists and analysts to provide them with clean, structured data.
Optimize data storage and retrieval for performance and scalability.
Collaborate with cross-functional teams to gather data requirements.
Ensure data quality and integrity through data validation and cleansing processes.
Monitor and troubleshoot data-related issues to ensure data pipeline reliability.
Stay up to date with industry best practices and emerging technologies in data engineering.

Job title - Python developer

Exp – 4 to 6 years

Location – Pune/Mum/B’lore

PFB JD

Requirements:

Proven experience as a Python Developer
Strong knowledge of core Python and Pyspark concepts
Experience with web frameworks such as Django or Flask
Good exposure to any cloud platform (GCP Preferred)
CI/CD exposure required
Solid understanding of RESTful APIs and how to build them
Experience working with databases like Oracle DB and MySQL
Ability to write efficient SQL queries and optimize database performance
Strong problem-solving skills and attention to detail
Strong SQL programing (stored procedure, functions)
Excellent communication and interpersonal skill

Roles and Responsibilities

Design, develop, and maintain data pipelines and ETL processes using pyspark
Work closely with data scientists and analysts to provide them with clean, structured data.
Optimize data storage and retrieval for performance and scalability.
Collaborate with cross-functional teams to gather data requirements.
Ensure data quality and integrity through data validation and cleansing processes.
Monitor and troubleshoot data-related issues to ensure data pipeline reliability.
Stay up to date with industry best practices and emerging technologies in data engineering.

Data Engineer

at Tecblic Private LImited

Posted by HR HR

Ahmedabad

4 - 5 yrs

₹8L - ₹12L / yr

Microsoft Windows Azure

SQL

Python

PySpark

ETL

+2 more

🚀 We Are Hiring: Data Engineer | 4+ Years Experience 🚀

Job description

🔍 Job Title: Data Engineer

📍 Location: Ahmedabad

🚀 Work Mode: On-Site Opportunity

📅 Experience: 4+ Years

🕒 Employment Type: Full-Time

⏱️ Availability : Immediate Joiner Preferred

Join Our Team as a Data Engineer

We are seeking a passionate and experienced Data Engineer to be a part of our dynamic and forward-thinking team in Ahmedabad. This is an exciting opportunity for someone who thrives on transforming raw data into powerful insights and building scalable, high-performance data infrastructure.

As a Data Engineer, you will work closely with data scientists, analysts, and cross-functional teams to design robust data pipelines, optimize data systems, and enable data-driven decision-making across the organization.

Your Key Responsibilities

Architect, build, and maintain scalable and reliable data pipelines from diverse data sources.

Design effective data storage, retrieval mechanisms, and data models to support analytics and business needs.

Implement data validation, transformation, and quality monitoring processes.

Collaborate with cross-functional teams to deliver impactful, data-driven solutions.

Proactively identify bottlenecks and optimize existing workflows and processes.

Provide guidance and mentorship to junior engineers in the team.

Skills & Expertise We’re Looking For

3+ years of hands-on experience in Data Engineering or related roles.

Strong expertise in Python and data pipeline design.

Experience working with Big Data tools like Hadoop, Spark, Hive.

Proficiency with SQL, NoSQL databases, and data warehousing solutions.

Solid experience in cloud platforms - Azure

Familiar with distributed computing, data modeling, and performance tuning.

Understanding of DevOps, Power Automate, and Microsoft Fabric is a plus.

Strong analytical thinking, collaboration skills, Excellent Communication Skill and the ability to work independently or as part of a team.

Qualifications

Bachelor’s degree in Computer Science, Data Science, or a related field.

🚀 We Are Hiring: Data Engineer | 4+ Years Experience 🚀

Job description

🔍 Job Title: Data Engineer

📍 Location: Ahmedabad

🚀 Work Mode: On-Site Opportunity

📅 Experience: 4+ Years

🕒 Employment Type: Full-Time

⏱️ Availability : Immediate Joiner Preferred

Join Our Team as a Data Engineer

Your Key Responsibilities

Architect, build, and maintain scalable and reliable data pipelines from diverse data sources.

Design effective data storage, retrieval mechanisms, and data models to support analytics and business needs.

Implement data validation, transformation, and quality monitoring processes.

Collaborate with cross-functional teams to deliver impactful, data-driven solutions.

Proactively identify bottlenecks and optimize existing workflows and processes.

Provide guidance and mentorship to junior engineers in the team.

Skills & Expertise We’re Looking For

3+ years of hands-on experience in Data Engineering or related roles.

Strong expertise in Python and data pipeline design.

Experience working with Big Data tools like Hadoop, Spark, Hive.

Proficiency with SQL, NoSQL databases, and data warehousing solutions.

Solid experience in cloud platforms - Azure

Familiar with distributed computing, data modeling, and performance tuning.

Understanding of DevOps, Power Automate, and Microsoft Fabric is a plus.

Strong analytical thinking, collaboration skills, Excellent Communication Skill and the ability to work independently or as part of a team.

Qualifications

Bachelor’s degree in Computer Science, Data Science, or a related field.

Java + Python Developer

at Wissen Technology

4 recruiters

Posted by Rutuja Patil

Mumbai

4 - 10 yrs

Best in industry

Java

J2EE

Hibernate (Java)

Spring Boot

Spring MVC

+2 more

Company Name – Wissen Technology

Group of companies in India – Wissen Technology & Wissen Infotech

Work Location - Senior Backend Developer – Java (with Python Exposure)- Mumbai

Experience - 4 to 10 years

Kindly revert over mail if you are interested.

Java Developer – Job Description

We are seeking a Senior Backend Developer with strong expertise in Java (Spring Boot) and working knowledge of Python. In this role, Java will be your primary development language, with Python used for scripting, automation, or selected service modules. You’ll be part of a collaborative backend team building scalable and high-performance systems.

Key Responsibilities

Design and develop robust backend services and APIs primarily using Java (Spring Boot)
Contribute to Python-based components where needed for automation, scripting, or lightweight services
Build, integrate, and optimize RESTful APIs and microservices
Work with relational and NoSQL databases
Write unit and integration tests (JUnit, PyTest)
Collaborate closely with DevOps, QA, and product teams
Participate in architecture reviews and design discussions
Help maintain code quality, organization, and automation

Required Skills & Qualifications

4 to 10 years of hands-on Java development experience
Strong experience with Spring Boot, JPA/Hibernate, and REST APIs
At least 1–2 years of hands-on experience with Python (e.g., for scripting, automation, or small services)
Familiarity with Python frameworks like Flask or FastAPI is a plus
Experience with SQL/NoSQL databases (e.g., PostgreSQL, MongoDB)
Good understanding of OOP, design patterns, and software engineering best practices
Familiarity with Docker, Git, and CI/CD pipelines

Company Name – Wissen Technology

Group of companies in India – Wissen Technology & Wissen Infotech

Work Location - Senior Backend Developer – Java (with Python Exposure)- Mumbai

Experience - 4 to 10 years

Kindly revert over mail if you are interested.

Java Developer – Job Description

Key Responsibilities

Design and develop robust backend services and APIs primarily using Java (Spring Boot)
Contribute to Python-based components where needed for automation, scripting, or lightweight services
Build, integrate, and optimize RESTful APIs and microservices
Work with relational and NoSQL databases
Write unit and integration tests (JUnit, PyTest)
Collaborate closely with DevOps, QA, and product teams
Participate in architecture reviews and design discussions
Help maintain code quality, organization, and automation

Required Skills & Qualifications

4 to 10 years of hands-on Java development experience
Strong experience with Spring Boot, JPA/Hibernate, and REST APIs
At least 1–2 years of hands-on experience with Python (e.g., for scripting, automation, or small services)
Familiarity with Python frameworks like Flask or FastAPI is a plus
Experience with SQL/NoSQL databases (e.g., PostgreSQL, MongoDB)
Good understanding of OOP, design patterns, and software engineering best practices
Familiarity with Docker, Git, and CI/CD pipelines

Data Engineer

at Tata Consultancy Services

2 recruiters

Agency job

via Risk Resources LLP hyd by Jhansi Padiy

AnyWhareIndia, Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Pune, Hyderabad, Indore, Kolkata

5 - 11 yrs

₹6L - ₹30L / yr

Snowflake

Python

PySpark

SQL

Role descriptions / Expectations from the Role

· 6-7 years of IT development experience with min 3+ years hands-on experience in Snowflake

· Strong experience in building/designing the data warehouse or data lake, and data mart end-to-end implementation experience focusing on large enterprise scale and Snowflake implementations on any of the hyper scalers.

· Strong experience with building productionized data ingestion and data pipelines in Snowflake

· Good knowledge of Snowflake's architecture, features likie Zero-Copy Cloning, Time Travel, and performance tuning capabilities

· Should have good exp on Snowflake RBAC and data security.

· Strong experience in Snowflake features including new snowflake features.

· Should have good experience in Python/Pyspark.

· Should have experience in AWS services (S3, Glue, Lambda, Secrete Manager, DMS) and few Azure services (Blob storage, ADLS, ADF)

· Should have experience/knowledge in orchestration and scheduling tools experience like Airflow

· Should have good understanding on ETL or ELT processes and ETL tools.

Role descriptions / Expectations from the Role

· 6-7 years of IT development experience with min 3+ years hands-on experience in Snowflake

· Strong experience with building productionized data ingestion and data pipelines in Snowflake

· Good knowledge of Snowflake's architecture, features likie Zero-Copy Cloning, Time Travel, and performance tuning capabilities

· Should have good exp on Snowflake RBAC and data security.

· Strong experience in Snowflake features including new snowflake features.

· Should have good experience in Python/Pyspark.

· Should have experience in AWS services (S3, Glue, Lambda, Secrete Manager, DMS) and few Azure services (Blob storage, ADLS, ADF)

· Should have experience/knowledge in orchestration and scheduling tools experience like Airflow

· Should have good understanding on ETL or ELT processes and ETL tools.

Bigdata, pyspark

at Tata Consultancy Services

2 recruiters

Agency job

via Risk Resources LLP hyd by susmitha o

Pune

7 - 10 yrs

₹7L - ₹20L / yr

Big Data

PySpark

Google Cloud Platform (GCP)

True Hands-On Developer in Programming Languages like Java or Scala . Expertise in Apache Spark . Database modelling and working with any of the SQL or NoSQL Database is must. Working knowledge of Scripting languages like shell/python. Experience of working with Cloudera is Preferred Orchestration tools like Airflow or Oozie would be a value addition. Knowledge of Table formats like Delta or Iceberg is plus to have. Working experience of Version controls like Git, build tools like Maven is recommended. Having software development experience is good to have along with Data Engineering experience

AWS Data Engineer

at Deqode

1 recruiter

Posted by Alisha Das

Bengaluru (Bangalore), Mumbai, Pune, Chennai, Gurugram

5.6 - 7 yrs

₹10L - ₹28L / yr

Amazon Web Services (AWS)

Python

PySpark

SQL

Job Summary:

As an AWS Data Engineer, you will be responsible for designing, developing, and maintaining scalable, high-performance data pipelines using AWS services. With 6+ years of experience, you’ll collaborate closely with data architects, analysts, and business stakeholders to build reliable, secure, and cost-efficient data infrastructure across the organization.

Key Responsibilities:

Design, develop, and manage scalable data pipelines using AWS Glue, Lambda, and other serverless technologies
Implement ETL workflows and transformation logic using PySpark and Python on AWS Glue
Leverage AWS Redshift for warehousing, performance tuning, and large-scale data queries
Work with AWS DMS and RDS for database integration and migration
Optimize data flows and system performance for speed and cost-effectiveness
Deploy and manage infrastructure using AWS CloudFormation templates
Collaborate with cross-functional teams to gather requirements and build robust data solutions
Ensure data integrity, quality, and security across all systems and processes

Required Skills & Experience:

6+ years of experience in Data Engineering with strong AWS expertise
Proficient in Python and PySpark for data processing and ETL development
Hands-on experience with AWS Glue, Lambda, DMS, RDS, and Redshift
Strong SQL skills for building complex queries and performing data analysis
Familiarity with AWS CloudFormation and infrastructure as code principles
Good understanding of serverless architecture and cost-optimized design
Ability to write clean, modular, and maintainable code
Strong analytical thinking and problem-solving skills

Job Summary:

Key Responsibilities:

Design, develop, and manage scalable data pipelines using AWS Glue, Lambda, and other serverless technologies
Implement ETL workflows and transformation logic using PySpark and Python on AWS Glue
Leverage AWS Redshift for warehousing, performance tuning, and large-scale data queries
Work with AWS DMS and RDS for database integration and migration
Optimize data flows and system performance for speed and cost-effectiveness
Deploy and manage infrastructure using AWS CloudFormation templates
Collaborate with cross-functional teams to gather requirements and build robust data solutions
Ensure data integrity, quality, and security across all systems and processes

Required Skills & Experience:

6+ years of experience in Data Engineering with strong AWS expertise
Proficient in Python and PySpark for data processing and ETL development
Hands-on experience with AWS Glue, Lambda, DMS, RDS, and Redshift
Strong SQL skills for building complex queries and performing data analysis
Familiarity with AWS CloudFormation and infrastructure as code principles
Good understanding of serverless architecture and cost-optimized design
Ability to write clean, modular, and maintainable code
Strong analytical thinking and problem-solving skills

Data Engineer

at Deqode

1 recruiter

Posted by purvisha Bhavsar

Gurugram, Delhi, Noida, Ghaziabad, Faridabad

6 - 10 yrs

₹5L - ₹15L / yr

Google Cloud Platform (GCP)

Python

PySpark

.NET

Scala

🚀 Hiring: Data Engineer | GCP + Spark + Python + .NET |

| 6–10 Yrs | Gurugram (Hybrid)

We’re looking for a skilled Data Engineer with strong hands-on experience in GCP, Spark-Scala, Python, and .NET.

📍 Location: Suncity, Sector 54, Gurugram (Hybrid – 3 days onsite)

💼 Experience: 6–10 Years

⏱️ Notice Period :- Immediate Joiner

Required Skills:

5+ years of experience in distributed computing (Spark) and software development.
3+ years of experience in Spark-Scala
5+ years of experience in Data Engineering.
5+ years of experience in Python.
Fluency in working with databases (preferably Postgres).
Have a sound understanding of object-oriented programming and development principles.
Experience working in an Agile Scrum or Kanban development environment.
Experience working with version control software (preferably Git).
Experience with CI/CD pipelines.
Experience with automated testing, including integration/delta, Load, and Performance

🚀 Hiring: Data Engineer | GCP + Spark + Python + .NET |

| 6–10 Yrs | Gurugram (Hybrid)

We’re looking for a skilled Data Engineer with strong hands-on experience in GCP, Spark-Scala, Python, and .NET.

📍 Location: Suncity, Sector 54, Gurugram (Hybrid – 3 days onsite)

💼 Experience: 6–10 Years

⏱️ Notice Period :- Immediate Joiner

Required Skills:

5+ years of experience in distributed computing (Spark) and software development.
3+ years of experience in Spark-Scala
5+ years of experience in Data Engineering.
5+ years of experience in Python.
Fluency in working with databases (preferably Postgres).
Have a sound understanding of object-oriented programming and development principles.
Experience working in an Agile Scrum or Kanban development environment.
Experience working with version control software (preferably Git).
Experience with CI/CD pipelines.
Experience with automated testing, including integration/delta, Load, and Performance

Lead Data Engineer

Data Havn

Agency job

via Infinium Associate by Toshi Srivastava

Noida

5 - 9 yrs

₹40L - ₹60L / yr

Python

SQL

Data engineering

Snowflake

ETL

+5 more

About the Role:

We are seeking a talented Lead Data Engineer to join our team and play a pivotal role in transforming raw data into valuable insights. As a Data Engineer, you will design, develop, and maintain robust data pipelines and infrastructure to support our organization's analytics and decision-making processes.

Responsibilities:

Data Pipeline Development: Build and maintain scalable data pipelines to extract, transform, and load (ETL) data from various sources (e.g., databases, APIs, files) into data warehouses or data lakes.
Data Infrastructure: Design, implement, and manage data infrastructure components, including data warehouses, data lakes, and data marts.
Data Quality: Ensure data quality by implementing data validation, cleansing, and standardization processes.
Team Management: Able to handle team.
Performance Optimization: Optimize data pipelines and infrastructure for performance and efficiency.
Collaboration: Collaborate with data analysts, scientists, and business stakeholders to understand their data needs and translate them into technical requirements.
Tool and Technology Selection: Evaluate and select appropriate data engineering tools and technologies (e.g., SQL, Python, Spark, Hadoop, cloud platforms).
Documentation: Create and maintain clear and comprehensive documentation for data pipelines, infrastructure, and processes.

Skills:

Strong proficiency in SQL and at least one programming language (e.g., Python, Java).
Experience with data warehousing and data lake technologies (e.g., Snowflake, AWS Redshift, Databricks).
Knowledge of cloud platforms (e.g., AWS, GCP, Azure) and cloud-based data services.
Understanding of data modeling and data architecture concepts.
Experience with ETL/ELT tools and frameworks.
Excellent problem-solving and analytical skills.
Ability to work independently and as part of a team.

Preferred Qualifications:

Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink).
Knowledge of machine learning and artificial intelligence concepts.
Experience with data visualization tools (e.g., Tableau, Power BI).
Certification in cloud platforms or data engineering.

About the Role:

Responsibilities:

Data Pipeline Development: Build and maintain scalable data pipelines to extract, transform, and load (ETL) data from various sources (e.g., databases, APIs, files) into data warehouses or data lakes.
Data Infrastructure: Design, implement, and manage data infrastructure components, including data warehouses, data lakes, and data marts.
Data Quality: Ensure data quality by implementing data validation, cleansing, and standardization processes.
Team Management: Able to handle team.
Performance Optimization: Optimize data pipelines and infrastructure for performance and efficiency.
Collaboration: Collaborate with data analysts, scientists, and business stakeholders to understand their data needs and translate them into technical requirements.
Tool and Technology Selection: Evaluate and select appropriate data engineering tools and technologies (e.g., SQL, Python, Spark, Hadoop, cloud platforms).
Documentation: Create and maintain clear and comprehensive documentation for data pipelines, infrastructure, and processes.

Skills:

Strong proficiency in SQL and at least one programming language (e.g., Python, Java).
Experience with data warehousing and data lake technologies (e.g., Snowflake, AWS Redshift, Databricks).
Knowledge of cloud platforms (e.g., AWS, GCP, Azure) and cloud-based data services.
Understanding of data modeling and data architecture concepts.
Experience with ETL/ELT tools and frameworks.
Excellent problem-solving and analytical skills.
Ability to work independently and as part of a team.

Preferred Qualifications:

Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink).
Knowledge of machine learning and artificial intelligence concepts.
Experience with data visualization tools (e.g., Tableau, Power BI).
Certification in cloud platforms or data engineering.

ETL Automation Tester

at E2E Infoware Management Services

Posted by Monika S

Bengaluru (Bangalore), Pune, Chennai

5 - 12 yrs

₹5L - ₹25L / yr

PySpark

Automation

SQL

Skill Name: ETL Automation Testing

Location: Bangalore, Chennai and Pune

Experience: 5+ Years

Required:

Experience in ETL Automation Testing

Strong experience in Pyspark.

Skill Name: ETL Automation Testing

Location: Bangalore, Chennai and Pune

Experience: 5+ Years

Required:

Experience in ETL Automation Testing

Strong experience in Pyspark.

Senior Data Engineer

at Wissen Technology

4 recruiters

Posted by Vishakha Walunj

Bengaluru (Bangalore), Pune, Mumbai

7 - 12 yrs

Best in industry

PySpark

databricks

SQL

Python

Required Skills:

Hands-on experience with Databricks, PySpark
Proficiency in SQL, Python, and Spark.
Understanding of data warehousing concepts and data modeling.
Experience with CI/CD pipelines and version control (e.g., Git).
Fundamental knowledge of any cloud services, preferably Azure or GCP.

Good to Have:

Bigquery
Experience with performance tuning and data governance.

Required Skills:

Hands-on experience with Databricks, PySpark
Proficiency in SQL, Python, and Spark.
Understanding of data warehousing concepts and data modeling.
Experience with CI/CD pipelines and version control (e.g., Git).
Fundamental knowledge of any cloud services, preferably Azure or GCP.

Good to Have:

Bigquery
Experience with performance tuning and data governance.

Lead Data Engineer- An IT Start Up

Data Havn

Agency job

via Infinium Associate by Toshi Srivastava

Noida

5 - 8 yrs

₹25L - ₹40L / yr

Data engineering

Python

SQL

Data Warehouse (DWH)

ETL

+6 more

About the Role:

We are seeking a talented Lead Data Engineer to join our team and play a pivotal role in transforming raw data into valuable insights. As a Data Engineer, you will design, develop, and maintain robust data pipelines and infrastructure to support our organization's analytics and decision-making processes.

Responsibilities:

Data Pipeline Development: Build and maintain scalable data pipelines to extract, transform, and load (ETL) data from various sources (e.g., databases, APIs, files) into data warehouses or data lakes.
Data Infrastructure: Design, implement, and manage data infrastructure components, including data warehouses, data lakes, and data marts.
Data Quality: Ensure data quality by implementing data validation, cleansing, and standardization processes.
Team Management: Able to handle team.
Performance Optimization: Optimize data pipelines and infrastructure for performance and efficiency.
Collaboration: Collaborate with data analysts, scientists, and business stakeholders to understand their data needs and translate them into technical requirements.
Tool and Technology Selection: Evaluate and select appropriate data engineering tools and technologies (e.g., SQL, Python, Spark, Hadoop, cloud platforms).
Documentation: Create and maintain clear and comprehensive documentation for data pipelines, infrastructure, and processes.

Skills:

Strong proficiency in SQL and at least one programming language (e.g., Python, Java).
Experience with data warehousing and data lake technologies (e.g., Snowflake, AWS Redshift, Databricks).
Knowledge of cloud platforms (e.g., AWS, GCP, Azure) and cloud-based data services.
Understanding of data modeling and data architecture concepts.
Experience with ETL/ELT tools and frameworks.
Excellent problem-solving and analytical skills.
Ability to work independently and as part of a team.

Preferred Qualifications:

Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink).
Knowledge of machine learning and artificial intelligence concepts.
Experience with data visualization tools (e.g., Tableau, Power BI).
Certification in cloud platforms or data engineering.

About the Role:

Responsibilities:

Data Pipeline Development: Build and maintain scalable data pipelines to extract, transform, and load (ETL) data from various sources (e.g., databases, APIs, files) into data warehouses or data lakes.
Data Infrastructure: Design, implement, and manage data infrastructure components, including data warehouses, data lakes, and data marts.
Data Quality: Ensure data quality by implementing data validation, cleansing, and standardization processes.
Team Management: Able to handle team.
Performance Optimization: Optimize data pipelines and infrastructure for performance and efficiency.
Collaboration: Collaborate with data analysts, scientists, and business stakeholders to understand their data needs and translate them into technical requirements.
Tool and Technology Selection: Evaluate and select appropriate data engineering tools and technologies (e.g., SQL, Python, Spark, Hadoop, cloud platforms).
Documentation: Create and maintain clear and comprehensive documentation for data pipelines, infrastructure, and processes.

Skills:

Strong proficiency in SQL and at least one programming language (e.g., Python, Java).
Experience with data warehousing and data lake technologies (e.g., Snowflake, AWS Redshift, Databricks).
Knowledge of cloud platforms (e.g., AWS, GCP, Azure) and cloud-based data services.
Understanding of data modeling and data architecture concepts.
Experience with ETL/ELT tools and frameworks.
Excellent problem-solving and analytical skills.
Ability to work independently and as part of a team.

Preferred Qualifications:

Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink).
Knowledge of machine learning and artificial intelligence concepts.
Experience with data visualization tools (e.g., Tableau, Power BI).
Certification in cloud platforms or data engineering.

AWS Data Engineer

at Deqode

1 recruiter

Posted by Roshni Maji

Pune, Bengaluru (Bangalore), Gurugram, Chennai, Mumbai

5 - 7 yrs

₹6L - ₹20L / yr

Amazon Web Services (AWS)

Amazon Redshift

AWS Glue

Python

PySpark

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
Optimize data models and storage for cost-efficiency and performance.
Write advanced SQL queries to support complex data analysis and reporting requirements.
Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
Ensure high data quality and integrity across platforms and processes.
Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

Strong hands-on experience with Python or PySpark for data processing.
Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
Proficiency in writing complex SQL queries and optimizing them for performance.
Familiarity with serverless architectures and AWS best practices.
Experience in designing and maintaining robust data architectures and data lakes.
Ability to troubleshoot and resolve data pipeline issues efficiently.
Strong communication and stakeholder management skills.

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

Key Responsibilities:

Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
Optimize data models and storage for cost-efficiency and performance.
Write advanced SQL queries to support complex data analysis and reporting requirements.
Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
Ensure high data quality and integrity across platforms and processes.
Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

Strong hands-on experience with Python or PySpark for data processing.
Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
Proficiency in writing complex SQL queries and optimizing them for performance.
Familiarity with serverless architectures and AWS best practices.
Experience in designing and maintaining robust data architectures and data lakes.
Ability to troubleshoot and resolve data pipeline issues efficiently.
Strong communication and stakeholder management skills.

AWS Data Engineer

at Deqode

1 recruiter

Posted by Roshni Maji

Bengaluru (Bangalore), Pune, Mumbai, Chennai, Gurugram

5 - 7 yrs

₹5L - ₹19L / yr

Python

PySpark

Amazon Web Services (AWS)

aws

Amazon Redshift

+1 more

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

Key Responsibilities:

Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
Optimize data models and storage for cost-efficiency and performance.
Write advanced SQL queries to support complex data analysis and reporting requirements.
Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
Ensure high data quality and integrity across platforms and processes.
Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

Strong hands-on experience with Python or PySpark for data processing.
Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
Proficiency in writing complex SQL queries and optimizing them for performance.
Familiarity with serverless architectures and AWS best practices.
Experience in designing and maintaining robust data architectures and data lakes.
Ability to troubleshoot and resolve data pipeline issues efficiently.
Strong communication and stakeholder management skills.

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

Key Responsibilities:

Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
Optimize data models and storage for cost-efficiency and performance.
Write advanced SQL queries to support complex data analysis and reporting requirements.
Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
Ensure high data quality and integrity across platforms and processes.
Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

Strong hands-on experience with Python or PySpark for data processing.
Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
Proficiency in writing complex SQL queries and optimizing them for performance.
Familiarity with serverless architectures and AWS best practices.
Experience in designing and maintaining robust data architectures and data lakes.
Ability to troubleshoot and resolve data pipeline issues efficiently.
Strong communication and stakeholder management skills.

ETL Developer

at Deqode

1 recruiter

Posted by Mokshada Solanki

Bengaluru (Bangalore), Mumbai, Pune, Gurugram

4 - 5 yrs

₹4L - ₹20L / yr

SQL

Amazon Web Services (AWS)

Migration

PySpark

ETL

Job Summary:

Seeking a seasoned SQL + ETL Developer with 4+ years of experience in managing large-scale datasets and cloud-based data pipelines. The ideal candidate is hands-on with MySQL, PySpark, AWS Glue, and ETL workflows, with proven expertise in AWS migration and performance optimization.

Key Responsibilities:

Develop and optimize complex SQL queries and stored procedures to handle large datasets (100+ million records).
Build and maintain scalable ETL pipelines using AWS Glue and PySpark.
Work on data migration tasks in AWS environments.
Monitor and improve database performance; automate key performance indicators and reports.
Collaborate with cross-functional teams to support data integration and delivery requirements.
Write shell scripts for automation and manage ETL jobs efficiently.

Required Skills:

Strong experience with MySQL, complex SQL queries, and stored procedures.
Hands-on experience with AWS Glue, PySpark, and ETL processes.
Good understanding of AWS ecosystem and migration strategies.
Proficiency in shell scripting.
Strong communication and collaboration skills.

Nice to Have:

Working knowledge of Python.
Experience with AWS RDS.

Job Summary:

Key Responsibilities:

Develop and optimize complex SQL queries and stored procedures to handle large datasets (100+ million records).
Build and maintain scalable ETL pipelines using AWS Glue and PySpark.
Work on data migration tasks in AWS environments.
Monitor and improve database performance; automate key performance indicators and reports.
Collaborate with cross-functional teams to support data integration and delivery requirements.
Write shell scripts for automation and manage ETL jobs efficiently.

Required Skills:

Strong experience with MySQL, complex SQL queries, and stored procedures.
Hands-on experience with AWS Glue, PySpark, and ETL processes.
Good understanding of AWS ecosystem and migration strategies.
Proficiency in shell scripting.
Strong communication and collaboration skills.

Nice to Have:

Working knowledge of Python.
Experience with AWS RDS.

Data Engineer - AWS

at Deqode

1 recruiter

Posted by Shraddha Katare

Bengaluru (Bangalore), Pune, Chennai, Mumbai, Gurugram

5 - 7 yrs

₹5L - ₹19L / yr

Amazon Web Services (AWS)

Python

PySpark

SQL

redshift

Profile: AWS Data Engineer

Mode- Hybrid

Experience- 5+7 years

Locations - Bengaluru, Pune, Chennai, Mumbai, Gurugram

Roles and Responsibilities

Design and maintain ETL pipelines using AWS Glue and Python/PySpark
Optimize SQL queries for Redshift and Athena
Develop Lambda functions for serverless data processing
Configure AWS DMS for database migration and replication
Implement infrastructure as code with CloudFormation
Build optimized data models for performance
Manage RDS databases and AWS service integrations
Troubleshoot and improve data processing efficiency
Gather requirements from business stakeholders
Implement data quality checks and validation
Document data pipelines and architecture
Monitor workflows and implement alerting
Keep current with AWS services and best practices

Required Technical Expertise:

Python/PySpark for data processing
AWS Glue for ETL operations
Redshift and Athena for data querying
AWS Lambda and serverless architecture
AWS DMS and RDS management
CloudFormation for infrastructure
SQL optimization and performance tuning

Profile: AWS Data Engineer

Mode- Hybrid

Experience- 5+7 years

Locations - Bengaluru, Pune, Chennai, Mumbai, Gurugram

Roles and Responsibilities

Design and maintain ETL pipelines using AWS Glue and Python/PySpark
Optimize SQL queries for Redshift and Athena
Develop Lambda functions for serverless data processing
Configure AWS DMS for database migration and replication
Implement infrastructure as code with CloudFormation
Build optimized data models for performance
Manage RDS databases and AWS service integrations
Troubleshoot and improve data processing efficiency
Gather requirements from business stakeholders
Implement data quality checks and validation
Document data pipelines and architecture
Monitor workflows and implement alerting
Keep current with AWS services and best practices

Required Technical Expertise:

Python/PySpark for data processing
AWS Glue for ETL operations
Redshift and Athena for data querying
AWS Lambda and serverless architecture
AWS DMS and RDS management
CloudFormation for infrastructure
SQL optimization and performance tuning

AWS Data Engineer

at Deqode

1 recruiter

Posted by Alisha Das

Pune, Mumbai, Bengaluru (Bangalore), Chennai

4 - 7 yrs

₹5L - ₹15L / yr

Amazon Web Services (AWS)

Python

PySpark

Glue semantics

Amazon Redshift

+1 more

Job Overview:

We are seeking an experienced AWS Data Engineer to join our growing data team. The ideal candidate will have hands-on experience with AWS Glue, Redshift, PySpark, and other AWS services to build robust, scalable data pipelines. This role is perfect for someone passionate about data engineering, automation, and cloud-native development.

Key Responsibilities:

Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
Integrate data from diverse sources and ensure its quality, consistency, and reliability.
Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
Automate data validation, transformation, and loading processes to support real-time and batch data processing.
Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.

Required Skills:

5 to 7 years of hands-on experience in data engineering roles.
Strong proficiency in Python and PySpark for data transformation and scripting.
Deep understanding and practical experience with AWS Glue, AWS Redshift, S3, and other AWS data services.
Solid understanding of SQL and database optimization techniques.
Experience working with large-scale data pipelines and high-volume data environments.
Good knowledge of data modeling, warehousing, and performance tuning.

Preferred/Good to Have:

Experience with workflow orchestration tools like Airflow or Step Functions.
Familiarity with CI/CD for data pipelines.
Knowledge of data governance and security best practices on AWS.

Job Overview:

Key Responsibilities:

Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
Integrate data from diverse sources and ensure its quality, consistency, and reliability.
Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
Automate data validation, transformation, and loading processes to support real-time and batch data processing.
Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.

Required Skills:

5 to 7 years of hands-on experience in data engineering roles.
Strong proficiency in Python and PySpark for data transformation and scripting.
Deep understanding and practical experience with AWS Glue, AWS Redshift, S3, and other AWS data services.
Solid understanding of SQL and database optimization techniques.
Experience working with large-scale data pipelines and high-volume data environments.
Good knowledge of data modeling, warehousing, and performance tuning.

Preferred/Good to Have:

Experience with workflow orchestration tools like Airflow or Step Functions.
Familiarity with CI/CD for data pipelines.
Knowledge of data governance and security best practices on AWS.

ETL Developer

at Deqode

1 recruiter

Posted by Shraddha Katare

Pune, Mumbai, Bengaluru (Bangalore), Gurugram

4 - 6 yrs

₹5L - ₹10L / yr

ETL

SQL

Amazon Web Services (AWS)

PySpark

KPI

Role - ETL Developer

Work Mode - Hybrid

Experience- 4+ years

Location - Pune, Gurgaon, Bengaluru, Mumbai

Required Skills - AWS, AWS Glue, Pyspark, ETL, SQL

Required Skills:

4+ years of hands-on experience in MySQL, including SQL queries and procedure development
Experience in Pyspark, AWS, AWS Glue
Experience in AWS ,Migration
Experience with automated scripting and tracking KPIs/metrics for database performance
Proficiency in shell scripting and ETL.
Strong communication skills and a collaborative team player
Knowledge of Python and AWS RDS is a plus

Role - ETL Developer

Work Mode - Hybrid

Experience- 4+ years

Location - Pune, Gurgaon, Bengaluru, Mumbai

Required Skills - AWS, AWS Glue, Pyspark, ETL, SQL

Required Skills:

4+ years of hands-on experience in MySQL, including SQL queries and procedure development
Experience in Pyspark, AWS, AWS Glue
Experience in AWS ,Migration
Experience with automated scripting and tracking KPIs/metrics for database performance
Proficiency in shell scripting and ETL.
Strong communication skills and a collaborative team player
Knowledge of Python and AWS RDS is a plus

Data Engineer

at Wissen Technology

4 recruiters

Posted by Hanisha Pralayakaveri

Bengaluru (Bangalore), Mumbai

5 - 9 yrs

Best in industry

Python

Amazon Web Services (AWS)

PySpark

Data engineering

Job Description: Data Engineer

Position Overview:

Role Overview

We are seeking a skilled Python Data Engineer with expertise in designing and implementing data solutions using the AWS cloud platform. The ideal candidate will be responsible for building and maintaining scalable, efficient, and secure data pipelines while leveraging Python and AWS services to enable robust data analytics and decision-making processes.

Key Responsibilities

· Design, develop, and optimize data pipelines using Python and AWS services such as Glue, Lambda, S3, EMR, Redshift, Athena, and Kinesis.

· Implement ETL/ELT processes to extract, transform, and load data from various sources into centralized repositories (e.g., data lakes or data warehouses).

· Collaborate with cross-functional teams to understand business requirements and translate them into scalable data solutions.

· Monitor, troubleshoot, and enhance data workflows for performance and cost optimization.

· Ensure data quality and consistency by implementing validation and governance practices.

· Work on data security best practices in compliance with organizational policies and regulations.

· Automate repetitive data engineering tasks using Python scripts and frameworks.

· Leverage CI/CD pipelines for deployment of data workflows on AWS.