
50+ PySpark Jobs in India

Apply to 50+ PySpark Jobs on CutShort.io. Find your next job, effortlessly. Browse PySpark Jobs and apply today!

Kanerika Software

Posted by Ariba Khan
Hyderabad, Indore, Ahmedabad
7 - 10 yrs
Upto ₹35L / yr (Varies)
Snowflake schema
Python
SQL
Databricks
PySpark

About Kanerika:

Kanerika Inc. is a premier global software products and services firm that specializes in providing innovative solutions and services for data-driven enterprises. Our focus is to empower businesses to achieve their digital transformation goals and maximize their business impact through the effective use of data and AI.


We leverage cutting-edge technologies in data analytics, data governance, AI-ML, GenAI/ LLM and industry best practices to deliver custom solutions that help organizations optimize their operations, enhance customer experiences, and drive growth.


Awards and Recognitions:

Kanerika has won several awards over the years, including:

1. Best Place to Work 2023 by Great Place to Work®

2. Top 10 Most Recommended RPA Start-Ups in 2022 by RPA Today

3. NASSCOM Emerge 50 Award in 2014

4. Frost & Sullivan India 2021 Technology Innovation Award for its Kompass composable solution architecture

5. Kanerika has also been recognized for its commitment to customer privacy and data security, having achieved ISO 27701, SOC2, and GDPR compliances.


Working for us:

Kanerika is rated 4.6/5 on Glassdoor, for many good reasons. We truly value our employees' growth, well-being, and diversity, and people’s experiences bear this out. At Kanerika, we offer a host of enticing benefits that create an environment where you can thrive both personally and professionally. From our inclusive hiring practices and mandatory training on creating a safe work environment to our flexible working hours and generous parental leave, we prioritize the well-being and success of our employees.


Our commitment to professional development is evident through our mentorship programs, job training initiatives, and support for professional certifications. Additionally, our company-sponsored outings and various time-off benefits ensure a healthy work-life balance. Join us at Kanerika and become part of a vibrant and diverse community where your talents are recognized, your growth is nurtured, and your contributions make a real impact. See the benefits section below for the perks you’ll get while working for Kanerika.


Role Responsibilities: 

The following are high-level responsibilities you will take on, though they are not limited to this list:

  • Design, develop, and implement modern data pipelines, data models, and ETL/ELT processes (see the example sketch after this list).
  • Architect and optimize data lake and warehouse solutions using Microsoft Fabric, Databricks, or Snowflake.
  • Enable business analytics and self-service reporting through Power BI and other visualization tools.
  • Collaborate with data scientists, analysts, and business users to deliver reliable and high-performance data solutions.
  • Implement and enforce best practices for data governance, data quality, and security.
  • Mentor and guide junior data engineers; establish coding and design standards.
  • Evaluate emerging technologies and tools to continuously improve the data ecosystem.
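To make the pipeline work above concrete, here is a minimal, hypothetical PySpark batch ETL sketch; the S3 paths, column names, and table layout are illustrative assumptions, not specifics of this role.

```python
from pyspark.sql import SparkSession, functions as F

# Minimal batch ETL sketch: ingest raw CSV, clean it, and publish curated Parquet.
spark = SparkSession.builder.appName("orders_etl_example").getOrCreate()

# Extract: read raw files (hypothetical bucket and layout).
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

# Transform: deduplicate, fix types, derive a partition column, drop bad rows.
curated = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write partitioned Parquet for downstream analytics and BI.
curated.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/orders/"
)
```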


Required Qualifications:

  • Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
  • 7-10 years of experience in data engineering or data platform development.
  • Strong hands-on experience with SQL, Snowflake, Python, and Airflow.
  • Solid understanding of data modeling, data governance, security, and CI/CD practices.

Preferred Qualifications:

  • Experience in leading a team
  • Familiarity with data modeling techniques and practices for Power BI.
  • Knowledge of Azure Databricks or other data processing frameworks.
  • Knowledge of Microsoft Fabric or other Cloud Platforms.


What we need:

· B.Tech in Computer Science or equivalent.


Why join us?

  • Work with a passionate and innovative team in a fast-paced, growth-oriented environment.
  • Gain hands-on experience in data engineering with exposure to real-world projects.
  • Opportunity to learn from experienced professionals and enhance your technical skills.
  • Contribute to exciting initiatives and make an impact from day one.
  • Competitive compensation and potential for growth within the company.
  • Recognized for excellence in data and AI solutions with industry awards and accolades.


Employee Benefits:

1. Culture:

  • Open Door Policy: Encourages open communication and accessibility to management.
  • Open Office Floor Plan: Fosters a collaborative and interactive work environment.
  • Flexible Working Hours: Allows employees to have flexibility in their work schedules.
  • Employee Referral Bonus: Rewards employees for referring qualified candidates.
  • Appraisal Process Twice a Year: Provides regular performance evaluations and feedback.


2. Inclusivity and Diversity:

  • Hiring practices that promote diversity: Ensures a diverse and inclusive workforce.
  • Mandatory POSH training: Promotes a safe and respectful work environment.


3. Health Insurance and Wellness Benefits:

  • GMC and Term Insurance: Offers medical coverage and financial protection.
  • Health Insurance: Provides coverage for medical expenses.
  • Disability Insurance: Offers financial support in case of disability.


4. Child Care & Parental Leave Benefits:

  • Company-sponsored family events: Creates opportunities for employees and their families to bond.
  • Generous Parental Leave: Allows parents to take time off after the birth or adoption of a child.
  • Family Medical Leave: Offers leave for employees to take care of family members' medical needs.


5. Perks and Time-Off Benefits:

  • Company-sponsored outings: Organizes recreational activities for employees.
  • Gratuity: Provides a monetary benefit as a token of appreciation.
  • Provident Fund: Helps employees save for retirement.
  • Generous PTO: Offers more than the industry standard for paid time off.
  • Paid sick days: Allows employees to take paid time off when they are unwell.
  • Paid holidays: Gives employees paid time off for designated holidays.
  • Bereavement Leave: Provides time off for employees to grieve the loss of a loved one.


6. Professional Development Benefits:

  • L&D with FLEX- Enterprise Learning Repository: Provides access to a learning repository for professional development.
  • Mentorship Program: Offers guidance and support from experienced professionals.
  • Job Training: Provides training to enhance job-related skills.
  • Professional Certification Reimbursements: Assists employees in obtaining professional certifications.
  • Promote from Within: Encourages internal growth and advancement opportunities.
Global Digital Transformation Solutions Provider

Agency job
via Peak Hire Solutions by Dhara Thakkar
Hyderabad
5 - 7 yrs
₹15L - ₹21L / yr
Python
Terraform
PySpark
Amazon Web Services (AWS)

Job Details

Job Title: Lead I - Data Engineering (Python, AWS Glue, PySpark, Terraform)

Industry: Global digital transformation solutions provider

Domain - Information technology (IT)

Experience Required: 5-7 years

Employment Type: Full Time

Job Location: Hyderabad

CTC Range: Best in Industry

 

Job Description

Data Engineer with AWS, Python, Glue, Terraform, Step Functions, and Spark

 

Skills: Python, AWS Glue, PySpark, Terraform - All are mandatory

 

******

Notice period - 0 to 15 days only

Job stability is mandatory

Location: Hyderabad 

Quantiphi

Posted by Nikita Sinha
Bengaluru (Bangalore)
7 - 10 yrs
Upto ₹40L / yr (Varies)
Amazon Web Services (AWS)
PySpark
SQL

We are hiring an Associate Technical Architect with strong expertise in AWS-based Data Platforms to design scalable data lakes, warehouses, and enterprise data pipelines while working with global teams.


Key Responsibilities

  • Design and implement scalable data warehouse, data lake, and lakehouse architectures on AWS
  • Build resilient and modular data pipelines using native AWS services
  • Architect cloud-based data platforms and evaluate service trade-offs
  • Optimize large-scale data processing and query performance (see the partitioning sketch after this list)
  • Collaborate with global cross-functional teams (Engineering, QA, PMs, Stakeholders)
  • Communicate technical roadmap, risks, and mitigation strategies
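As a small illustration of the performance-optimization point above, here is a hedged PySpark sketch showing partitioned writes and partition-pruned reads on S3; the dataset, columns, and paths are assumptions for the example only.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("partitioning_example").getOrCreate()

# Hypothetical events dataset with an event_date column used as the partition key.
events = (
    spark.read.parquet("s3://example-bucket/raw/events/")
         .withColumn("event_date", F.to_date("event_ts"))
)

# Write partitioned by date so query engines (Spark, Athena, Redshift Spectrum)
# can prune partitions instead of scanning the full dataset.
events.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/events/"
)

# Reading with a filter on the partition column scans only the matching folders.
recent = (
    spark.read.parquet("s3://example-bucket/curated/events/")
         .filter(F.col("event_date") >= "2024-01-01")
)
recent.groupBy("event_date").count().show()
```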

Must-Have Skills

  • 8+ years of experience in AWS Data Engineering / Data Architecture
  • Hands-on experience with AWS services:
      • Amazon S3
      • AWS Glue
      • AWS Lambda
      • Amazon EMR
      • AWS Kinesis (Streams & Firehose)
      • AWS Step Functions / MWAA
      • Amazon Redshift (Spectrum & Serverless)
      • Amazon Athena
      • Amazon RDS
      • AWS Lake Formation
      • AWS DMS, EventBridge, SNS, SQS
  • Strong programming skills in Python & PySpark
  • Advanced SQL with query optimization & performance tuning
  • Deep understanding of:
      • MPP databases
      • Partitioning & indexing strategies
      • Data modeling (Dimensional, Normalized, Lakehouse)
  • Experience building resilient ETL/data pipelines
  • Knowledge of AWS fundamentals:
      • Security
      • Networking
      • Disaster Recovery
      • Scalability & resilience
  • Experience with on-prem → AWS migrations
  • AWS Certification (Solution Architect Associate / Data Engineer Associate)

Good-to-Have Skills

  • Domain experience: FSI / Retail / CPG
  • Data governance & virtualization tools:
      • Collibra
      • Denodo
  • QuickSight / Power BI / Tableau
  • Exposure to:
      • Terraform (IaC)
      • CI/CD pipelines
      • SSIS
      • Apache NiFi, Hive, HDFS, Sqoop
      • Data Mesh architecture
  • Experience with NoSQL databases:
      • DynamoDB
      • MongoDB
      • DocumentDB

Soft Skills

  • Strong problem-solving and analytical mindset
  • Excellent communication and stakeholder management skills
  • Ability to translate technical concepts into business outcomes
  • Experience working with distributed/global teams
Quantiphi

Posted by Nikita Sinha
Bengaluru (Bangalore)
3 - 6 yrs
Upto ₹28L / yr (Varies)
Amazon Web Services (AWS)
Data engineering
PySpark
SQL
Data migration

As a Senior Data Engineer, you will be responsible for building and delivering a Lakehouse-based data pipeline. This is a hands-on role focused on implementing real-time and batch data ingestion, processing, and delivery workflows, while ensuring strong monitoring, observability, and data quality across the entire pipeline.
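As a rough sketch of the kind of real-time ingestion described above, here is a minimal Spark Structured Streaming job reading from Kafka and writing to a Delta table; the broker address, topic, schema, and paths are hypothetical placeholders, and the role's actual stack (Confluent Kafka, Glue, EMR) may differ.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("kafka_to_delta_example").getOrCreate()

# Read a stream of JSON events from Kafka (hypothetical broker and topic).
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker.example.com:9092")
         .option("subscribe", "orders")
         .load()
         .selectExpr("CAST(value AS STRING) AS json_value", "timestamp")
)

# Parse the payload field by field and add a processing date.
parsed = (
    events.select(
        F.get_json_object("json_value", "$.order_id").alias("order_id"),
        F.get_json_object("json_value", "$.amount").cast("double").alias("amount"),
        F.col("timestamp").alias("event_ts"),
    ).withColumn("event_date", F.to_date("event_ts"))
)

# Write the stream to a Delta location with checkpointing for recovery.
query = (
    parsed.writeStream.format("delta")
          .option("checkpointLocation", "s3://example-bucket/checkpoints/orders/")
          .outputMode("append")
          .start("s3://example-bucket/bronze/orders/")
)
query.awaitTermination()
```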

Must-Have Skills

  • 3+ years of hands-on experience building large-scale data pipelines
  • Strong experience with Spark Streaming, AWS Glue, and EMR for real-time and batch processing
  • Proficiency in PySpark/Python, including building Kafka producers for data ingestion
  • Experience working with Confluent Kafka and Spark Streaming for ingestion from on-premise sources
  • Solid understanding of AWS services including:
  1. S3
  2. Redshift
  3. Glue
  4. CloudWatch
  5. Secrets Manager
  • Experience working with Medallion Architecture and hybrid data destinations (e.g., Redshift + on-prem Oracle)
  • Ability to implement monitoring dashboards and observability using tools like CloudWatch or Datadog
  • Strong SQL skills for data validation and job-level metrics development
  • Experience building alerting mechanisms for pipeline failures and performance issues
  • Strong collaboration and communication skills
  • Proven ownership mindset — driving deliverables from design to deployment
  • Experience mentoring junior engineers, conducting code reviews, and guiding best practices
  • AWS Certified Data Engineer – Associate (preferred/required)

Good-to-Have Skills

  • Experience with orchestration tools such as Apache Airflow or AWS Step Functions
  • Exposure to Big Data ecosystem tools:
  1. Sqoop
  2. HDFS
  3. Hive
  4. NiFi
  • Exposure to Terraform for infrastructure automation
  • Familiarity with CI/CD pipelines for data workflows
Global digital transformation solutions provider.

Agency job
via Peak Hire Solutions by Dhara Thakkar
Hyderabad
5 - 8 yrs
₹11L - ₹20L / yr
PySpark
Apache Kafka
Data architecture
Amazon Web Services (AWS)
EMR
+32 more

JOB DETAILS:

* Job Title: Lead II - Software Engineering - AWS, Apache Spark (PySpark/Scala), Apache Kafka

* Industry: Global digital transformation solutions provider

* Salary: Best in Industry

* Experience: 5-8 years

* Location: Hyderabad

 

Job Summary

We are seeking a skilled Data Engineer to design, build, and optimize scalable data pipelines and cloud-based data platforms. The role involves working with large-scale batch and real-time data processing systems, collaborating with cross-functional teams, and ensuring data reliability, security, and performance across the data lifecycle.


Key Responsibilities

ETL Pipeline Development & Optimization

  • Design, develop, and maintain complex end-to-end ETL pipelines for large-scale data ingestion and processing.
  • Optimize data pipelines for performance, scalability, fault tolerance, and reliability.

Big Data Processing

  • Develop and optimize batch and real-time data processing solutions using Apache Spark (PySpark/Scala) and Apache Kafka.
  • Ensure fault-tolerant, scalable, and high-performance data processing systems.

Cloud Infrastructure Development

  • Build and manage scalable, cloud-native data infrastructure on AWS.
  • Design resilient and cost-efficient data pipelines adaptable to varying data volume and formats.

Real-Time & Batch Data Integration

  • Enable seamless ingestion and processing of real-time streaming and batch data sources (e.g., AWS MSK).
  • Ensure consistency, data quality, and a unified view across multiple data sources and formats.

Data Analysis & Insights

  • Partner with business teams and data scientists to understand data requirements.
  • Perform in-depth data analysis to identify trends, patterns, and anomalies.
  • Deliver high-quality datasets and present actionable insights to stakeholders.

CI/CD & Automation

  • Implement and maintain CI/CD pipelines using Jenkins or similar tools.
  • Automate testing, deployment, and monitoring to ensure smooth production releases.

Data Security & Compliance

  • Collaborate with security teams to ensure compliance with organizational and regulatory standards (e.g., GDPR, HIPAA).
  • Implement data governance practices ensuring data integrity, security, and traceability.

Troubleshooting & Performance Tuning

  • Identify and resolve performance bottlenecks in data pipelines.
  • Apply best practices for monitoring, tuning, and optimizing data ingestion and storage.

Collaboration & Cross-Functional Work

  • Work closely with engineers, data scientists, product managers, and business stakeholders.
  • Participate in agile ceremonies, sprint planning, and architectural discussions.


Skills & Qualifications

Mandatory (Must-Have) Skills

  1. AWS Expertise
  • Hands-on experience with AWS Big Data services such as EMR, Managed Apache Airflow, Glue, S3, DMS, MSK, and EC2.
  • Strong understanding of cloud-native data architectures.
  2. Big Data Technologies
  • Proficiency in PySpark or Scala Spark and SQL for large-scale data transformation and analysis.
  • Experience with Apache Spark and Apache Kafka in production environments.
  3. Data Frameworks
  • Strong knowledge of Spark DataFrames and Datasets.
  4. ETL Pipeline Development
  • Proven experience in building scalable and reliable ETL pipelines for both batch and real-time data processing.
  5. Database Modeling & Data Warehousing
  • Expertise in designing scalable data models for OLAP and OLTP systems.
  6. Data Analysis & Insights
  • Ability to perform complex data analysis and extract actionable business insights.
  • Strong analytical and problem-solving skills with a data-driven mindset.
  7. CI/CD & Automation
  • Basic to intermediate experience with CI/CD pipelines using Jenkins or similar tools.
  • Familiarity with automated testing and deployment workflows.

 

Good-to-Have (Preferred) Skills

  • Knowledge of Java for data processing applications.
  • Experience with NoSQL databases (e.g., DynamoDB, Cassandra, MongoDB).
  • Familiarity with data governance frameworks and compliance tooling.
  • Experience with monitoring and observability tools such as AWS CloudWatch, Splunk, or Dynatrace.
  • Exposure to cost optimization strategies for large-scale cloud data platforms.

 

Skills: big data, scala spark, apache spark, ETL pipeline development

 

******

Notice period - 0 to 15 days only

Job stability is mandatory

Location: Hyderabad

Note: If a candidate is a short-notice joiner, based in Hyderabad, and fits within the approved budget, we will proceed with an offer.

F2F Interview: 14th Feb 2026

3 days in office, Hybrid model.

 


Wissen Technology

Posted by Janane Mohanasankaran
Mumbai, Pune
3 - 6 yrs
Best in industry
Python
PySpark
pandas
SQL
ADF
+2 more

* Python (3 to 6 years): Strong expertise in data workflows and automation

* Spark (PySpark): Hands-on experience with large-scale data processing

* Pandas: For detailed data analysis and validation

* Delta Lake: Managing structured and semi-structured datasets at scale

* SQL: Querying and performing operations on Delta tables

* Azure Cloud: Compute and storage services

* Orchestrator: Good experience with either ADF or Airflow

Auxo AI

Posted by Kusuma Gullamajji
Bengaluru (Bangalore), Mumbai, Hyderabad, Gurugram
3 - 5 yrs
₹10L - ₹25L / yr
Python
PySpark
Amazon Web Services (AWS)
Glue

AuxoAI is seeking a skilled and experienced Data Engineer to join our dynamic team. The ideal candidate will have 3-5 years of prior experience in data engineering, with a strong background in AWS (Amazon Web Services) technologies. This role offers an exciting opportunity to work on diverse projects, collaborating with cross-functional teams to design, build, and optimize data pipelines and infrastructure.

Experience: 3 - 5 years

Notice: Immediate to 15 days

Responsibilities :

Design, develop, and maintain scalable data pipelines and ETL processes leveraging AWS services such as S3, Glue, EMR, Lambda, and Redshift.
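For illustration of the Glue-based pipeline work described above, here is a minimal AWS Glue PySpark job skeleton; the catalog database, table, and output path are hypothetical and would be replaced by the project's own resources.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve arguments and create contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read a source table registered in the Glue Data Catalog (hypothetical names).
source = glueContext.create_dynamic_frame.from_catalog(
    database="example_db", table_name="raw_orders"
)

# Simple transformation: keep only the columns downstream consumers need.
selected = source.select_fields(["order_id", "customer_id", "amount", "order_date"])

# Write the result to S3 as Parquet (hypothetical bucket).
glueContext.write_dynamic_frame.from_options(
    frame=selected,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/orders/"},
    format="parquet",
)

job.commit()
```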

Collaborate with data scientists and analysts to understand data requirements and implement solutions that support analytics and machine learning initiatives.

Optimize data storage and retrieval mechanisms to ensure performance, reliability, and cost-effectiveness.

Implement data governance and security best practices to ensure compliance and data integrity.

Troubleshoot and debug data pipeline issues, providing timely resolution and proactive monitoring.

Stay abreast of emerging technologies and industry trends, recommending innovative solutions to enhance data engineering capabilities.

Qualifications :

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

3 - 5 years of prior experience in data engineering, with a focus on designing and building data pipelines.

Proficiency in AWS services, particularly S3, Glue, EMR, Lambda, and Redshift.

Strong programming skills in languages such as Python, Java, or Scala.

Experience with SQL and NoSQL databases, data warehousing concepts, and big data technologies.

Familiarity with containerization technologies (e.g., Docker, Kubernetes) and orchestration tools (e.g., Apache Airflow) is a plus.

Global digital transformation solutions provider

Agency job
via Peak Hire Solutions by Dhara Thakkar
Trivandrum, Kochi (Cochin)
4 - 6 yrs
₹11L - ₹17L / yr
Windows Azure
Python
SQL Azure
Databricks
PySpark
+15 more

JOB DETAILS:

* Job Title: Associate III - Azure Data Engineer 

* Industry: Global digital transformation solutions provider

* Salary: Best in Industry

* Experience: 4-6 years

* Location: Trivandrum, Kochi

Job Description: Azure Data Engineer (4–6 Years Experience)

Job Type: Full-time 

Locations: Kochi, Trivandrum

 

Must-Have Skills

Azure & Data Engineering

  • Azure Data Factory (ADF)
  • Azure Databricks (PySpark); see the example read sketch after this skills list
  • Azure Synapse Analytics
  • Azure Data Lake Storage Gen2
  • Azure SQL Database
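As a small, hedged illustration of the Databricks-on-Azure skills listed above, here is a PySpark snippet reading from ADLS Gen2 and writing a Delta table; it assumes an Azure Databricks notebook (where spark and dbutils are predefined), and the storage account, container, and secret scope names are made up for the example.

```python
from pyspark.sql import functions as F

# Hypothetical ADLS Gen2 account/container accessed from an Azure Databricks notebook.
storage_account = "examplestorageacct"
container = "raw"

# Authenticate with an account key stored in a Key Vault-backed secret scope (assumed names).
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
    dbutils.secrets.get(scope="example-scope", key="adls-account-key"),
)

# Read raw CSV data over abfss:// and apply a light transformation.
sales = (
    spark.read.option("header", True)
         .csv(f"abfss://{container}@{storage_account}.dfs.core.windows.net/sales/")
         .withColumn("sale_date", F.to_date("sale_ts"))
)

# Write the curated result as a Delta table for Synapse/Power BI consumption.
sales.write.format("delta").mode("overwrite").saveAsTable("curated.sales")
```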

 

Programming & Querying

  • Python (PySpark)
  • SQL / Spark SQL

 

Data Modelling

  • Star & Snowflake schema
  • Dimensional modelling

 

Source Systems

  • SQL Server
  • Oracle
  • SAP
  • REST APIs
  • Flat files (CSV, JSON, XML)

 

CI/CD & Version Control

  • Git
  • Azure DevOps / GitHub Actions

 

Monitoring & Scheduling

  • ADF triggers
  • Databricks jobs
  • Log Analytics

 

Security

  • Managed Identity
  • Azure Key Vault
  • Azure RBAC / Access Control

 

Soft Skills

  • Strong analytical & problem-solving skills
  • Good communication and collaboration
  • Ability to work in Agile/Scrum environments
  • Self-driven and proactive

 

Good-to-Have Skills

  • Power BI basics
  • Delta Live Tables
  • Synapse Pipelines
  • Real-time processing (Event Hub / Stream Analytics)
  • Infrastructure as Code (Terraform / ARM templates)
  • Data governance tools like Azure Purview
  • Azure Data Engineer Associate (DP-203) certification

 

Educational Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.

 

Skills: Azure Data Factory, Azure Databricks, Azure Synapse, Azure Data Lake Storage

 

Must-Haves

Azure Data Factory (4-6 years), Azure Databricks/PySpark (4-6 years), Azure Synapse Analytics (4-6 years), SQL/Spark SQL (4-6 years), Git/Azure DevOps (4-6 years)

Skills: Azure, Azure Data Factory, Python, PySpark, SQL, REST API, Azure DevOps

Relevant experience: 4 - 6 years

Python is mandatory

 

******

Notice period - 0 to 15 days only (Feb joiners’ profiles only)

Location: Kochi

F2F Interview 7th Feb

Ekloud INC

Posted by Ashwini Rathod
India
6 - 20 yrs
₹5L - ₹30L / yr
ADF
Databricks
PySpark
SQL
Python
+2 more

Hiring: Azure Data Engineer


Experience level: 5–12 yrs

Location: Bangalore

Work arrangement: On-site

Budget Range: Flexible


Mandatory Skills:

Self-rating of 7+ is a must.

ADF, Databricks, PySpark, SQL - Mandatory

Good to have:

Delta Live Tables, Python, Team handling (Manager with 7+ yrs exp), Azure Functions, Unity Catalog, real-time streaming, Data pipelines

The Blue Owls Solutions

Posted by Apoorvo Chakraborty
Pune
6 - 10 yrs
₹20L - ₹30L / yr
Data governance
Data engineering
Team leadership
Data modeling
Synapse
+3 more

The Role


We are looking for an Azure Data Architect to join our team in Pune. You will be responsible for the end-to-end lifecycle of data solutions, from initial client requirement gathering and solution architecture design to leading the data engineering team through implementation. You will be the technical anchor for the project, ensuring that our data estates are scalable, governed, and high-performing.


Key Responsibilities

  • Architecture & Design: Design robust data architectures using Microsoft Fabric and Azure Synapse, focusing on Medallion architecture and metadata-driven frameworks.
  • End-to-End Delivery: Translate complex client business requirements into technical roadmaps and lead the team to deliver them on time.
  • Data Governance: Implement and manage enterprise-grade governance, data discovery, and lineage using Microsoft Purview.
  • Team Leadership: Act as the technical lead for the team, performing code reviews, mentoring junior engineers, and ensuring best practices in PySpark and SQL.
  • Client Management: Interface directly with stakeholders to define project scope and provide technical consultancy.


What We’re Looking For

  • 6+ Years in Data Engineering with at least 3+ years leading technical teams or designing architectures.
  • Expertise in Microsoft Fabric/Synapse: Deep experience with Lakehouses, Warehouses, and Spark-based processing.
  • Governance Specialist: Proven experience implementing Microsoft Purview for data cataloging, sensitivity labeling, and lineage.
  • Technical Breadth: Strong proficiency in PySpark, SQL, and Data Factory. Familiarity with Infrastructure as Code (Bicep/Terraform) is a major plus.

Why Work with Us?

  • Competitive Pay
  • Flexible Hours
  • Work on Microsoft’s latest (Fabric, Purview, Foundry) as a Designated Solutions Partner.
  • High-Stakes Impact: Solve complex, client-facing problems for enterprise leaders
  • Structured learning paths to help you master AI automation and Agentic AI.
Ganit Business Solutions

Agency job
via hirezyai by HR Hirezyai
Bengaluru (Bangalore), Chennai, Mumbai
5.5 - 12 yrs
₹15L - ₹25L / yr
Amazon Web Services (AWS)
PySpark
SQL

Roles & Responsibilities

  • Data Engineering Excellence: Design and implement data pipelines using formats like JSON, Parquet, CSV, and ORC, utilizing batch and streaming ingestion.
  • Cloud Data Migration Leadership: Lead cloud migration projects, developing scalable Spark pipelines.
  • Medallion Architecture: Implement Bronze, Silver, and Gold tables for scalable data systems (see the example sketch after this list).
  • Spark Code Optimization: Optimize Spark code to ensure efficient cloud migration.
  • Data Modeling: Develop and maintain data models with strong governance practices.
  • Data Cataloging & Quality: Implement cataloging strategies with Unity Catalog to maintain high-quality data.
  • Delta Live Table Leadership: Lead the design and implementation of Delta Live Tables (DLT) pipelines for secure, tamper-resistant data management.
  • Customer Collaboration: Collaborate with clients to optimize cloud migrations and ensure best practices in design and governance.
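As a rough, hedged illustration of the Medallion (Bronze/Silver/Gold) layering mentioned above, here is a compact PySpark/Delta sketch; the table names, columns, and paths are placeholders, not specifics of this engagement.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion_example").getOrCreate()

# Bronze: land raw data as-is, adding only ingestion metadata.
bronze = (
    spark.read.json("s3://example-bucket/landing/transactions/")
         .withColumn("_ingested_at", F.current_timestamp())
)
bronze.write.format("delta").mode("append").save("s3://example-bucket/bronze/transactions/")

# Silver: cleanse and conform the Bronze data (types, dedup, basic quality filters).
silver = (
    spark.read.format("delta").load("s3://example-bucket/bronze/transactions/")
         .dropDuplicates(["txn_id"])
         .withColumn("amount", F.col("amount").cast("double"))
         .filter(F.col("amount").isNotNull())
)
silver.write.format("delta").mode("overwrite").save("s3://example-bucket/silver/transactions/")

# Gold: business-level aggregate ready for reporting.
gold = (
    silver.groupBy("customer_id")
          .agg(F.sum("amount").alias("total_spend"), F.count("*").alias("txn_count"))
)
gold.write.format("delta").mode("overwrite").save("s3://example-bucket/gold/customer_spend/")
```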

Educational Qualifications

  • Experience: Minimum 5 years of hands-on experience in data engineering, with a proven track record in complex pipeline development and cloud-based data migration projects.
  • Education: Bachelor’s or higher degree in Computer Science, Data Engineering, or a related field.
  • Skills
  • Must-have: Proficiency in Spark, SQL, Python, and other relevant data processing technologies. Strong knowledge of Databricks and its components, including Delta Live Table (DLT) pipeline implementations. Expertise in on-premises to cloud Spark code optimization and Medallion Architecture.

Good to Have

  • Familiarity with AWS services (experience with additional cloud platforms like GCP or Azure is a plus).

Soft Skills

  • Excellent communication and collaboration skills, with the ability to work effectively with clients and internal teams.
  • Certifications
  • AWS/GCP/Azure Data Engineer Certification.


-
Remote only
8 - 13 yrs
₹10L - ₹33L / yr
Python
PySpark
Big Data
SQL

Role: Lead Data Engineer Core

Responsibilities: Lead end-to-end design, development, and delivery of complex cloud-based data pipelines.

Collaborate with architects and stakeholders to translate business requirements into technical data solutions.

Ensure scalability, reliability, and performance of data systems across environments. Provide mentorship and technical leadership to data engineering teams. Define and enforce best practices for data modeling, transformation, and governance.


Optimize data ingestion and transformation frameworks for efficiency and cost management. Contribute to data architecture design and review sessions across projects.


Qualifications: Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.

8+ years of experience in data engineering with proven leadership in designing cloud native data systems.


Strong expertise in Python, SQL, Apache Spark, and at least one cloud platform (Azure, AWS, or GCP). Experience with Big Data, Data Lake, Delta Lake, and Lakehouse architectures. Proficient in one or more database technologies (e.g., PostgreSQL, Redshift, Snowflake, and NoSQL databases).


Ability to recommend and implement scalable data pipelines.

Preferred Qualifications: Cloud certification (AWS, Azure, or GCP). Experience with Databricks, Snowflake, or Terraform. Familiarity with data governance, lineage, and observability tools. Strong collaboration skills and ability to influence data-driven decisions across teams.

AI-First Company

Agency job
via Peak Hire Solutions by Dhara Thakkar
Bengaluru (Bangalore), Mumbai, Hyderabad, Gurugram
5 - 17 yrs
₹30L - ₹45L / yr
Data engineering
Data architecture
SQL
Data modeling
GCS
+47 more

ROLES AND RESPONSIBILITIES:

You will be responsible for architecting, implementing, and optimizing Dremio-based data Lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.


  • Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
  • Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
  • Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
  • Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
  • Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
  • Support self-service analytics by enabling governed data products and semantic layers.
  • Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
  • Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.


IDEAL CANDIDATE:

  • Bachelor’s or Master’s in Computer Science, Information Systems, or related field.
  • 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
  • Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
  • Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
  • Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
  • Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
  • Excellent problem-solving, documentation, and stakeholder communication skills.


PREFERRED:

  • Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
  • Exposure to Snowflake, Databricks, or BigQuery environments.
  • Experience in high-tech, manufacturing, or enterprise data modernization programs.
Global digital transformation solutions provider.

Agency job
via Peak Hire Solutions by Dhara Thakkar
Bengaluru (Bangalore)
7 - 9 yrs
₹15L - ₹28L / yr
Databricks
Python
SQL
PySpark
Amazon Web Services (AWS)
+9 more

Role Proficiency:

This role requires proficiency in developing data pipelines, including coding and testing for ingesting, wrangling, transforming, and joining data from various sources. The ideal candidate should be adept in ETL tools like Informatica, Glue, Databricks, and DataProc, with strong coding skills in Python, PySpark, and SQL. This position demands independence and proficiency across various data domains. Expertise in data warehousing solutions such as Snowflake, BigQuery, Lakehouse, and Delta Lake is essential, including the ability to calculate processing costs and address performance issues. A solid understanding of DevOps and infrastructure needs is also required.


Skill Examples:

  1. Proficiency in SQL, Python, or other programming languages used for data manipulation.
  2. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF.
  3. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g., AWS Glue, BigQuery).
  4. Conduct tests on data pipelines and evaluate results against data quality and performance specifications.
  5. Experience in performance tuning.
  6. Experience in data warehouse design and cost improvements.
  7. Apply and optimize data models for efficient storage, retrieval, and processing of large datasets.
  8. Communicate and explain design/development aspects to customers.
  9. Estimate time and resource requirements for developing/debugging features/components.
  10. Participate in RFP responses and solutioning.
  11. Mentor team members and guide them in relevant upskilling and certification.

 

Knowledge Examples:

  1. Knowledge of various ETL services used by cloud providers, including Apache PySpark, AWS Glue, GCP DataProc/Dataflow, Azure ADF, and ADLS.
  2. Proficient in SQL for analytics and windowing functions.
  3. Understanding of data schemas and models.
  4. Familiarity with domain-related data.
  5. Knowledge of data warehouse optimization techniques.
  6. Understanding of data security concepts.
  7. Awareness of patterns, frameworks, and automation practices.


 

Additional Comments:

# of Resources: 22
Role(s): Technical Role
Location(s): India
Planned Start Date: 1/1/2026
Planned End Date: 6/30/2026

Project Overview:

Role Scope / Deliverables: We are seeking a highly skilled Data Engineer with strong experience in Databricks, PySpark, Python, SQL, and AWS to join our data engineering team on or before the first week of December 2025.

The candidate will be responsible for designing, developing, and optimizing large-scale data pipelines and analytics solutions that drive business insights and operational efficiency.

Design, build, and maintain scalable data pipelines using Databricks and PySpark.

Develop and optimize complex SQL queries for data extraction, transformation, and analysis.

Implement data integration solutions across multiple AWS services (S3, Glue, Lambda, Redshift, EMR, etc.).

Collaborate with analytics, data science, and business teams to deliver clean, reliable, and timely datasets.

Ensure data quality, performance, and reliability across data workflows.

Participate in code reviews, data architecture discussions, and performance optimization initiatives.

Support migration and modernization efforts for legacy data systems to modern cloud-based solutions.


Key Skills:

Hands-on experience with Databricks, PySpark & Python for building ETL/ELT pipelines.

Proficiency in SQL (performance tuning, complex joins, CTEs, window functions).
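To illustrate the SQL skills called out above (CTEs and window functions), here is a small hedged example run through spark.sql, assuming a Databricks notebook where a SparkSession named spark is already available; the table and column names are made up for the illustration.

```python
# Hypothetical Spark SQL example combining a CTE with a window function to pick
# each customer's most recent order; assumes a registered table named "orders".
latest_orders = spark.sql(
    """
    WITH ranked AS (
        SELECT
            customer_id,
            order_id,
            amount,
            order_ts,
            ROW_NUMBER() OVER (
                PARTITION BY customer_id
                ORDER BY order_ts DESC
            ) AS rn
        FROM orders
    )
    SELECT customer_id, order_id, amount, order_ts
    FROM ranked
    WHERE rn = 1
    """
)
latest_orders.show()
```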

Strong understanding of AWS services (S3, Glue, Lambda, Redshift, CloudWatch, etc.).

Experience with data modeling, schema design, and performance optimization.

Familiarity with CI/CD pipelines, version control (Git), and workflow orchestration (Airflow preferred).

Excellent problem-solving, communication, and collaboration skills.

 

Skills: Databricks, Pyspark & Python, Sql, Aws Services

 

Must-Haves

Python/PySpark (5+ years), SQL (5+ years), Databricks (3+ years), AWS Services (3+ years), ETL tools (Informatica, Glue, DataProc) (3+ years)

Hands-on experience with Databricks, PySpark & Python for ETL/ELT pipelines.

Proficiency in SQL (performance tuning, complex joins, CTEs, window functions).

Strong understanding of AWS services (S3, Glue, Lambda, Redshift, CloudWatch, etc.).

Experience with data modeling, schema design, and performance optimization.

Familiarity with CI/CD pipelines, Git, and workflow orchestration (Airflow preferred).


******

Notice period - Immediate to 15 days

Location: Bangalore

Wissen Technology

Posted by Swet Patel
Bengaluru (Bangalore)
5 - 13 yrs
Best in industry
Databricks
Python
SQL
PySpark
Spark

Key Responsibilities

We are seeking an experienced Data Engineer with a strong background in Databricks, Python, Spark/PySpark and SQL to design, develop, and optimize large-scale data processing applications. The ideal candidate will build scalable, high-performance data engineering solutions and ensure seamless data flow across cloud and on-premise platforms.

Key Responsibilities:

  • Design, develop, and maintain scalable data processing applications using Databricks, Python, and PySpark/Spark.
  • Write and optimize complex SQL queries for data extraction, transformation, and analysis.
  • Collaborate with data engineers, data scientists, and other stakeholders to understand business requirements and deliver high-quality solutions.
  • Ensure data integrity, performance, and reliability across all data processing pipelines.
  • Perform data analysis and implement data validation to ensure high data quality.
  • Implement and manage CI/CD pipelines for automated testing, integration, and deployment.
  • Contribute to continuous improvement of data engineering processes and tools.

Required Skills & Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • Proven experience as a Databricks developer with strong expertise in Python, SQL, and Spark/PySpark.
  • Strong proficiency in SQL, including working with relational databases and writing optimized queries.
  • Solid programming experience in Python, including data processing and automation.


Vy Systems

Posted by Kalki K
Remote only
4 - 12 yrs
₹18L - ₹28L / yr
Databricks
Amazon Web Services (AWS)
SQL
Python
PySpark

Job Summary


We are seeking an experienced Databricks Developer with strong skills in PySpark, SQL, and Python, and hands-on experience deploying data solutions on AWS (preferred) or Azure. The role involves designing, developing, and optimizing scalable data pipelines and analytics workflows on the Databricks platform.


Key Responsibilities

- Develop and optimize ETL/ELT pipelines using Databricks and PySpark.

- Build scalable data workflows on AWS (EC2, S3, Glue, Lambda, IAM) or Azure (ADF, ADLS, Synapse).

- Implement and manage Delta Lake (ACID, schema evolution, time travel); see the example sketch after this responsibilities list.

- Write efficient, complex SQL for transformation and analytics.

- Build and support batch and streaming ingestion (Kafka, Kinesis, EventHub).

- Optimize Databricks clusters, jobs, notebooks, and PySpark performance.

- Collaborate with cross-functional teams to deliver reliable data solutions.

- Ensure data governance, security, and compliance.

- Troubleshoot pipelines and support CI/CD deployments.
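As a brief, hedged illustration of the Delta Lake capabilities referenced in the responsibilities above (schema evolution and time travel), here is a PySpark sketch; the path and columns are hypothetical, and a Delta-enabled environment such as Databricks is assumed.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta_features_example").getOrCreate()
path = "s3://example-bucket/delta/customers/"  # hypothetical Delta table location

# Initial write creates the Delta table (an ACID transaction under the hood).
spark.createDataFrame(
    [(1, "Asha"), (2, "Ravi")], ["customer_id", "name"]
).write.format("delta").mode("overwrite").save(path)

# Schema evolution: append a batch with an extra column, letting Delta merge the schema.
spark.createDataFrame(
    [(3, "Meera", "Bengaluru")], ["customer_id", "name", "city"]
).write.format("delta").mode("append").option("mergeSchema", "true").save(path)

# Time travel: read the table as of an earlier version to inspect or recover data.
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)
v0.show()
```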


Required Skills & Experience

- 4–8 years in Data Engineering / Big Data development.

- Strong hands-on experience with Databricks (clusters, jobs, workflows).

- Advanced PySpark and strong Python skills.

- Expert-level SQL (complex queries, window functions).

- Practical experience with AWS (preferred) or Azure cloud services.

- Experience with Delta Lake, Parquet, and data lake architectures.

- Familiarity with CI/CD tools (GitHub Actions, Azure DevOps, Jenkins).

- Good understanding of data modeling, optimization, and distributed systems.

Tecblic Private Limited
Ahmedabad
5 - 6 yrs
₹5L - ₹15L / yr
Windows Azure
Python
SQL
Data Warehouse (DWH)
Data modeling
+5 more

Job Description: Data Engineer

Location: Ahmedabad

Experience: 5 to 6 years

Employment Type: Full-Time



We are looking for a highly motivated and experienced Data Engineer to join our  team. As a Data Engineer, you will play a critical role in designing, building, and optimizing data pipelines that ensure the availability, reliability, and performance of our data infrastructure. You will collaborate closely with data scientists, analysts, and cross-functional teams to provide timely and efficient data solutions.



Responsibilities


● Design and optimize data pipelines for various data sources


● Design and implement efficient data storage and retrieval mechanisms


● Develop data modelling solutions and data validation mechanisms


● Troubleshoot data-related issues and recommend process improvements


● Collaborate with data scientists and stakeholders to provide data-driven insights and solutions


● Coach and mentor junior data engineers in the team




Skills Required: 


● Minimum 4 years of experience in data engineering or related field


● Proficient in designing and optimizing data pipelines and data modeling


● Strong programming expertise in Python


● Hands-on experience with big data technologies such as Hadoop, Spark, and Hive


● Extensive experience with cloud data services such as AWS, Azure, and GCP


● Advanced knowledge of database technologies like SQL, NoSQL, and data warehousing


● Knowledge of distributed computing and storage systems


● Familiarity with DevOps practices, Power Automate, and Microsoft Fabric will be an added advantage


● Strong analytical and problem-solving skills with outstanding communication and collaboration abilities




Qualifications


  • Bachelor's degree in Computer Science, Data Science, or a computer-related field


lulu international

Agency job
via Episeio Business Solutions by Praveen Saulam
Bengaluru (Bangalore)
2.5 - 3 yrs
₹7L - ₹9L / yr
SQL
PySpark
Databricks
Hypothesis testing
ANOVA gauge R&R

Role Overview

As a Lead Data Scientist / Data Analyst, you’ll combine analytical thinking, business acumen, and technical expertise to design and deliver impactful data-driven solutions. You’ll lead analytical problem-solving for retail clients — from data exploration and visualisation to predictive modelling and actionable business insights.

 

Key Responsibilities

  • Partner with business stakeholders to understand problems and translate them into analytical solutions.
  • Lead end-to-end analytics projects — from hypothesis framing and data wrangling to insight delivery and model implementation.
  • Drive exploratory data analysis (EDA), identify patterns/trends, and derive meaningful business stories from data.
  • Design and implement statistical and machine learning models (e.g., segmentation, propensity, CLTV, price/promo optimisation).
  • Build and automate dashboards, KPI frameworks, and reports for ongoing business monitoring.
  • Collaborate with data engineering and product teams to deploy solutions in production environments.
  • Present complex analyses in a clear, business-oriented way, influencing decision-making across retail categories.
  • Promote an agile, experiment-driven approach to analytics delivery.

 

Common Use Cases You’ll Work On

  • Customer segmentation (RFM, mission-based, behavioural)
  • Price and promo effectiveness
  • Assortment and space optimisation
  • CLTV and churn prediction
  • Store performance analytics and benchmarking
  • Campaign measurement and targeting
  • Category in-depth reviews and presentation to L1 leadership team

 

Required Skills and Experience

  • 3+ years of experience in data science, analytics, or consulting (preferably in the retail domain)
  • Proven ability to connect business questions to analytical solutions and communicate insights effectively
  • Strong SQL skills for data manipulation and querying large datasets
  • Advanced Python for statistical analysis, machine learning, and data processing
  • Intermediate PySpark / Databricks skills for working with big data
  • Comfortable with data visualisation tools (Power BI, Tableau, or similar)
  • Knowledge of statistical techniques (hypothesis testing, ANOVA, regression, A/B testing, etc.); see the example A/B test sketch after this list
  • Familiarity with agile project management tools (JIRA, Trello, etc.)
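As a small, hedged illustration of the A/B testing knowledge mentioned above, here is a two-sample t-test in Python using SciPy; the sales figures are fabricated purely for the example.

```python
import numpy as np
from scipy import stats

# Hypothetical per-store daily sales for a promo test group vs. a control group.
rng = np.random.default_rng(42)
control = rng.normal(loc=100.0, scale=12.0, size=200)    # baseline stores
treatment = rng.normal(loc=104.0, scale=12.0, size=200)  # stores running the promo

# Welch's two-sample t-test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)

print(f"mean uplift: {treatment.mean() - control.mean():.2f}")
print(f"t-statistic: {t_stat:.3f}, p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Difference is statistically significant at the 5% level.")
else:
    print("No statistically significant difference detected.")
```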

 

Good to Have

  • Experience designing data pipelines or analytical workflows in cloud environments (Azure preferred)
  • Strong understanding of retail KPIs (sales, margin, penetration, conversion, ATV, UPT, etc.)
  • Prior exposure to Promotion or Pricing analytics 
  • Dashboard development or reporting automation expertise


Deqode

Posted by Shraddha Katare
Pune, Gurugram, Jaipur, Bhopal
5 - 8 yrs
₹10L - ₹18L / yr
Data engineering
Databricks
Data Structures
Python
PySpark

Job Description -

Position: Senior Data Engineer (Azure)

Experience - 6+ Years

Mode - Hybrid

Location - Gurgaon, Pune, Jaipur, Bangalore, Bhopal


Key Responsibilities:

  • Data Processing on Azure: Azure Data Factory, Streaming Analytics, Event Hubs, Azure Databricks, Data Migration Service, Data Pipeline.
  • Provisioning, configuring, and developing Azure solutions (ADB, ADF, ADW, etc.).
  • Design and implement scalable data models and migration strategies.
  • Work on distributed big data batch or streaming pipelines (Kafka or similar).
  • Develop data integration and transformation solutions for structured and unstructured data.
  • Collaborate with cross-functional teams for performance tuning and optimization.
  • Monitor data workflows and ensure compliance with data governance and quality standards.
  • Contribute to continuous improvement through automation and DevOps practices.

Required Skills & Experience:

  • 6–10 years of experience as a Data Engineer.
  • Strong proficiency in Azure Databricks, PySpark, Python, SQL, and Azure Data Factory.
  • Experience in Data Modelling, Data Migration, and Data Warehousing.
  • Good understanding of database structure principles and schema design.
  • Hands-on experience using MS SQL Server, Oracle, or similar RDBMS platforms.
  • Experience in DevOps tools (Azure DevOps, Jenkins, Airflow, Azure Monitor) – good to have.
  • Knowledge of distributed data processing and real-time streaming (Kafka/Event Hub).
  • Familiarity with visualization tools like Power BI or Tableau.
  • Strong analytical, problem-solving, and debugging skills.
  • Self-motivated, detail-oriented, and capable of managing priorities effectively.


Data Havn

Agency job
via Infinium Associate by Toshi Srivastava
Delhi, Gurugram, Noida, Ghaziabad, Faridabad
2.5 - 4.5 yrs
₹10L - ₹20L / yr
Python
SQL
Google Cloud Platform (GCP)
SQL server
ETL
+9 more

About the Role:


We are seeking a talented Data Engineer to join our team and play a pivotal role in transforming raw data into valuable insights. As a Data Engineer, you will design, develop, and maintain robust data pipelines and infrastructure to support our organization's analytics and decision-making processes.

Responsibilities:

  • Data Pipeline Development: Build and maintain scalable data pipelines to extract, transform, and load (ETL) data from various sources (e.g., databases, APIs, files) into data warehouses or data lakes.
  • Data Infrastructure: Design, implement, and manage data infrastructure components, including data warehouses, data lakes, and data marts.
  • Data Quality: Ensure data quality by implementing data validation, cleansing, and standardization processes.
  • Team Management: Able to lead and manage a team.
  • Performance Optimization: Optimize data pipelines and infrastructure for performance and efficiency.
  • Collaboration: Collaborate with data analysts, scientists, and business stakeholders to understand their data needs and translate them into technical requirements.
  • Tool and Technology Selection: Evaluate and select appropriate data engineering tools and technologies (e.g., SQL, Python, Spark, Hadoop, cloud platforms).
  • Documentation: Create and maintain clear and comprehensive documentation for data pipelines, infrastructure, and processes.

 

 Skills:

  • Strong proficiency in SQL and at least one programming language (e.g., Python, Java).
  • Experience with data warehousing and data lake technologies (e.g., Snowflake, AWS Redshift, Databricks).
  • Knowledge of cloud platforms (e.g., AWS, GCP, Azure) and cloud-based data services.
  • Understanding of data modeling and data architecture concepts.
  • Experience with ETL/ELT tools and frameworks.
  • Excellent problem-solving and analytical skills.
  • Ability to work independently and as part of a team.

Preferred Qualifications:

  • Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink).
  • Knowledge of machine learning and artificial intelligence concepts.
  • Experience with data visualization tools (e.g., Tableau, Power BI).
  • Certification in cloud platforms or data engineering.
Corridor Platforms

Posted by Aniket Agrawal
Bengaluru (Bangalore)
4 - 8 yrs
₹30L - ₹50L / yr
Python
PySpark
Apache Spark
NumPy
pandas
+8 more

About Corridor Platforms

Corridor Platforms is a leader in next-generation risk decisioning and responsible AI governance, empowering banks and lenders to build transparent, compliant, and data-driven solutions. Our platforms combine advanced analytics, real-time data integration, and GenAI to support complex financial decision workflows for regulated industries.

Role Overview

As a Backend Engineer at Corridor Platforms, you will:

  • Architect, develop, and maintain backend components for our Risk Decisioning Platform.
  • Build and orchestrate scalable backend services that automate, optimize, and monitor high-value credit and risk decisions in real time.
  • Integrate with ORM layers – such as SQLAlchemy – and multi-RDBMS solutions (Postgres, MySQL, Oracle, MSSQL, etc.) to ensure data integrity, scalability, and compliance.
  • Collaborate closely with Product Team, Data Scientists, QA Teams to create extensible APIs, workflow automation, and AI governance features.
  • Architect workflows for privacy, auditability, versioned traceability, and role-based access control, ensuring adherence to regulatory frameworks.
  • Take ownership from requirements to deployment, seeing your code deliver real impact in the lives of customers and end users.

Technical Skills

  • Languages: Python 3.9+, SQL, JavaScript/TypeScript, Angular
  • Frameworks: Flask, SQLAlchemy, Celery, Marshmallow, Apache Spark
  • Databases: PostgreSQL, Oracle, SQL Server, Redis
  • Tools: pytest, Docker, Git, Nx
  • Cloud: Experience with AWS, Azure, or GCP preferred
  • Monitoring: Familiarity with OpenTelemetry and logging frameworks


Why Join Us?

  • Cutting-Edge Tech: Work hands-on with the latest AI, cloud-native workflows, and big data tools—all within a single compliant platform.
  • End-to-End Impact: Contribute to mission-critical backend systems, from core data models to live production decision services.
  • Innovation at Scale: Engineer solutions that process vast data volumes, helping financial institutions innovate safely and effectively.
  • Mission-Driven: Join a passionate team advancing fair, transparent, and compliant risk decisioning at the forefront of fintech and AI governance.

What We’re Looking For

  • Proficiency in Python, SQLAlchemy (or similar ORM), and SQL databases.
  • Experience developing and maintaining scalable backend services, including API, data orchestration, ML workflows,  and workflow automation.
  • Solid understanding of data modeling, distributed systems, and backend architecture for regulated environments.
  • Curiosity and drive to work at the intersection of AI/ML, fintech, and regulatory technology.
  • Experience mentoring and guiding junior developers.


Ready to build backends that shape the future of decision intelligence and responsible AI?

Apply now and become part of the innovation at Corridor Platforms!



Intineri infosol Pvt Ltd

Posted by Shivani Pandey
Remote only
6 - 8 yrs
₹5L - ₹10L / yr
PySpark
Spark
Snowflake
Data Transformation Tool (DBT)
Airflow
+2 more

About the Role:

 

We are seeking an experienced Data Engineer to lead and execute the migration of existing Databricks-based pipelines to Snowflake. The role requires strong expertise in PySpark/Spark, Snowflake, DBT, and Airflow, with additional exposure to DevOps and CI/CD practices. The candidate will be responsible for re-architecting data pipelines, ensuring data consistency, scalability, and performance in Snowflake, and enabling robust automation and monitoring across environments.


Key Responsibilities

Databricks to Snowflake Migration

·       Analyze and understand existing pipelines and frameworks in Databricks (PySpark/Spark).

·       Re-architect pipelines for execution in Snowflake using efficient SQL-based processing.

·       Translate Databricks notebooks/jobs into Snowflake/DBT equivalents.

·       Ensure a smooth transition with data consistency, performance, and scalability.

 

Snowflake

·       Hands-on experience with storage integrations, staging (internal/external), Snowpipe, tables/views, COPY INTO, CREATE OR ALTER, and file formats (see the example load sketch after this subsection).

·       Implement RBAC (role-based access control), data governance, and performance tuning.

·       Design and optimize SQL queries for large-scale data processing.
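As a rough, hedged sketch of the staging and COPY INTO workflow referenced above, here is a Python snippet using the snowflake-connector-python library; the account, warehouse, database, stage, and file names are placeholders.

```python
import snowflake.connector

# Connect to Snowflake (all identifiers below are hypothetical placeholders).
conn = snowflake.connector.connect(
    account="xy12345.ap-south-1",
    user="ETL_USER",
    password="***",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # Upload a local file to an internal stage.
    cur.execute("PUT file:///tmp/orders.csv @orders_stage AUTO_COMPRESS=TRUE")

    # Bulk-load the staged file into a target table using COPY INTO.
    cur.execute(
        """
        COPY INTO STAGING.ORDERS
        FROM @orders_stage/orders.csv.gz
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"')
        ON_ERROR = 'ABORT_STATEMENT'
        """
    )
    print(cur.fetchall())  # COPY INTO returns per-file load results
finally:
    conn.close()
```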

 

DBT (with Snowflake)

·       Implement and manage models, macros, materializations, and SQL execution within DBT.

·       Use DBT for modular development, version control, and multi-environment deployments.

 

Airflow (Orchestration)

·       Design and manage DAGs to automate workflows and ensure reliability (see the example DAG sketch after this subsection).

·       Handle task dependencies, error recovery, monitoring, and integrations (Cosmos, Astronomer, Docker).
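For orientation, here is a minimal, hedged Airflow DAG sketch matching the orchestration duties above, assuming a recent Airflow 2.x deployment; the DAG id, schedule, and task callables are illustrative assumptions.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder: pull data from the source system.
    print("extracting source data")

def load_to_snowflake():
    # Placeholder: stage and COPY INTO Snowflake (see the sketch above).
    print("loading into Snowflake")

with DAG(
    dag_id="databricks_to_snowflake_example",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 2},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)

    # Task dependency: load runs only after extract succeeds.
    extract_task >> load_task
```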

 

DevOps & CI/CD

·       Develop and manage CI/CD pipelines for Snowflake and DBT using GitHub Actions, Azure DevOps, or equivalent.

·       Manage version-controlled environments and ensure smooth promotion of changes across dev, test, and prod.

 

Monitoring & Observability

·       Implement monitoring, alerting, and logging for data pipelines.

·       Build self-healing or alert-driven mechanisms for critical/severe issue detection.

·       Ensure system reliability and proactive issue resolution.



Required Skills & Qualifications

·       5+ years of experience in data engineering with a focus on cloud data platforms.

·       Strong expertise in:

·       Databricks (PySpark/Spark) – analysis, transformations, dependencies.

·       Snowflake – architecture, SQL, performance tuning, security (RBAC).

·       DBT – modular model development, macros, deployments.

·       Airflow – DAG design, orchestration, and error handling.

·       Experience in CI/CD pipeline development (GitHub Actions, Azure DevOps).

·       Solid understanding of data modeling, ETL/ELT processes, and best practices.

·       Excellent problem-solving, communication, and stakeholder collaboration skills.

 

Good to Have

·       Exposure to Docker/Kubernetes for orchestration.

·       Knowledge of Azure Data Services (ADF, ADLS) or similar cloud tools.

·       Experience with data governance, lineage, and metadata management.

 

Education

·       Bachelor’s / Master’s degree in Computer Science, Engineering, or related field.

One of the reputed clients in India

Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Noida, Hyderabad, Pune
6 - 8 yrs
₹12L - ₹13L / yr
Amazon Web Services (AWS)
Python
PySpark

Our client is looking to hire a Databricks Admin immediately.


This is PAN-India bulk hiring.


Minimum of 6-8+ years with Databricks, PySpark/Python, and AWS.

AWS is a must-have.


Notice period of 15-30 days is preferred.


Share profiles at hr at etpspl dot com

Please refer/share our email to your friends/colleagues who are looking for job.

Wissen Technology

Posted by Nishita Bangera
Bengaluru (Bangalore)
4 - 8 yrs
Best in industry
Python
SQL
PySpark
Django

Key Responsibilities

  • Develop and maintain Python-based applications.
  • Design and optimize SQL queries and databases.
  • Collaborate with cross-functional teams to define, design, and ship new features.
  • Write clean, maintainable, and efficient code.
  • Troubleshoot and debug applications.
  • Participate in code reviews and contribute to team knowledge sharing.

Qualifications and Required Skills

  • Strong proficiency in Python programming.
  • Experience with SQL and database management.
  • Experience with web frameworks such as Django or Flask.
  • Knowledge of front-end technologies like HTML, CSS, and JavaScript.
  • Familiarity with version control systems like Git.
  • Strong problem-solving skills and attention to detail.
  • Excellent communication and teamwork skills.

Good to Have Skills

  • Experience with cloud platforms like AWS or Azure.
  • Knowledge of containerization technologies like Docker.
  • Familiarity with continuous integration and continuous deployment (CI/CD) pipelines


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Gagandeep Kaur
Posted by Gagandeep Kaur
Bengaluru (Bangalore), Mumbai, Pune
4 - 7 yrs
Best in industry
skill iconPython
PySpark
pandas
Airflow
Data engineering

Wissen Technology is hiring for Data Engineer

About Wissen Technology: At Wissen Technology, we deliver niche, custom-built products that solve complex business challenges across industries worldwide. Founded in 2015, our core philosophy is built around a strong product engineering mindset—ensuring every solution is architected and delivered right the first time. Today, Wissen Technology has a global footprint with 2000+ employees across offices in the US, UK, UAE, India, and Australia. Our commitment to excellence translates into delivering 2X impact compared to traditional service providers. How do we achieve this? Through a combination of deep domain knowledge, cutting-edge technology expertise, and a relentless focus on quality. We don’t just meet expectations—we exceed them by ensuring faster time-to-market, reduced rework, and greater alignment with client objectives. We have a proven track record of building mission-critical systems across industries, including financial services, healthcare, retail, manufacturing, and more. Wissen stands apart through its unique delivery models. Our outcome-based projects ensure predictable costs and timelines, while our agile pods provide clients the flexibility to adapt to their evolving business needs. Wissen leverages its thought leadership and technology prowess to drive superior business outcomes. Our success is powered by top-tier talent. Our mission is clear: to be the partner of choice for building world-class custom products that deliver exceptional impact—the first time, every time.

Job Summary: Wissen Technology is hiring a Data Engineer with expertise in Python, Pandas, Airflow, and Azure Cloud Services. The ideal candidate will have strong communication skills and experience with Kubernetes.

Experience: 4-7 years

Notice Period: Immediate- 15 days

Location: Pune, Mumbai, Bangalore

Mode of Work: Hybrid

Key Responsibilities:

  • Develop and maintain data pipelines using Python and Pandas.
  • Implement and manage workflows using Airflow.
  • Utilize Azure Cloud Services for data storage and processing.
  • Collaborate with cross-functional teams to understand data requirements and deliver solutions.
  • Ensure data quality and integrity throughout the data lifecycle.
  • Optimize and scale data infrastructure to meet business needs.
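
As an illustration of the pandas-based pipeline work above, a minimal sketch of a load, clean, aggregate, and persist step; file paths and column names are hypothetical placeholders, and writing Parquet assumes pyarrow or fastparquet is installed.

```python
# Minimal sketch: read raw orders, apply basic cleaning, build a daily summary, persist it.
import pandas as pd


def build_daily_summary(input_path: str, output_path: str) -> pd.DataFrame:
    df = pd.read_csv(input_path, parse_dates=["order_date"])

    # Basic data-quality handling: drop exact duplicates, fill missing amounts.
    df = df.drop_duplicates()
    df["amount"] = df["amount"].fillna(0.0)

    # Aggregate to a daily summary suitable for downstream reporting.
    summary = (
        df.groupby(df["order_date"].dt.date)["amount"]
        .agg(["count", "sum"])
        .rename(columns={"count": "orders", "sum": "revenue"})
        .reset_index()
    )

    summary.to_parquet(output_path, index=False)  # assumes pyarrow/fastparquet is available
    return summary


if __name__ == "__main__":
    build_daily_summary("orders.csv", "daily_summary.parquet")   # placeholder paths
```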

Qualifications and Required Skills:

  • Proficiency in Python (Must Have).
  • Strong experience with Pandas (Must Have).
  • Expertise in Airflow (Must Have).
  • Experience with Azure Cloud Services.
  • Good communication skills.

Good to Have Skills:

  • Experience with Pyspark.
  • Knowledge of Kubernetes.

Read more
Wissen Technology

at Wissen Technology

4 recruiters
Bipasha Rath
Posted by Bipasha Rath
Mumbai, Bengaluru (Bangalore), Pune
3 - 7 yrs
Best in industry
skill iconPython
pandas
PySpark

Experience: 3–7 Years

Locations: Pune / Bangalore / Mumbai

Notice Period: Immediate joiners only


Employment Type: Full-time

🛠️ Key Skills (Mandatory):

  • Python: Strong coding skills for data manipulation and automation.
  • PySpark: Experience with distributed data processing using Spark.
  • SQL: Proficient in writing complex queries for data extraction and transformation.
  • Azure Databricks: Hands-on experience with notebooks, Delta Lake, and MLflow


Interested candidates please share resume with details below.


Total Experience -

Relevant Experience in Python, PySpark, SQL, Azure Databricks -

Current CTC -

Expected CTC -

Notice period -

Current Location -

Desired Location -


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Janane Mohanasankaran
Posted by Janane Mohanasankaran
Bengaluru (Bangalore), Pune, Mumbai
7 - 12 yrs
Best in industry
skill iconPython
pandas
PySpark
SQL
Data engineering

Wissen Technology is hiring for Data Engineer

About Wissen Technology:At Wissen Technology, we deliver niche, custom-built products that solve complex business challenges across industries worldwide. Founded in 2015, our core philosophy is built around a strong product engineering mindset—ensuring every solution is architected and delivered right the first time. Today, Wissen Technology has a global footprint with 2000+ employees across offices in the US, UK, UAE, India, and Australia. Our commitment to excellence translates into delivering 2X impact compared to traditional service providers. How do we achieve this? Through a combination of deep domain knowledge, cutting-edge technology expertise, and a relentless focus on quality. We don’t just meet expectations—we exceed them by ensuring faster time-to-market, reduced rework, and greater alignment with client objectives. We have a proven track record of building mission-critical systems across industries, including financial services, healthcare, retail, manufacturing, and more. Wissen stands apart through its unique delivery models. Our outcome-based projects ensure predictable costs and timelines, while our agile pods provide clients the flexibility to adapt to their evolving business needs. Wissen leverages its thought leadership and technology prowess to drive superior business outcomes. Our success is powered by top-tier talent. Our mission is clear: to be the partner of choice for building world-class custom products that deliver exceptional impact—the first time, every time.

Job Summary: Wissen Technology is hiring a Data Engineer with a strong background in Python, data engineering, and workflow optimization. The ideal candidate will have experience with Delta Tables and Parquet, and be proficient in Pandas and PySpark.

Experience: 7+ years

Location: Pune, Mumbai, Bangalore

Mode of Work: Hybrid

Key Responsibilities:

  • Develop and maintain data pipelines using Python (Pandas, PySpark).
  • Optimize data workflows and ensure efficient data processing.
  • Work with Delta Tables and Parquet for data storage and management.
  • Collaborate with cross-functional teams to understand data requirements and deliver solutions.
  • Ensure data quality and integrity throughout the data lifecycle.
  • Implement best practices for data engineering and workflow optimization.
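
A minimal PySpark sketch of the Parquet-based pipeline work listed above; the input/output paths and column names are hypothetical placeholders (Delta Tables would additionally need the delta-spark package configured).

```python
# Minimal sketch: read Parquet, deduplicate, derive a date column, write partitioned Parquet.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_pipeline").getOrCreate()

orders = spark.read.parquet("/data/raw/orders")          # placeholder path

cleaned = (
    orders.dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("amount") > 0)
)

# Partitioning by date keeps downstream reads selective and cheap.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet("/data/curated/orders")

spark.stop()
```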

Qualifications and Required Skills:

  • Proficiency in Python, specifically with Pandas and PySpark.
  • Strong experience in data engineering and workflow optimization.
  • Knowledge of Delta Tables and Parquet.
  • Excellent problem-solving skills and attention to detail.
  • Ability to work collaboratively in a team environment.
  • Strong communication skills.

Good to Have Skills:

  • Experience with Databricks.
  • Knowledge of Apache Spark, DBT, and Airflow.
  • Advanced Pandas optimizations.
  • Familiarity with PyTest/DBT testing frameworks.

Wissen Sites:

 

Wissen | Driving Digital Transformation

A technology consultancy that drives digital innovation by connecting strategy and execution, helping global clients to strengthen their core technology.

 

Read more
Tata Consultancy Services
Chennai, Hyderabad, Kolkata, Delhi, Pune, Bengaluru (Bangalore)
4 - 10 yrs
₹6L - ₹30L / yr
Scala
PySpark
Spark
skill iconAmazon Web Services (AWS)

Job Title: PySpark/Scala Developer

 

Functional Skills: Experience in Credit Risk/Regulatory risk domain

Technical Skills: Spark, PySpark, Python, Hive, Scala, MapReduce, Unix shell scripting

Good to Have Skills: Exposure to Machine Learning Techniques

Job Description:

5+ years of experience developing, fine-tuning, and implementing programs/applications using Python/PySpark/Scala on Big Data/Hadoop platforms.

Roles and Responsibilities:

a)     Work with a leading bank’s Risk Management team on specific projects/requirements pertaining to risk models in consumer and wholesale banking

b)     Enhance machine learning models using PySpark or Scala

c)     Work with data scientists to build ML models based on business requirements and follow the ML lifecycle to deploy them all the way to the production environment (a minimal sketch follows below)

d)     Participate in feature engineering, model training, scoring, and retraining

e)     Architect data pipelines and automate data ingestion and model jobs
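
The sketch referenced in item (c): a minimal Spark ML example of assembling features and fitting/scoring a classification model. The feature table, feature columns, and label are hypothetical placeholders, not the bank's actual risk models.

```python
# Minimal Spark ML sketch of a train/score loop.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("credit_risk_model").getOrCreate()

df = spark.read.parquet("/data/features/loans")            # placeholder feature table
train, test = df.randomSplit([0.8, 0.2], seed=42)

assembler = VectorAssembler(
    inputCols=["utilization", "delinquencies", "income"],  # placeholder features
    outputCol="features",
)
lr = LogisticRegression(featuresCol="features", labelCol="default_flag")

model = Pipeline(stages=[assembler, lr]).fit(train)

# Scoring: the fitted pipeline applies the same feature engineering before predicting.
scored = model.transform(test)
scored.select("default_flag", "probability", "prediction").show(5)

spark.stop()
```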

 

Skills and competencies:

Required:

·       Strong analytical skills in conducting sophisticated statistical analysis using bureau/vendor data, customer performance data, and macro-economic data to solve business problems.

·       Working experience in PySpark and Scala to develop code that validates and implements models in Credit Risk/Banking.

·       Experience with distributed systems such as Hadoop/MapReduce, Spark, streaming data processing, and cloud architecture.

  • Familiarity with machine learning frameworks and libraries (such as scikit-learn, SparkML, TensorFlow, PyTorch, etc.)
  • Experience in systems integration, web services, and batch processing
  • Experience in migrating code to PySpark/Scala is a big plus
  • The ability to act as a liaison, conveying the information needs of the business to IT and data constraints to the business, with equal command of business strategy and IT strategy, business processes, and workflow

·       Flexibility in approach and thought process

·       Attitude to learn and comprehend periodic changes in regulatory requirements as per the FED

 

 

Read more
Tata Consultancy Services
Bengaluru (Bangalore), Hyderabad, Pune, Delhi, Kolkata, Chennai
5 - 8 yrs
₹7L - ₹30L / yr
skill iconScala
skill iconPython
PySpark
Apache Hive
Spark
+3 more

Skills and competencies:

Required:

·        Strong analytical skills in conducting sophisticated statistical analysis using bureau/vendor data, customer performance data, and macro-economic data to solve business problems.

·        Working experience in PySpark and Scala to develop code that validates and implements models in Credit Risk/Banking.

·        Experience with distributed systems such as Hadoop/MapReduce, Spark, streaming data processing, and cloud architecture.

  • Familiarity with machine learning frameworks and libraries (such as scikit-learn, SparkML, TensorFlow, PyTorch, etc.)
  • Experience in systems integration, web services, and batch processing
  • Experience in migrating code to PySpark/Scala is a big plus
  • The ability to act as a liaison, conveying the information needs of the business to IT and data constraints to the business, with equal command of business strategy and IT strategy, business processes, and workflow

·        Flexibility in approach and thought process

·        Attitude to learn and comprehend periodic changes in regulatory requirements as per the FED

Read more
Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Pune, Bengaluru (Bangalore)
5 - 8 yrs
₹5L - ₹13L / yr
skill iconAmazon Web Services (AWS)
databricks
PySpark
SQL

Profile: AWS Data Engineer

Mandatory skills: AWS + Databricks + PySpark + SQL

Location: Bangalore/Pune/Hyderabad/Chennai/Gurgaon

Notice Period: Immediate

Key Requirements :

  • Design, build, and maintain scalable data pipelines to collect, process, and store data from multiple datasets.
  • Optimize data storage solutions for better performance, scalability, and cost-efficiency.
  • Develop and manage ETL/ELT processes to transform data as per schema definitions, apply slicing and dicing, and make it available for downstream jobs and other teams.
  • Collaborate closely with cross-functional teams to understand system and product functionalities, pace up feature development, and capture evolving data requirements.
  • Engage with stakeholders to gather requirements and create curated datasets for downstream consumption and end-user reporting.
  • Automate deployment and CI/CD processes using GitHub workflows, identifying areas to reduce manual, repetitive work.
  • Ensure compliance with data governance policies, privacy regulations, and security protocols.
  • Utilize cloud platforms like AWS and work on Databricks for data processing with S3 Storage.
  • Work with distributed systems and big data technologies such as Spark, SQL, and Delta Lake.
  • Integrate with SFTP to push data securely from Databricks to remote locations.
  • Analyze and interpret spark query execution plans to fine-tune queries for faster and more efficient processing.
  • Strong problem-solving and troubleshooting skills in large-scale distributed systems.
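
A minimal sketch of the execution-plan work mentioned above: inspecting a join plan and nudging Spark toward a broadcast join. The S3 paths and column names are hypothetical placeholders.

```python
# Minimal sketch: inspect and tune a Spark join with a broadcast hint.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("plan_tuning").getOrCreate()

facts = spark.read.parquet("s3://bucket/curated/transactions")   # large table (placeholder)
dims = spark.read.parquet("s3://bucket/curated/merchants")       # small table (placeholder)

# Hint that the small dimension table should be broadcast to every executor,
# avoiding an expensive shuffle of the large fact table.
joined = facts.join(F.broadcast(dims), on="merchant_id", how="left")

# Inspect the physical plan to confirm a BroadcastHashJoin is used.
joined.explain(mode="formatted")
```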


Read more
Remote, Bengaluru (Bangalore), Pune, Chennai, Nagpur
5 - 15 yrs
₹20L - ₹30L / yr
databricks
PySpark
Apache Spark
CI/CD
Data engineering


Technical Architect (Databricks)

  • 10+ Years Data Engineering Experience with expertise in Databricks
  • 3+ years of consulting experience
  • Completed Data Engineering Professional certification & required classes
  • Minimum 2-3 projects delivered with hands-on experience in Databricks
  • Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
  • Experience in Spark and/or Hadoop, Flink, Presto, other popular big data engines
  • Familiarity with Databricks multi-hop pipeline architecture

 

 

Sr. Data Engineer (Databricks)

 

  • 5+ Years Data Engineering Experience with expertise in Databricks
  • Completed Data Engineering Associate certification & required classes
  • Minimum 1 project delivered with hands-on experience in development on Databricks
  • Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
  • SQL delivery experience, and familiarity with Bigquery, Synapse or Redshift
  • Proficient in Python, knowledge of additional databricks programming languages (Scala)


Read more
Aceis Services

at Aceis Services

2 candid answers
Anushi Mishra
Posted by Anushi Mishra
Remote only
2 - 10 yrs
₹8.6L - ₹30.2L / yr
CI/CD
Apache Spark
PySpark
MLOps
skill iconMachine Learning (ML)
+6 more

We are hiring freelancers to work on advanced Data & AI projects using Databricks. If you are passionate about cloud platforms, machine learning, data engineering, or architecture, and want to work with cutting-edge tools on real-world challenges, this is the opportunity for you!

Key Details

  • Work Type: Freelance / Contract
  • Location: Remote
  • Time Zones: IST / EST only
  • Domain: Data & AI, Cloud, Big Data, Machine Learning
  • Collaboration: Work with industry leaders on innovative projects

🔹 Open Roles

1. Databricks – Senior Consultant

  • Skills: Data Warehousing, Python, Java, Scala, ETL, SQL, AWS, GCP, Azure
  • Experience: 6+ years

2. Databricks – ML Engineer

  • Skills: CI/CD, MLOps, Machine Learning, Spark, Hadoop
  • Experience: 4+ years

3. Databricks – Solution Architect

  • Skills: Azure, GCP, AWS, CI/CD, MLOps
  • Experience: 7+ years

4. Databricks – Solution Consultant

  • Skills: SQL, Spark, BigQuery, Python, Scala
  • Experience: 2+ years

What We Offer

  • Opportunity to work with top-tier professionals and clients
  • Exposure to cutting-edge technologies and real-world data challenges
  • Flexible remote work environment aligned with IST / EST time zones
  • Competitive compensation and growth opportunities

📌 Skills We Value

Cloud Computing | Data Warehousing | Python | Java | Scala | ETL | SQL | AWS | GCP | Azure | CI/CD | MLOps | Machine Learning | Spark |

Read more
Wissen Technology

at Wissen Technology

4 recruiters
Annie Varghese
Posted by Annie Varghese
Pune, Mumbai, Bengaluru (Bangalore)
3 - 8 yrs
Best in industry
snowflake
Apache Airflow
ETL
skill iconPython
PySpark
+1 more

Job Summary:

We are looking for a highly skilled and experienced Data Engineer with deep expertise in Airflow, dbt, Python, and Snowflake. The ideal candidate will be responsible for designing, building, and managing scalable data pipelines and transformation frameworks to enable robust data workflows across the organization.

Key Responsibilities:

  • Design and implement scalable ETL/ELT pipelines using Apache Airflow for orchestration.
  • Develop modular and maintainable data transformation models using dbt.
  • Write high-performance data processing scripts and automation using Python.
  • Build and maintain data models and pipelines on Snowflake.
  • Collaborate with data analysts, data scientists, and business teams to deliver clean, reliable, and timely data.
  • Monitor and optimize pipeline performance and troubleshoot issues proactively.
  • Follow best practices in version control, testing, and CI/CD for data projects.

Must-Have Skills:

  • Strong hands-on experience with Apache Airflow for scheduling and orchestrating data workflows.
  • Proficiency in dbt (data build tool) for building scalable and testable data models.
  • Expert-level skills in Python for data processing and automation.
  • Solid experience with Snowflake, including SQL performance tuning, data modeling, and warehouse management.
  • Strong understanding of data engineering best practices including modularity, testing, and deployment.

Good to Have:

  • Experience working with cloud platforms (AWS/GCP/Azure).
  • Familiarity with CI/CD pipelines for data (e.g., GitHub Actions, GitLab CI).
  • Exposure to modern data stack tools (e.g., Fivetran, Stitch, Looker).
  • Knowledge of data security and governance best practices.


Note : One face-to-face (F2F) round is mandatory, and as per the process, you will need to visit the office for this.

Read more
Janvi Panchal
Janvi Panchal
Posted by Janvi Panchal
Ahmedabad
4 - 6 yrs
₹10L - ₹20L / yr
skill iconPython
PySpark
Microsoft Windows Azure
skill iconAmazon Web Services (AWS)
SQL
+1 more


Job Description: 

  • 4+ years of experience in a Data Engineer role,
  • Experience with object-oriented/object function scripting languages: Python, Scala, Golang, Java, etc.
  • Experience with Big data tools such as Spark, Hadoop/ Kafka/ Airflow/Hive
  • Experience with Streaming data: Spark/Kinesis/Kafka/Pubsub/Event Hub
  • Experience with GCP/Azure data factory/AWS
  • Strong in SQL Scripting
  • Experience with ETL tools
  • Knowledge of Snowflake Data Warehouse
  • Knowledge of Orchestration frameworks: Airflow/Luigi
  • Good to have knowledge of Data Quality Management frameworks
  • Good to have knowledge of Master Data Management
  • Self-learning abilities are a must
  • Familiarity with upcoming new technologies is a strong plus.
  • Should have a bachelor's degree in big data analytics, computer engineering, or a related field


Personal Competency:

  • Strong communication skills is a MUST
  • Self-motivated, detail-oriented
  • Strong organizational skills
  • Ability to prioritize workloads and meet deadlines
Read more
Awign

at Awign

3 recruiters
Agency job
via TechSkillio by Tech Skillio
Remote only
4 - 8 yrs
₹90000 - ₹120000 / mo
databricks
Azure
PySpark
Azure Synapse Analytics

Job Description

Overview:


We are seeking an experienced Azure Data Engineer to join our team in a hybrid Developer/Support capacity. This role focuses on enhancing and supporting existing Data & Analytics solutions by leveraging Azure Data Engineering technologies. The engineer will work on developing, maintaining, and deploying IT products and solutions that serve various business users, with a strong emphasis on performance, scalability, and reliability.



Must-Have Skills:


Azure Databricks

PySpark

Azure Synapse Analytics



Key Responsibilities:



  • Incident classification and prioritization
  • Log analysis and trend identification
  • Coordination with Subject Matter Experts (SMEs)
  • Escalation of unresolved or complex issues
  • Root cause analysis and permanent resolution implementation
  • Stakeholder communication and status updates
  • Resolution of complex and major incidents
  • Code reviews (2 per individual per week) to ensure adherence to standards and optimize performance
  • Bug fixing of recurring or critical issues identified during operations
  • Gold layer tasks, including enhancements and performance tuning.
  • Design, develop, and support data pipelines and solutions using Azure data engineering services.
  • Implement data flow and ETL techniques leveraging Azure Data Factory, Databricks, and Synapse.
  • Cleanse, transform, and enrich datasets using Databricks notebooks and PySpark.


  • Orchestrate and automate workflows across services and systems.
  • Collaborate with business and technical teams to deliver robust and scalable data solutions.
  • Work in a support role to resolve incidents, handle change/service requests, and monitor performance.
  • Contribute to CI/CD pipeline implementation using Azure DevOps.



Technical Requirements:


  • 4 to 6 years of experience in IT and Azure data engineering technologies.
  • Strong experience in Azure Databricks, Azure Synapse, and ADLS Gen2.
  • Proficient in Python, PySpark, and SQL.
  • Experience with file formats such as JSON and Parquet.
  • Working knowledge of database systems, with a preference for Teradata and Snowflake.
  • Hands-on experience with Azure DevOps and CI/CD pipeline deployments.
  • Understanding of Data Warehousing concepts and data modeling best practices.
  • Familiarity with SNOW (ServiceNow) for incident and change management.



Non-Technical Requirements:


  • Ability to work independently and collaboratively in virtual teams across geographies.
  • Strong analytical and problem-solving skills.
  • Experience in Agile development practices, including estimation, testing, and deployment.
  • Effective task and time management with the ability to prioritize under pressure.
  • Clear communication and documentation skills for project updates and technical processes.



Technologies:


  • Azure Data Factory
  • Azure Databricks
  • Azure Synapse Analytics
  • PySpark / SQL
  • Azure Data Lake Storage (ADLS), Blob Storage
  • Azure DevOps (CI/CD pipelines)



Nice-to-Have:


  • Experience with Business Intelligence tools, preferably Power BI
  • DP-203 certification (Azure Data Engineer Associate) 




NOTE -

Weekly rotational shifts -

11am to 8pm

2pm to 11pm

5pm to 2 am


P.S. - They should be available on call on one weekend per month and need to work only if an issue arises; this on-call support occurs roughly once a month.



Read more
VyTCDC
Gobinath Sundaram
Posted by Gobinath Sundaram
Chennai, Bengaluru (Bangalore), Hyderabad, Mumbai, Pune, Noida
4 - 6 yrs
₹3L - ₹21L / yr
AWS Data Engineer
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
databricks
+1 more

 Key Responsibilities

  • Design and implement ETL/ELT pipelines using Databricks, PySpark, and AWS Glue
  • Develop and maintain scalable data architectures on AWS (S3, EMR, Lambda, Redshift, RDS)
  • Perform data wrangling, cleansing, and transformation using Python and SQL
  • Collaborate with data scientists to integrate Generative AI models into analytics workflows
  • Build dashboards and reports to visualize insights using tools like Power BI or Tableau
  • Ensure data quality, governance, and security across all data assets
  • Optimize performance of data pipelines and troubleshoot bottlenecks
  • Work closely with stakeholders to understand data requirements and deliver actionable insights

🧪 Required Skills

  • Cloud Platforms: AWS (S3, Lambda, Glue, EMR, Redshift)
  • Big Data: Databricks, Apache Spark, PySpark
  • Programming: Python, SQL
  • Data Engineering: ETL/ELT, Data Lakes, Data Warehousing
  • Analytics: Data Modeling, Visualization, BI Reporting
  • Gen AI Integration: OpenAI, Hugging Face, LangChain (preferred)
  • DevOps (Bonus): Git, Jenkins, Terraform, Docker

📚 Qualifications

  • Bachelor's or Master’s degree in Computer Science, Data Science, or related field
  • 3+ years of experience in data engineering or data analytics
  • Hands-on experience with Databricks, PySpark, and AWS
  • Familiarity with Generative AI tools and frameworks is a strong plus
  • Strong problem-solving and communication skills

🌟 Preferred Traits

  • Analytical mindset with attention to detail
  • Passion for data and emerging technologies
  • Ability to work independently and in cross-functional teams
  • Eagerness to learn and adapt in a fast-paced environment


Read more
Risosu Consulting LLP
Vandana Saxena
Posted by Vandana Saxena
Bengaluru (Bangalore)
5 - 7 yrs
₹12L - ₹18L / yr
skill iconPython
PySpark
SQL

Job Title: Python Developer

Location: Bangalore

Experience: 5–7 Years

Employment Type: Full-Time


Job Description:


We are seeking an experienced Python Developer with strong proficiency in data analysis tools and PySpark, along with a solid understanding of SQL syntax. The ideal candidate will work on large-scale data processing and analysis tasks within a fast-paced environment.


Key Requirements:


Python: Hands-on experience with Python, specifically in data analysis using libraries such as pandas, numpy, etc.


PySpark: Proficiency in writing efficient PySpark code for distributed data processing.


SQL: Strong knowledge of SQL syntax and experience in writing optimized queries.


Ability to work independently and collaborate effectively with cross-functional teams.

Read more
Tekit Software solution Pvt Ltd
himanshi Tripathi
Posted by himanshi Tripathi
Hyderabad, Bengaluru (Bangalore)
8 - 10 yrs
₹15L - ₹27L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
SQL

🔍 Job Description:

We are looking for an experienced and highly skilled Technical Lead to guide the development and enhancement of a large-scale Data Observability solution built on AWS. This platform is pivotal in delivering monitoring, reporting, and actionable insights across the client's data landscape.

The Technical Lead will drive end-to-end feature delivery, mentor junior engineers, and uphold engineering best practices. The position reports to the Programme Technical Lead / Architect and involves close collaboration to align on platform vision, technical priorities, and success KPIs.

🎯 Key Responsibilities:

  • Lead the design, development, and delivery of features for the data observability solution.
  • Mentor and guide junior engineers, promoting technical growth and engineering excellence.
  • Collaborate with the architect to align on platform roadmap, vision, and success metrics.
  • Ensure high quality, scalability, and performance in data engineering solutions.
  • Contribute to code reviews, architecture discussions, and operational readiness.


🔧 Primary Must-Have Skills (Non-Negotiable):

  • 5+ years in Data Engineering or Software Engineering roles.
  • 3+ years in a technical team or squad leadership capacity.
  • Deep expertise in AWS Data Services: Glue, EMR, Kinesis, Lambda, Athena, S3.
  • Advanced programming experience with PySpark, Python, and SQL.
  • Proven experience in building scalable, production-grade data pipelines on cloud platforms.


Read more
NeoGenCode Technologies Pvt Ltd
Bengaluru (Bangalore)
8 - 12 yrs
₹15L - ₹22L / yr
Data engineering
Google Cloud Platform (GCP)
Data Transformation Tool (DBT)
Google Dataform
BigQuery
+6 more

Job Title : Data Engineer – GCP + Spark + DBT

Location : Bengaluru (On-site at Client Location | 3 Days WFO)

Experience : 8 to 12 Years

Level : Associate Architect

Type : Full-time


Job Overview :

We are looking for a seasoned Data Engineer to join the Data Platform Engineering team supporting a Unified Data Platform (UDP). This role requires hands-on expertise in DBT, GCP, BigQuery, and PySpark, with a solid foundation in CI/CD, data pipeline optimization, and agile delivery.


Mandatory Skills : GCP, DBT, Google Dataform, BigQuery, PySpark/Spark SQL, Advanced SQL, CI/CD, Git, Agile Methodologies.


Key Responsibilities :

  • Design, build, and optimize scalable data pipelines using BigQuery, DBT, and PySpark.
  • Leverage GCP-native services like Cloud Storage, Pub/Sub, Dataproc, Cloud Functions, and Composer for ETL/ELT workflows.
  • Implement and maintain CI/CD for data engineering projects with Git-based version control.
  • Collaborate with cross-functional teams including Infra, Security, and DataOps for reliable, secure, and high-quality data delivery.
  • Lead code reviews, mentor junior engineers, and enforce best practices in data engineering.
  • Participate in Agile sprints, backlog grooming, and Jira-based project tracking.
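
As a hedged illustration of the BigQuery side of this role, a minimal Python sketch using the google-cloud-bigquery client to run a transformation query; the project, dataset, and table names are hypothetical placeholders (in this stack the same logic would usually live in a DBT or Dataform model).

```python
# Minimal sketch: run a CTAS-style transformation on BigQuery from Python.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")   # placeholder project

sql = """
    CREATE OR REPLACE TABLE analytics.daily_orders AS
    SELECT DATE(order_ts) AS order_date,
           COUNT(*)        AS orders,
           SUM(amount)     AS revenue
    FROM raw.orders
    GROUP BY order_date
"""

job = client.query(sql)      # runs asynchronously on BigQuery
job.result()                 # block until the job completes
print(f"Job {job.job_id} finished.")
```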

Must-Have Skills :

  • Strong experience with DBT, Google Dataform, and BigQuery
  • Hands-on expertise with PySpark/Spark SQL
  • Proficient in GCP for data engineering workflows
  • Solid knowledge of SQL optimization, Git, and CI/CD pipelines
  • Agile team experience and strong problem-solving abilities

Nice-to-Have Skills :

  • Familiarity with Databricks, Delta Lake, or Kafka
  • Exposure to data observability and quality frameworks (e.g., Great Expectations, Soda)
  • Knowledge of MDM patterns, Terraform, or IaC is a plus
Read more
A leading software company

A leading software company

Agency job
via BOS consultants by Manka Joshi
Remote only
6 - 9 yrs
₹12L - ₹15L / yr
databricks
PySpark
Large Language Models (LLM)
Vector database
Google Cloud Platform (GCP)
+1 more

1. Solid Databricks & PySpark experience

2. Must have worked on projects dealing with data at terabyte scale

3. Must have knowledge of Spark optimization techniques

4. Must have experience setting up job pipelines in Databricks

5. Basic knowledge of GCP and BigQuery is required

6. Understanding of LLMs and vector DBs

Read more
Wissen Technology

at Wissen Technology

4 recruiters
Praffull Shinde
Posted by Praffull Shinde
Pune, Mumbai, Bengaluru (Bangalore)
4 - 8 yrs
₹14L - ₹26L / yr
skill iconPython
PySpark
skill iconDjango
skill iconFlask
RESTful APIs
+3 more

Job title - Python developer

Exp – 4 to 6 years

Location – Pune/Mum/B’lore

 

Please find the job description below.

Requirements:

  • Proven experience as a Python Developer
  • Strong knowledge of core Python and PySpark concepts
  • Experience with web frameworks such as Django or Flask
  • Good exposure to any cloud platform (GCP Preferred)
  • CI/CD exposure required
  • Solid understanding of RESTful APIs and how to build them (a minimal sketch follows after this list)
  • Experience working with databases like Oracle DB and MySQL
  • Ability to write efficient SQL queries and optimize database performance
  • Strong problem-solving skills and attention to detail
  • Strong SQL programming (stored procedures, functions)
  • Excellent communication and interpersonal skills
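
The sketch referenced above ties the Flask, REST, and SQL requirements together: one endpoint backed by a parameterised query. Here sqlite3 stands in for Oracle/MySQL, and the database file, table, and columns are hypothetical placeholders.

```python
# Minimal sketch: a REST endpoint backed by a parameterised SQL query.
import sqlite3

from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/customers/<int:customer_id>", methods=["GET"])
def get_customer(customer_id: int):
    conn = sqlite3.connect("app.db")                               # placeholder database
    try:
        row = conn.execute(
            "SELECT id, name, city FROM customers WHERE id = ?",   # parameterised query
            (customer_id,),
        ).fetchone()
    finally:
        conn.close()

    if row is None:
        return jsonify({"error": "not found"}), 404
    return jsonify({"id": row[0], "name": row[1], "city": row[2]})


if __name__ == "__main__":
    app.run(debug=True)
```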

Roles and Responsibilities

  • Design, develop, and maintain data pipelines and ETL processes using PySpark
  • Work closely with data scientists and analysts to provide them with clean, structured data.
  • Optimize data storage and retrieval for performance and scalability.
  • Collaborate with cross-functional teams to gather data requirements.
  • Ensure data quality and integrity through data validation and cleansing processes.
  • Monitor and troubleshoot data-related issues to ensure data pipeline reliability.
  • Stay up to date with industry best practices and emerging technologies in data engineering.
Read more
Tecblic Private LImited
Ahmedabad
4 - 5 yrs
₹8L - ₹12L / yr
Microsoft Windows Azure
SQL
skill iconPython
PySpark
ETL
+2 more

🚀 We Are Hiring: Data Engineer | 4+ Years Experience 🚀


Job description

🔍 Job Title: Data Engineer

📍 Location: Ahmedabad

🚀 Work Mode: On-Site Opportunity

📅 Experience: 4+ Years

🕒 Employment Type: Full-Time

⏱️ Availability : Immediate Joiner Preferred


Join Our Team as a Data Engineer

We are seeking a passionate and experienced Data Engineer to be a part of our dynamic and forward-thinking team in Ahmedabad. This is an exciting opportunity for someone who thrives on transforming raw data into powerful insights and building scalable, high-performance data infrastructure.

As a Data Engineer, you will work closely with data scientists, analysts, and cross-functional teams to design robust data pipelines, optimize data systems, and enable data-driven decision-making across the organization.


Your Key Responsibilities

Architect, build, and maintain scalable and reliable data pipelines from diverse data sources.

Design effective data storage, retrieval mechanisms, and data models to support analytics and business needs.

Implement data validation, transformation, and quality monitoring processes.

Collaborate with cross-functional teams to deliver impactful, data-driven solutions.

Proactively identify bottlenecks and optimize existing workflows and processes.

Provide guidance and mentorship to junior engineers in the team.


Skills & Expertise We’re Looking For

3+ years of hands-on experience in Data Engineering or related roles.

Strong expertise in Python and data pipeline design.

Experience working with Big Data tools like Hadoop, Spark, Hive.

Proficiency with SQL, NoSQL databases, and data warehousing solutions.

Solid experience in cloud platforms - Azure

Familiar with distributed computing, data modeling, and performance tuning.

Understanding of DevOps, Power Automate, and Microsoft Fabric is a plus.

Strong analytical thinking, collaboration skills, excellent communication skills, and the ability to work independently or as part of a team.


Qualifications

Bachelor’s degree in Computer Science, Data Science, or a related field.

Read more
Wissen Technology

at Wissen Technology

4 recruiters
Rutuja Patil
Posted by Rutuja Patil
Mumbai
4 - 10 yrs
Best in industry
skill iconJava
J2EE
Hibernate (Java)
skill iconSpring Boot
Spring MVC
+2 more

Company Name – Wissen Technology

Group of companies in India – Wissen Technology & Wissen Infotech

Position - Senior Backend Developer – Java (with Python Exposure)

Work Location - Mumbai


Experience - 4 to 10 years


Kindly reply over email if you are interested.


Java Developer – Job Description


We are seeking a Senior Backend Developer with strong expertise in Java (Spring Boot) and working knowledge of Python. In this role, Java will be your primary development language, with Python used for scripting, automation, or selected service modules. You’ll be part of a collaborative backend team building scalable and high-performance systems.


Key Responsibilities


  • Design and develop robust backend services and APIs primarily using Java (Spring Boot)
  • Contribute to Python-based components where needed for automation, scripting, or lightweight services
  • Build, integrate, and optimize RESTful APIs and microservices
  • Work with relational and NoSQL databases
  • Write unit and integration tests (JUnit, PyTest)
  • Collaborate closely with DevOps, QA, and product teams
  • Participate in architecture reviews and design discussions
  • Help maintain code quality, organization, and automation


Required Skills & Qualifications

  • 4 to 10 years of hands-on Java development experience
  • Strong experience with Spring Boot, JPA/Hibernate, and REST APIs
  • At least 1–2 years of hands-on experience with Python (e.g., for scripting, automation, or small services)
  • Familiarity with Python frameworks like Flask or FastAPI is a plus
  • Experience with SQL/NoSQL databases (e.g., PostgreSQL, MongoDB)
  • Good understanding of OOP, design patterns, and software engineering best practices
  • Familiarity with Docker, Git, and CI/CD pipelines


Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Mumbai, Pune, Chennai, Gurugram
5.6 - 7 yrs
₹10L - ₹28L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
SQL

Job Summary:

As an AWS Data Engineer, you will be responsible for designing, developing, and maintaining scalable, high-performance data pipelines using AWS services. With 6+ years of experience, you’ll collaborate closely with data architects, analysts, and business stakeholders to build reliable, secure, and cost-efficient data infrastructure across the organization.

Key Responsibilities:

  • Design, develop, and manage scalable data pipelines using AWS Glue, Lambda, and other serverless technologies
  • Implement ETL workflows and transformation logic using PySpark and Python on AWS Glue
  • Leverage AWS Redshift for warehousing, performance tuning, and large-scale data queries
  • Work with AWS DMS and RDS for database integration and migration
  • Optimize data flows and system performance for speed and cost-effectiveness
  • Deploy and manage infrastructure using AWS CloudFormation templates
  • Collaborate with cross-functional teams to gather requirements and build robust data solutions
  • Ensure data integrity, quality, and security across all systems and processes
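
A minimal AWS Glue job sketch (PySpark) for the Glue-based ETL described above. It only runs inside a Glue job environment where the awsglue libraries are provided, and the catalog database/table and S3 target path are hypothetical placeholders.

```python
# Minimal sketch of a Glue ETL job: read from the Data Catalog, transform, land as Parquet.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog, deduplicate, and write curated Parquet to S3.
dyf = glue_context.create_dynamic_frame.from_catalog(database="raw", table_name="orders")
df = dyf.toDF().dropDuplicates(["order_id"])

df.write.mode("overwrite").parquet("s3://my-bucket/curated/orders/")   # placeholder path

job.commit()
```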

Required Skills & Experience:

  • 6+ years of experience in Data Engineering with strong AWS expertise
  • Proficient in Python and PySpark for data processing and ETL development
  • Hands-on experience with AWS Glue, Lambda, DMS, RDS, and Redshift
  • Strong SQL skills for building complex queries and performing data analysis
  • Familiarity with AWS CloudFormation and infrastructure as code principles
  • Good understanding of serverless architecture and cost-optimized design
  • Ability to write clean, modular, and maintainable code
  • Strong analytical thinking and problem-solving skills


Read more
Deqode

at Deqode

1 recruiter
purvisha Bhavsar
Posted by purvisha Bhavsar
Gurugram, Delhi, Noida, Ghaziabad, Faridabad
6 - 10 yrs
₹5L - ₹15L / yr
Google Cloud Platform (GCP)
skill iconPython
PySpark
skill icon.NET
skill iconScala

🚀 Hiring: Data Engineer | GCP + Spark + Python + .NET | 6–10 Yrs | Gurugram (Hybrid)


We’re looking for a skilled Data Engineer with strong hands-on experience in GCP, Spark-Scala, Python, and .NET.


📍 Location: Suncity, Sector 54, Gurugram (Hybrid – 3 days onsite)

💼 Experience: 6–10 Years

⏱️ Notice Period: Immediate Joiner


Required Skills:

  • 5+ years of experience in distributed computing (Spark) and software development.
  • 3+ years of experience in Spark-Scala
  • 5+ years of experience in Data Engineering.
  • 5+ years of experience in Python.
  • Fluency in working with databases (preferably Postgres).
  • Have a sound understanding of object-oriented programming and development principles.
  • Experience working in an Agile Scrum or Kanban development environment.
  • Experience working with version control software (preferably Git).
  • Experience with CI/CD pipelines.
  • Experience with automated testing, including integration/delta, load, and performance testing
Read more
Bengaluru (Bangalore), Pune, Chennai
5 - 12 yrs
₹5L - ₹25L / yr
PySpark
Automation
SQL

Skill Name: ETL Automation Testing

Location: Bangalore, Chennai and Pune

Experience: 5+ Years


Required:

Experience in ETL Automation Testing

Strong experience in PySpark.

Read more
Wissen Technology

at Wissen Technology

4 recruiters
Vishakha Walunj
Posted by Vishakha Walunj
Bengaluru (Bangalore), Pune, Mumbai
7 - 12 yrs
Best in industry
PySpark
databricks
SQL
skill iconPython

Required Skills:

  • Hands-on experience with Databricks, PySpark
  • Proficiency in SQL, Python, and Spark.
  • Understanding of data warehousing concepts and data modeling.
  • Experience with CI/CD pipelines and version control (e.g., Git).
  • Fundamental knowledge of any cloud services, preferably Azure or GCP.


Good to Have:

  • BigQuery
  • Experience with performance tuning and data governance.


Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Pune, Bengaluru (Bangalore), Gurugram, Chennai, Mumbai
5 - 7 yrs
₹6L - ₹20L / yr
skill iconAmazon Web Services (AWS)
Amazon Redshift
AWS Glue
skill iconPython
PySpark

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Bengaluru (Bangalore), Pune, Mumbai, Chennai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
skill iconPython
PySpark
skill iconAmazon Web Services (AWS)
aws
Amazon Redshift
+1 more

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Mokshada Solanki
Posted by Mokshada Solanki
Bengaluru (Bangalore), Mumbai, Pune, Gurugram
4 - 5 yrs
₹4L - ₹20L / yr
SQL
skill iconAmazon Web Services (AWS)
Migration
PySpark
ETL

Job Summary:

Seeking a seasoned SQL + ETL Developer with 4+ years of experience in managing large-scale datasets and cloud-based data pipelines. The ideal candidate is hands-on with MySQL, PySpark, AWS Glue, and ETL workflows, with proven expertise in AWS migration and performance optimization.


Key Responsibilities:

  • Develop and optimize complex SQL queries and stored procedures to handle large datasets (100+ million records).
  • Build and maintain scalable ETL pipelines using AWS Glue and PySpark.
  • Work on data migration tasks in AWS environments.
  • Monitor and improve database performance; automate key performance indicators and reports.
  • Collaborate with cross-functional teams to support data integration and delivery requirements.
  • Write shell scripts for automation and manage ETL jobs efficiently.


Required Skills:

  • Strong experience with MySQL, complex SQL queries, and stored procedures.
  • Hands-on experience with AWS Glue, PySpark, and ETL processes.
  • Good understanding of AWS ecosystem and migration strategies.
  • Proficiency in shell scripting.
  • Strong communication and collaboration skills.


Nice to Have:

  • Working knowledge of Python.
  • Experience with AWS RDS.



Read more