- Responsible for setting up a scalable data warehouse.
- Build data pipelines that integrate data from various sources for all of Klub's data.
- Set up data-as-a-service to expose the required data through APIs (a minimal sketch follows this list).
- Develop a good understanding of how finance data works.
- Standardize and optimize design thinking across the technology team.
- Collaborate with stakeholders across engineering teams to come up with short and long-term architecture decisions.
- Build robust data models that support the various reporting requirements of the business, ops, and leadership teams.
- Participate in peer reviews and provide code/design feedback.
- Own the problem end to end and deliver it to success.
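To illustrate the data-as-a-service responsibility above, here is a minimal, hedged Flask sketch that exposes a warehouse table over a read-only API. The connection string, table, and endpoint name are hypothetical assumptions, not part of the role description.

```python
# Hypothetical "data as a service" endpoint: expose a reporting table over HTTP.
# The DSN, table name, and query parameters are illustrative assumptions.
from flask import Flask, jsonify, request
from sqlalchemy import create_engine, text

app = Flask(__name__)
engine = create_engine("postgresql://user:password@warehouse-host:5432/analytics")

@app.route("/api/v1/repayments")
def repayments():
    # Parameterised query keeps the endpoint safe from SQL injection.
    limit = int(request.args.get("limit", 100))
    with engine.connect() as conn:
        rows = conn.execute(
            text("SELECT brand_id, amount, paid_at FROM repayments "
                 "ORDER BY paid_at DESC LIMIT :limit"),
            {"limit": limit},
        )
        return jsonify([dict(row._mapping) for row in rows])

if __name__ == "__main__":
    app.run(port=8080)
```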
- Overall 3+ years of industry experience
- Prior experience with backend and data engineering systems.
- At least 1+ years of working experience with distributed systems.
- Deep understanding of the Python tech stack, including libraries and frameworks such as Flask, SciPy, NumPy, and pytest.
- Good understanding of Apache Airflow or similar orchestration tools (a minimal DAG sketch follows this list).
- Good knowledge of data warehouse technologies such as Apache Hive, and of Apache Spark (PySpark) or similar.
- Good knowledge of how to build analytics services on the data for different reporting and BI needs.
- Good knowledge of data pipeline/ETL tools such as Hevo Data, and of query engine technologies such as Trino or GraphQL.
- Deep understanding of dimensional data modelling concepts. Familiarity with RDBMS (MySQL/PostgreSQL) and NoSQL (MongoDB/DynamoDB) databases, and caching (Redis or similar).
- Should be proficient in writing SQL queries.
- Good knowledge of Kafka. Able to write clean, maintainable code.
- Has built a data warehouse from scratch and set up scalable data infrastructure.
- Prior experience in fintech would be a plus.
- Prior experience with data modelling.
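As a concrete reference for the Airflow point above, a minimal DAG sketch is shown below. The DAG id, schedule, and task bodies are illustrative placeholders under assumed requirements, not a prescribed implementation.

```python
# Minimal Airflow DAG sketch: extract from a source system, load into the warehouse.
# Function bodies, schedule, and task names are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # e.g. pull yesterday's records from a source system API
    ...


def load():
    # e.g. write the extracted records into a warehouse staging table
    ...


with DAG(
    dag_id="daily_finance_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```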
● Proficiency in Linux.
● Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
● Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, MongoDB, and Cassandra.
● Must have experience with Python/Scala.
● Must have experience with Big Data technologies like Apache Spark (a brief batch-job sketch follows this list).
● Must have experience with Apache Airflow.
● Experience with data pipelines and ETL tools like AWS Glue.
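By way of illustration for the Spark requirement above, here is a small, hedged PySpark batch job. The bucket names, schema, and columns are assumptions made for the example only.

```python
# Hypothetical PySpark batch job: read raw CSVs from S3, aggregate, write Parquet.
# Bucket names, schema, and column names are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_usage_rollup").getOrCreate()

raw = (
    spark.read
    .option("header", True)
    .csv("s3://example-raw-bucket/events/2024-01-01/")
)

daily = (
    raw.withColumn("event_date", F.to_date("event_ts"))
       .groupBy("event_date", "account_id")
       .agg(F.count("*").alias("events"), F.sum("amount").alias("total_amount"))
)

daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-curated-bucket/daily_usage/"
)
```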
- Experience with cloud-native data tools/services such as AWS Athena, AWS Glue, Redshift Spectrum, AWS EMR, AWS Aurora, BigQuery, Bigtable, S3, etc.
- Strong programming skills in at least one of the following languages: Java, Scala, C++.
- Familiarity with a scripting language like Python as well as Unix/Linux shells.
- Comfortable with multiple AWS components including RDS, AWS Lambda, AWS Glue, AWS Athena, EMR. Equivalent tools in the GCP stack will also suffice.
- Strong analytical skills and advanced SQL knowledge, including indexing and query optimization techniques.
- Experience implementing software around data processing, metadata management, and ETL pipeline tools like Airflow.
Experience with the following software/tools is highly desired:
- Apache Spark, Kafka, Hive, etc.
- SQL and NoSQL databases like MySQL, Postgres, DynamoDB.
- Workflow management tools like Airflow.
- AWS cloud services: RDS, AWS Lambda, AWS Glue, AWS Athena, EMR.
- Familiarity with Spark programming paradigms (batch and stream-processing); a short streaming sketch follows this list.
- RESTful API services.
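To make the batch versus stream-processing distinction concrete, below is a hedged Spark Structured Streaming sketch that consumes a Kafka topic. The broker address, topic name, and output paths are assumptions; the job also needs the Spark–Kafka connector package on the classpath.

```python
# Hypothetical Spark Structured Streaming job: consume a Kafka topic, land it in Parquet.
# Broker address, topic name, and paths are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_stream").getOrCreate()

stream = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
)

# Kafka delivers key/value as binary; cast the value to a string for downstream parsing.
parsed = stream.select(
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp"),
)

query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "s3://example-stream-bucket/orders/")
    .option("checkpointLocation", "s3://example-stream-bucket/checkpoints/orders/")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```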
o Convert machine learning models into APIs so that applications can access them (a minimal serving sketch follows this list)
o Running machine learning tests and experiments
o Implementing appropriate ML algorithms
o Creating machine learning models and retraining systems
o Study and transform data science prototypes
o Design machine learning systems
o Research and implement appropriate ML algorithms and tools
o Train and retrain systems when necessary
o Test and deploy models
o Use AI to empower the company with novel capabilities
o Designing and developing machine learning and deep learning systems
o Outstanding analytical and problem-solving skills
o Excellent Python programming skills
o Experience with AWS Lambda
o Experience with Alexa skills
o Alexa skill directives
o Excellent Node.js programming skills
o Experience with GCP - Dialogflow and Actions on Google
o Using built-in intents and developing custom intents
o API integration and Postman knowledge
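For the "convert machine learning models into APIs" responsibility above, here is a minimal, hedged Flask sketch that serves a pre-trained scikit-learn model. The model file, feature layout, and route are assumptions for illustration only.

```python
# Hypothetical model-serving endpoint: load a trained model and expose /predict.
# The model file and feature layout are illustrative assumptions.
import joblib
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # e.g. a scikit-learn classifier trained offline

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(force=True)
    # Expect a JSON body like {"features": [0.1, 2.3, ...]} matching the training schema.
    features = np.array(payload["features"], dtype=float).reshape(1, -1)
    prediction = model.predict(features)[0]
    return jsonify({"prediction": prediction.item() if hasattr(prediction, "item") else prediction})

if __name__ == "__main__":
    app.run(port=5000)
```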
Department: Engineering
Bidgely is looking for an extraordinary and dynamic Senior Data Analyst to be part of its core team in Bangalore. You must have delivered exceptionally high-quality, robust products dealing with large volumes of data. Be part of a highly energetic and innovative team that believes nothing is impossible with some creativity and hard work.
● Design and implement a high-volume data analytics pipeline in Looker for Bidgely's flagship product.
● Implement data pipelines in the Bidgely Data Lake.
● Collaborate with product management and engineering teams to elicit and understand their requirements and challenges, and develop potential solutions.
● Stay current with the latest tools, technology ideas and methodologies; share knowledge by clearly articulating results and ideas to key decision makers.
● 3-5 years of strong experience in data analytics and in developing data pipelines.
● Very good expertise in Looker
● Strong in data modeling and in developing and optimizing SQL queries.
● Good knowledge of data warehouses (Amazon Redshift, BigQuery, Snowflake, Hive); a brief query sketch follows this list.
● Good understanding of Big Data applications (Hadoop, Spark, Hive, Airflow, S3, Cloudera).
● Attention to detail. Strong communication and collaboration skills.
● BS/MS in Computer Science or equivalent from premier institutes.
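As a small illustration of the warehouse querying called out above, here is a hedged sketch using the BigQuery Python client; the project, dataset, table, and columns are assumptions, and the same pattern applies to Redshift or Snowflake through their own drivers.

```python
# Hypothetical warehouse query via the BigQuery Python client.
# Project, dataset, table, and column names are illustrative assumptions.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

sql = """
    SELECT home_id, DATE(reading_ts) AS reading_date, SUM(kwh) AS daily_kwh
    FROM `example-project.energy.meter_readings`
    WHERE reading_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    GROUP BY home_id, reading_date
    ORDER BY daily_kwh DESC
    LIMIT 10
"""

for row in client.query(sql).result():
    print(row["home_id"], row["reading_date"], row["daily_kwh"])
```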
- Ensure and own data integrity across distributed systems.
- Extract, Transform and Load data from multiple systems for reporting into BI platform.
- Create Data Sets and Data models to build intelligence upon.
- Develop and own various integration tools and data points.
- Hands-on development and/or design within the project in order to maintain timelines.
- Work closely with the Project Manager to deliver on business requirements OTIF (on time, in full).
- Understand cross-functional business data points thoroughly and be the single point of contact (SPOC) for all data-related queries.
- Work with both Web Analytics and Backend Data analytics.
- Support the rest of the BI team in generating reports and analysis
- Quickly learn and use bespoke and third-party SaaS reporting tools with little documentation.
- Assist in presenting demos and preparing materials for Leadership.
- Strong experience in data warehouse modeling techniques and SQL queries
- A good understanding of designing, developing, deploying, and maintaining Power BI report solutions
- Ability to create KPIs, visualizations, reports, and dashboards based on business requirements
- Knowledge and experience in prototyping, designing, and requirement analysis
- Be able to implement row-level security on data and understand application security layer models in Power BI
- Proficiency in writing DAX queries in Power BI Desktop.
- Expertise in using advanced level calculations on data sets
- Experience in the Fintech domain and stakeholder management.
Skills: Informatica Big Data Management (BDM)
1. Minimum 6 to 8 years of experience in Informatica BDM development
2. Experience working with Spark/SQL
3. Develops Informatica mappings/SQL
- Experience with relational SQL & NoSQL databases including MySQL & MongoDB.
- Familiar with the basic principles of distributed computing and data modeling.
- Experience with distributed data pipeline frameworks like Celery, Apache Airflow, etc.
- Experience with NLP and NER models is a bonus (a minimal NER sketch follows below).
- Experience building reusable code and libraries for future use.
- Experience building REST APIs.
Preference for candidates working in tech product companies
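Since NER experience is listed as a bonus above, a minimal spaCy sketch is included below; the model name and sample text are assumptions, and the printed labels are merely what a small English model would typically produce.

```python
# Minimal named-entity recognition sketch with spaCy.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Acme Corp raised $12 million from investors in Bangalore last March.")

for ent in doc.ents:
    # e.g. ('Acme Corp', 'ORG'), ('$12 million', 'MONEY'), ('Bangalore', 'GPE')
    print(ent.text, ent.label_)
```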
We are looking for Developers/Senior Developers to be part of building an advanced analytics platform that leverages Big Data technologies and transforms legacy systems. This role offers an exciting, fast-paced, constantly changing and challenging work environment, and will play an important role in resolving and influencing high-level decisions.
- The candidate must be a self-starter who can work under general guidelines in a fast-paced environment.
- Overall minimum of 4 to 8 years of software development experience, including 2 years of data warehousing domain knowledge.
- Must have 3 years of hands-on working knowledge of Big Data technologies such as Hadoop, Hive, HBase, Spark, Kafka, Spark Streaming, Scala, etc.
- Excellent knowledge in SQL & Linux Shell scripting
- Bachelor's/Master's/Engineering degree from a well-reputed university.
- Strong communication, interpersonal, learning, and organizing skills, matched with the ability to manage stress, time, and people effectively.
- Proven experience coordinating many dependencies and multiple demanding stakeholders in a complex, large-scale deployment environment.
- Ability to manage a diverse and challenging stakeholder community
- Diverse knowledge and experience of working in Agile deliveries and Scrum teams.
- Should work as a senior developer/individual contributor as the situation requires.
- Should be part of Scrum discussions and take requirements.
- Adhere to the Scrum timeline and deliver accordingly.
- Participate in a team environment for design, development, and implementation.
- Should take on L3 activities on a need basis.
- Prepare unit/SIT/UAT test cases and log the results.
- Coordinate SIT and UAT testing; take feedback and provide necessary remediation/recommendations in time.
- Quality delivery and automation should be a top priority.
- Coordinate change and deployment in time.
- Should foster healthy harmony within the team.
- Owns interaction points with members of the core team (e.g. BA, testing, and business teams) and any other relevant stakeholders.
Want to make every line of code count? Tired of being a small cog in a big machine? Like a fast-paced environment where stuff gets DONE? Wanna grow with a fast-growing company (both career and compensation)? Like to wear different hats? Join ThinkDeeply in our mission to create and apply Enterprise-Grade AI for all types of applications.
Seeking an ML Engineer with high aptitude for development. We will also consider coders with high aptitude in ML. Years of experience are important, but we are also looking for interest and aptitude. As part of the early engineering team, you will have a chance to make a measurable impact on the future of ThinkDeeply, as well as take on a significant amount of responsibility.
Bachelor's/Master's or PhD in Computer Science, or related industry experience
3+ years of industry experience with deep learning frameworks such as PyTorch or TensorFlow
7+ years of industry experience with scripting languages such as Python and R.
7+ years in software development, covering at least some researching/POCs, prototyping, productizing, process improvement, and large-data processing/performance computing
Familiar with non-neural-network methods such as Bayesian methods, SVMs, AdaBoost, random forests, etc.
Some experience in setting up large scale training data pipelines.
Some experience in using Cloud services such as AWS, GCP, Azure
Experience in building deep learning models for Computer Vision and Natural Language Processing domains
Experience in productionizing/serving machine learning in an industry setting
Understand the principles of developing cloud native applications
Collect, organize, and process data in pipelines for developing ML models
Research and develop novel prototypes for customers
Train, implement, and evaluate shippable machine learning models (a minimal training-loop sketch follows this list)
Deploy and iterate improvements of ML Models through feedback
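To ground the training and deep-learning framework requirements, here is a hedged, minimal PyTorch training-loop sketch on synthetic data; the model architecture, data, and hyperparameters are assumptions made only for illustration.

```python
# Minimal PyTorch training-loop sketch on synthetic data.
# Model architecture, data, and hyperparameters are illustrative assumptions.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic binary-classification data: 512 samples, 16 features.
X = torch.randn(512, 16)
y = (X.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    total = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
        total += loss.item()
    print(f"epoch {epoch}: loss {total / len(loader):.4f}")
```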