Data integration Jobs in Pune

Apply to 9+ Data integration Jobs in Pune on CutShort.io. Explore the latest Data integration Job opportunities across top companies like Google, Amazon & Adobe.

Senior Data Engineer

Non-Banking Financial Company

Agency job

via Peak Hire Solutions by Dharati Thakkar

Pune

4 - 8 yrs

₹8L - ₹13L / yr

SQL

databricks

PowerBI

Data engineering

Data architecture

ROLES AND RESPONSIBILITIES:

We are seeking a highly experienced Senior Data Engineer with strong architectural capability, excellent optimisation skills, and deep hands-on experience in modern data platforms. The ideal candidate will have advanced SQL skills, strong expertise in Databricks, and practical experience working across cloud environments such as AWS and Azure. This role requires end-to-end ownership of complex data engineering initiatives, including architecture design, data governance implementation, and performance optimisation. You will collaborate with cross-functional teams to build scalable, secure, and high-quality data solutions.

Key Responsibilities-

Lead the design and implementation of scalable data architectures, pipelines, and integration frameworks.
Develop, optimise, and maintain complex SQL queries, transformations, and Databricks-based data workflows.
Architect and deliver high-performance ETL/ELT processes across cloud platforms.
Implement and enforce data governance standards, including data quality, lineage, and access control.
Partner with analytics, BI (Power BI), and business teams to enable reliable, governed, and high-value data delivery.
Optimise large-scale data processing, ensuring efficiency, reliability, and cost-effectiveness.
Monitor, troubleshoot, and continuously improve data pipelines and platform performance.
Mentor junior engineers and contribute to engineering best practices, standards, and documentation.

IDEAL CANDIDATE:

Proven industry experience as a Senior Data Engineer, with ownership of high-complexity projects.
Advanced SQL skills with experience handling large, complex datasets.
Strong expertise with Databricks for data engineering workloads.
Hands-on experience with major cloud platforms — AWS and Azure.
Deep understanding of data architecture, data modelling, and optimisation techniques.
Familiarity with BI and reporting environments such as Power BI.
Strong analytical and problem-solving abilities with a focus on data quality and governance.
Proficiency in Python or another programming language is a plus.

PERKS, BENEFITS AND WORK CULTURE:

Our people define our passion and our audacious, incredibly rewarding achievements. The company is one of India’s most diversified Non-banking financial companies, and among Asia’s top 10 Large workplaces. If you have the drive to get ahead, we can help find you an opportunity at any of the 500+ locations we’re present in India.

Junior Data Engineer

Non-Banking Financial Company

Agency job

via Peak Hire Solutions by Dharati Thakkar

Pune

1 - 2 yrs

₹5L - ₹6.1L / yr

SQL

databricks

PowerBI

Data engineering

ETL

ROLES AND RESPONSIBILITIES:

We are looking for a Junior Data Engineer who will work under guidance to support data engineering tasks, perform basic coding, and actively learn modern data platforms and tools. The ideal candidate should have foundational SQL knowledge and basic exposure to Databricks. This role is designed for early-career professionals who are eager to grow into full data engineering responsibilities while contributing to data pipeline operations and analytical support.

Key Responsibilities-

Support the development and maintenance of data pipelines and ETL/ELT workflows under mentorship.
Write basic SQL queries, transformations, and assist with Databricks notebook tasks.
Help troubleshoot data issues and contribute to ensuring pipeline reliability.
Work with senior engineers and analysts to understand data requirements and deliver small tasks.
Assist in maintaining documentation, data dictionaries, and process notes.
Learn and apply data engineering best practices, coding standards, and cloud fundamentals.
Support basic tasks related to Power BI data preparation or integrations as needed.

IDEAL CANDIDATE:

Foundational SQL skills with the ability to write and understand basic queries.
Basic exposure to Databricks, data transformation concepts, or similar data tools.
Understanding of ETL/ELT concepts, data structures, and analytical workflows.
Eagerness to learn modern data engineering tools, technologies, and best practices.
Strong problem-solving attitude and willingness to work under guidance.
Good communication and collaboration skills to work with senior engineers and analysts.

PERKS, BENEFITS AND WORK CULTURE:

Our people define our passion and our audacious, incredibly rewarding achievements. Bajaj Finance Limited is one of India’s most diversified Non-banking financial companies, and among Asia’s top 10 Large workplaces. If you have the drive to get ahead, we can help find you an opportunity at any of the 500+ locations we’re present in India.

Lead I - Software Engineering - AI Solutions Analyst

Global Digital Transformation Solutions Provider

Agency job

via Peak Hire Solutions by Dharati Thakkar

Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Pune, Hyderabad

6 - 8 yrs

₹10L - ₹26L / yr

Large Language Models (LLM)

Prompt engineering

Knowledge base

Large Language Models (LLM) tuning

Artificial Intelligence (AI)

MUST-HAVES:

LLM, AI, Prompt Engineering
LLM Integration & Prompt Engineering
Context & Knowledge Base Design
Experience running LLM evals

NOTICE PERIOD: Immediate – 30 Days

SKILLS: LLM, AI, PROMPT ENGINEERING

NICE TO HAVES:

Data Literacy & Modelling Awareness
Familiarity with Databricks, AWS, and ChatGPT Environments

ROLE PROFICIENCY:

Role Scope / Deliverables:

Serve as the link between business intelligence, data engineering, and AI application teams, ensuring the Large Language Model (LLM) interacts effectively with the modeled dataset.
Define and curate the context and knowledge base that enables GPT to provide accurate, relevant, and compliant business insights.
Collaborate with Data Analysts and System SMEs to identify, structure, and tag data elements that feed the LLM environment.
Design, test, and refine prompt strategies and context frameworks that align GPT outputs with business objectives.
Conduct evaluation and performance testing (evals) to validate LLM responses for accuracy, completeness, and relevance.
Partner with IT and governance stakeholders to ensure secure, ethical, and controlled AI behavior within enterprise boundaries.

KEY DELIVERABLES:

LLM Interaction Design Framework: Documentation of how GPT connects to the modeled dataset, including context injection, prompt templates, and retrieval logic.
Knowledge Base Configuration: Curated and structured domain knowledge to enable precise and useful GPT responses (e.g., commercial definitions, data context, business rules).
Evaluation Scripts & Test Results: Defined eval sets, scoring criteria, and output analysis to measure GPT accuracy and quality over time.
Prompt Library & Usage Guidelines: Standardized prompts and design patterns to ensure consistent business interactions and outcomes.
AI Performance Dashboard / Reporting: Visualizations or reports summarizing GPT response quality, usage trends, and continuous improvement metrics.
Governance & Compliance Documentation: Inputs to data security, bias prevention, and responsible AI practices in collaboration with IT and compliance teams.

KEY SKILLS:

Technical & Analytical Skills:

LLM Integration & Prompt Engineering – Understanding of how GPT models interact with structured and unstructured data to generate business-relevant insights.
Context & Knowledge Base Design – Skilled in curating, structuring, and managing contextual data to optimize GPT accuracy and reliability.
Evaluation & Testing Methods – Experience running LLM evals, defining scoring criteria, and assessing model quality across use cases.
Data Literacy & Modeling Awareness – Familiar with relational and analytical data models to ensure alignment between data structures and AI responses.
Familiarity with Databricks, AWS, and ChatGPT Environments – Capable of working in cloud-based analytics and AI environments for development, testing, and deployment.
Scripting & Query Skills (e.g., SQL, Python) – Ability to extract, transform, and validate data for model training and evaluation workflows.

Business & Collaboration Skills:

Cross-Functional Collaboration – Works effectively with business, data, and IT teams to align GPT capabilities with business objectives.
Analytical Thinking & Problem Solving – Evaluates LLM outputs critically, identifies improvement opportunities, and translates findings into actionable refinements.
Commercial Context Awareness – Understands how sales and marketing intelligence data should be represented and leveraged by GPT.
Governance & Responsible AI Mindset – Applies enterprise AI standards for data security, privacy, and ethical use.
Communication & Documentation – Clearly articulates AI logic, context structures, and testing results for both technical and non-technical audiences.

Ab Initio Developer

at NeoGenCode Technologies Pvt Ltd

Posted by Akshay Patil

Pune

5 - 8 yrs

₹10L - ₹18L / yr

Ab Initio

GDE

EME

SQL

Teradata

Job Title : Ab Initio Developer

Location : Pune

Experience : 5+ Years

Notice Period : Immediate Joiners Only

Job Summary :

We are looking for an experienced Ab Initio Developer to join our team in Pune.

The ideal candidate should have strong hands-on experience in Ab Initio development, data integration, and Unix scripting, with a solid understanding of SDLC and data warehousing concepts.

Mandatory Skills :

Ab Initio (GDE, EME, graphs, parameters), SQL/Teradata, Data Warehousing, Unix Shell Scripting, Data Integration, DB Load/Unload Utilities.

Key Responsibilities :

Design and develop Ab Initio graphs/plans/sandboxes/projects using GDE and EME.
Manage and configure standard environment parameters and multifile systems.
Perform complex data integration from multiple source and target systems with business rule transformations.
Utilize DB Load/Unload Utilities effectively for optimized performance.
Implement generic graphs, ensure proper use of parallelism, and maintain project parameters.
Work in a data warehouse environment involving SDLC, ETL processes, and data analysis.
Write and maintain Unix Shell Scripts and use utilities like sed, awk, etc.
Optimize and troubleshoot performance issues in Ab Initio jobs.

Mandatory Skills :

Strong expertise in Ab Initio (GDE, EME, graphs, parallelism, DB utilities, multifile systems).
Experience with SQL and databases like SQL Server or Teradata.
Proficiency in Unix Shell Scripting and Unix utilities.
Data integration and ETL from varied source/target systems.

Good to Have :

Experience in Ab Initio and AWS integration.
Knowledge of Message Queues and Continuous Graphs.
Exposure to Metadata Hub.
Familiarity with Big Data tools such as Hive, Impala.
Understanding of job scheduling tools.

Lead Data Engineer

at xpressbees

Posted by Alfiya Khan

Pune, Bengaluru (Bangalore)

6 - 8 yrs

₹15L - ₹25L / yr

Big Data

Data Warehouse (DWH)

Data modeling

Apache Spark

Data integration

Company Profile
XpressBees – a logistics company started in 2015 – is amongst the fastest growing companies of its sector. While we started off rather humbly in the space of ecommerce B2C logistics, the last 5 years have seen us steadily progress towards expanding our presence. Our vision to evolve into a strong full-service logistics organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross border operations. Our strong domain expertise and constant focus on meaningful innovation have helped us rapidly evolve as the most trusted logistics partner of India. We have progressively carved our way towards best-in-class technology platforms, an extensive network reach, and a seamless last mile management system. While on this aggressive growth path, we seek to become the one-stop-shop for end-to-end logistics solutions. Our big focus areas for the very near future include strengthening our presence as service providers of choice and leveraging the power of technology to improve efficiencies for our clients.

Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform and infrastructure to support high quality and agile decision-making in our supply chain and logistics workflows.
You will define the way we collect and operationalize data (structured / unstructured), and build production pipelines for our machine learning models, and (RT, NRT, Batch) reporting & dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use your experience with modern cloud and data frameworks to build products (with storage and serving systems) that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making, insights, anomaly detection and prediction.

What You Will Do
• Design and develop the data platform and data pipelines for reporting, dashboarding and machine learning models. These pipelines would productionize machine learning models and integrate with agent review tools.
• Meet the data completeness, correctness and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support business needs. Come up with logical and physical database designs across platforms (MPP, MR, Hive/Pig) that are optimal for different use cases (structured/semi-structured). Envision & implement the optimal data modelling, physical design, and performance optimization technique/approach required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their successors.

Qualifications & Experience relevant for the role

• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across the entire SDLC process
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that you can make strong contributions to deep architecture discussions around systems and applications in the cloud (AWS).

• Characteristics of a forward thinker and self-starter who flourishes with new challenges and adapts quickly to learning new knowledge.
• Ability to work with cross-functional teams of consulting professionals across multiple projects.
• Knack for helping an organization to understand application architectures and integration approaches, to architect advanced cloud-based solutions, and to help launch the build-out of those systems.
• Passion for educating, training, designing, and building end-to-end systems.

Data/Integration Architect

Consulting Leader

Agency job

via Buaut Tech by KAUSHANK nalin

Pune, Mumbai

8 - 10 yrs

₹8L - ₹16L / yr

Data integration

talend

Hadoop

Integration

Java

Job Description for :

Role: Data/Integration Architect

Experience – 8-10 Years

Notice Period: Under 30 days

Key Responsibilities: Designing and developing frameworks for batch and real-time jobs on Talend. Leading migration of these jobs from Mulesoft to Talend, maintaining best practices for the team, and conducting code reviews and demos.

Core Skillsets:

Talend Data Fabric - Application, API Integration, Data Integration. Knowledge of Talend Management Cloud, and of deployment and scheduling of jobs using TMC or Autosys.

Programming Languages - Python/Java
Databases: SQL Server, Other Databases, Hadoop

Should have worked on Agile

Sound communication skills

Should be open to learning new technologies based on business needs on the job

Additional Skills:

Awareness of other data/integration platforms like Mulesoft, Camel

Awareness of Hadoop, Snowflake, S3

Data Engineer - AWS

A global business process management company

Agency job

via Jobdost by Saida Pathan

Gurugram, Pune, Mumbai, Bengaluru (Bangalore), Chennai, Nashik

4 - 12 yrs

₹12L - ₹15L / yr

Data engineering

Data modeling

data pipeline

Data integration

Data Warehouse (DWH)

Designation – Deputy Manager - TS

Job Description

Total of 8/9 years of development experience in Data Engineering. B1/BII role.
Minimum of 4/5 years in AWS Data Integrations, with very good data modelling skills.
Should be very proficient in end-to-end AWS data solution design, which includes not only strong data ingestion and integration skills (both data at rest and data in motion) but also complete DevOps knowledge.
Should have experience in delivering at least 4 Data Warehouse or Data Lake Solutions on AWS.
Should have very strong experience with Glue, Lambda, Data Pipeline, Step Functions, RDS, CloudFormation, etc.
Strong Python skills.
Should be an expert in cloud design principles, performance tuning and cost modelling. AWS certifications will be an added advantage.
Should be a team player with excellent communication skills, able to manage their work independently with minimal or no supervision.
Life Science & Healthcare domain background will be a plus.

Qualifications

BE/BTech/ME/MTech

Data Engineer

at Mobile Programming LLC

Posted by Apurva kalsotra

Mohali, Gurugram, Pune, Bengaluru (Bangalore), Hyderabad, Chennai

3 - 8 yrs

₹2L - ₹9L / yr

Data engineering

Data engineer

Spark

Apache Spark

Apache Kafka

Responsibilities for Data Engineer

Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer

Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Build processes supporting data transformation, data structures, metadata, dependency and workload management.
A successful history of manipulating, processing and extracting value from large disconnected datasets.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
Strong project management and organizational skills.
Experience supporting and working with cross-functional teams in a dynamic environment.
We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:

Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Experience with stream-processing systems: Storm, Spark-Streaming, etc.
Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.

Responsibilities for Data Engineer

Create and maintain optimal data pipeline architecture,
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer

Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.

Data Engineer

at Mobile Programming LLC

1 video

34 recruiters

Posted by Apurva kalsotra

Mohali, Gurugram, Bengaluru (Bangalore), Chennai, Hyderabad, Pune

3 - 8 yrs

₹3L - ₹9L / yr

Data Warehouse (DWH)

Big Data

Spark

Apache Kafka

Data engineering

+14 more

Day-to-day Activities
Develop complex queries, pipelines and software programs to solve analytics and data mining problems
Work with other data scientists, product managers, and engineers to understand business problems and technical requirements, and to deliver predictive and smart data solutions
Prototype new applications or data systems
Lead data investigations to troubleshoot data issues that arise along the data pipelines
Collaborate with different product owners to incorporate data science solutions
Maintain and improve the data science platform
Must Have
BS/MS/PhD in Computer Science, Electrical Engineering or related disciplines
Strong fundamentals: data structures, algorithms, databases
5+ years of software industry experience, including 2+ years in analytics, data mining, and/or data warehousing
Fluency with Python
Experience developing web services using REST approaches.
Proficiency with SQL/Unix/Shell
Experience in DevOps (CI/CD, Docker, Kubernetes)
Self-driven, challenge-loving, and detail-oriented, with team spirit, excellent communication skills, and the ability to multitask and manage expectations
Preferred
Industry experience with big data processing technologies such as Spark and Kafka
Experience with machine learning algorithms and/or R a plus
Experience in Java/Scala a plus
Experience with an MPP analytics engine such as Vertica
Experience with data integration tools like Pentaho/SAP Analytics Cloud
