PySpark Jobs in Ahmedabad

Apply to 7+ PySpark Jobs in Ahmedabad on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

Lead Data Engineer (Snowflake)

at Kanerika Software

Posted by Ariba Khan

Hyderabad, Indore, Ahmedabad

7 - 10 yrs

Up to ₹35L / yr (Varies)

Snowflake schema

Python

SQL

Databricks

PySpark

About Kanerika:

Kanerika Inc. is a premier global software products and services firm that specializes in providing innovative solutions and services for data-driven enterprises. Our focus is to empower businesses to achieve their digital transformation goals and maximize their business impact through the effective use of data and AI.

We leverage cutting-edge technologies in data analytics, data governance, AI-ML, GenAI/ LLM and industry best practices to deliver custom solutions that help organizations optimize their operations, enhance customer experiences, and drive growth.

Awards and Recognitions:

Kanerika has won several awards over the years, including:

1. Best Place to Work 2023 by Great Place to Work®

2. Top 10 Most Recommended RPA Start-Ups in 2022 by RPA Today

3. NASSCOM Emerge 50 Award in 2014

4. Frost & Sullivan India 2021 Technology Innovation Award for its Kompass composable solution architecture

5. Kanerika has also been recognized for its commitment to customer privacy and data security, having achieved ISO 27701, SOC2, and GDPR compliance.

Working for us:

Kanerika is rated 4.6/5 on Glassdoor, for many good reasons. We truly value our employees' growth, well-being, and diversity, and people’s experiences bear this out. At Kanerika, we offer a host of enticing benefits that create an environment where you can thrive both personally and professionally. From our inclusive hiring practices and mandatory training on creating a safe work environment to our flexible working hours and generous parental leave, we prioritize the well-being and success of our employees.

Our commitment to professional development is evident through our mentorship programs, job training initiatives, and support for professional certifications. Additionally, our company-sponsored outings and various time-off benefits ensure a healthy work-life balance. Join us at Kanerika and become part of a vibrant and diverse community where your talents are recognized, your growth is nurtured, and your contributions make a real impact. See the benefits section below for the perks you’ll get while working for Kanerika.

Role Responsibilities:

Your high-level responsibilities include, but are not limited to:

Design, develop, and implement modern data pipelines, data models, and ETL/ELT processes.
Architect and optimize data lake and warehouse solutions using Microsoft Fabric, Databricks, or Snowflake.
Enable business analytics and self-service reporting through Power BI and other visualization tools.
Collaborate with data scientists, analysts, and business users to deliver reliable and high-performance data solutions.
Implement and enforce best practices for data governance, data quality, and security.
Mentor and guide junior data engineers; establish coding and design standards.
Evaluate emerging technologies and tools to continuously improve the data ecosystem.

Required Qualifications:

Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
7-10 years of experience in data engineering or data platform development.
Strong hands-on experience in SQL, Snowflake, Python, and Airflow.
Solid understanding of data modeling, data governance, security, and CI/CD practices.

Preferred Qualifications:

Experience in leading a team
Familiarity with data modeling techniques and practices for Power BI.
Knowledge of Azure Databricks or other data processing frameworks.
Knowledge of Microsoft Fabric or other Cloud Platforms.

What we need?

B.Tech in Computer Science or equivalent.

Why join us?

Work with a passionate and innovative team in a fast-paced, growth-oriented environment.
Gain hands-on experience in content marketing with exposure to real-world projects.
Opportunity to learn from experienced professionals and enhance your marketing skills.
Contribute to exciting initiatives and make an impact from day one.
Competitive stipend and potential for growth within the company.
Recognized for excellence in data and AI solutions with industry awards and accolades.

Employee Benefits:

1. Culture:

Open Door Policy: Encourages open communication and accessibility to management.
Open Office Floor Plan: Fosters a collaborative and interactive work environment.
Flexible Working Hours: Allows employees to have flexibility in their work schedules.
Employee Referral Bonus: Rewards employees for referring qualified candidates.
Appraisal Process Twice a Year: Provides regular performance evaluations and feedback.

2. Inclusivity and Diversity:

Hiring practices that promote diversity: Ensures a diverse and inclusive workforce.
Mandatory POSH training: Promotes a safe and respectful work environment.

3. Health Insurance and Wellness Benefits:

GMC and Term Insurance: Offers medical coverage and financial protection.
Health Insurance: Provides coverage for medical expenses.
Disability Insurance: Offers financial support in case of disability.

4. Child Care & Parental Leave Benefits:

Company-sponsored family events: Creates opportunities for employees and their families to bond.
Generous Parental Leave: Allows parents to take time off after the birth or adoption of a child.
Family Medical Leave: Offers leave for employees to take care of family members' medical needs.

5. Perks and Time-Off Benefits:

Company-sponsored outings: Organizes recreational activities for employees.
Gratuity: Provides a monetary benefit as a token of appreciation.
Provident Fund: Helps employees save for retirement.
Generous PTO: Offers more than the industry standard for paid time off.
Paid sick days: Allows employees to take paid time off when they are unwell.
Paid holidays: Gives employees paid time off for designated holidays.
Bereavement Leave: Provides time off for employees to grieve the loss of a loved one.

6. Professional Development Benefits:

L&D with FLEX- Enterprise Learning Repository: Provides access to a learning repository for professional development.
Mentorship Program: Offers guidance and support from experienced professionals.
Job Training: Provides training to enhance job-related skills.
Professional Certification Reimbursements: Assists employees in obtaining professional certifications.
Promote from Within: Encourages internal growth and advancement opportunities.

Senior Data Engineer

at Tecblic Private Limited

Posted by Priya Khatri

Ahmedabad

5 - 6 yrs

₹5L - ₹15L / yr

Windows Azure

Python

SQL

Data Warehouse (DWH)

Data modeling

+5 more

Job Description: Data Engineer

Location: Ahmedabad

Experience: 5 to 6 years

Employment Type: Full-Time

We are looking for a highly motivated and experienced Data Engineer to join our team. As a Data Engineer, you will play a critical role in designing, building, and optimizing data pipelines that ensure the availability, reliability, and performance of our data infrastructure. You will collaborate closely with data scientists, analysts, and cross-functional teams to provide timely and efficient data solutions.

Responsibilities

● Design and optimize data pipelines for various data sources

● Design and implement efficient data storage and retrieval mechanisms

● Develop data modelling solutions and data validation mechanisms

● Troubleshoot data-related issues and recommend process improvements

● Collaborate with data scientists and stakeholders to provide data-driven insights and solutions

● Coach and mentor junior data engineers in the team

Skills Required:

● Minimum 4 years of experience in data engineering or related field

● Proficient in designing and optimizing data pipelines and data modeling

● Strong programming expertise in Python

● Hands-on experience with big data technologies such as Hadoop, Spark, and Hive

● Extensive experience with cloud data services such as AWS, Azure, and GCP

● Advanced knowledge of database technologies like SQL, NoSQL, and data warehousing

● Knowledge of distributed computing and storage systems

● Familiarity with DevOps practices, Power Automate, and Microsoft Fabric will be an added advantage

● Strong analytical and problem-solving skills with outstanding communication and collaboration abilities

Qualifications

Bachelor's degree in Computer Science, Data Science, or a computer-related field

Data Engineer

at Janvi Panchal

Posted by Janvi Panchal

Ahmedabad

4 - 6 yrs

₹10L - ₹20L / yr

Python

PySpark

Microsoft Windows Azure

Amazon Web Services (AWS)

SQL

+1 more

Job Description:

4+ years of experience in a Data Engineer role.
Experience with object-oriented/object function scripting languages: Python, Scala, Golang, Java, etc.
Experience with Big Data tools such as Spark, Hadoop, Kafka, Airflow, and Hive.
Experience with streaming data: Spark, Kinesis, Kafka, Pub/Sub, or Event Hub.
Experience with GCP, Azure Data Factory, or AWS.
Strong in SQL scripting.
Experience with ETL tools.
Knowledge of Snowflake Data Warehouse.
Knowledge of orchestration frameworks: Airflow/Luigi.
Good to have knowledge of Data Quality Management frameworks
Good to have knowledge of Master Data Management
Self-learning abilities are a must
Familiarity with upcoming new technologies is a strong plus.
Should have a bachelor's degree in big data analytics, computer engineering, or a related field

Personal Competency:

Strong communication skills are a must
Self-motivated, detail-oriented
Strong organizational skills
Ability to prioritize workloads and meet deadlines

Data Engineer

at Tecblic Private Limited

Posted by HR HR

Ahmedabad

4 - 5 yrs

₹8L - ₹12L / yr

Microsoft Windows Azure

SQL

Python

PySpark

ETL

+2 more

🚀 We Are Hiring: Data Engineer | 4+ Years Experience 🚀

Job description

🔍 Job Title: Data Engineer

📍 Location: Ahmedabad

🚀 Work Mode: On-Site Opportunity

📅 Experience: 4+ Years

🕒 Employment Type: Full-Time

⏱️ Availability: Immediate Joiner Preferred

Join Our Team as a Data Engineer

We are seeking a passionate and experienced Data Engineer to be a part of our dynamic and forward-thinking team in Ahmedabad. This is an exciting opportunity for someone who thrives on transforming raw data into powerful insights and building scalable, high-performance data infrastructure.

As a Data Engineer, you will work closely with data scientists, analysts, and cross-functional teams to design robust data pipelines, optimize data systems, and enable data-driven decision-making across the organization.

Your Key Responsibilities

Architect, build, and maintain scalable and reliable data pipelines from diverse data sources.

Design effective data storage, retrieval mechanisms, and data models to support analytics and business needs.

Implement data validation, transformation, and quality monitoring processes.

Collaborate with cross-functional teams to deliver impactful, data-driven solutions.

Proactively identify bottlenecks and optimize existing workflows and processes.

Provide guidance and mentorship to junior engineers in the team.

Skills & Expertise We’re Looking For

3+ years of hands-on experience in Data Engineering or related roles.

Strong expertise in Python and data pipeline design.

Experience working with Big Data tools like Hadoop, Spark, Hive.

Proficiency with SQL, NoSQL databases, and data warehousing solutions.

Solid experience with cloud platforms, particularly Azure.

Familiar with distributed computing, data modeling, and performance tuning.

Understanding of DevOps, Power Automate, and Microsoft Fabric is a plus.

Strong analytical thinking, collaboration skills, excellent communication skills, and the ability to work independently or as part of a team.

Qualifications

Bachelor’s degree in Computer Science, Data Science, or a related field.

AWS Data Engineer (Contractual)

at Forward Eye Technologies

Posted by Jaya S

Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai

3 - 7 yrs

₹8L - ₹15L / yr

AWS Lambda

Amazon S3

Amazon VPC

Amazon EC2

Amazon Redshift

+3 more

Technical Skills:

Ability to understand and translate business requirements into design.
Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
Experience in creating ETL jobs using Python/PySpark.
Proficiency in creating AWS Lambda functions for event-based jobs.
Knowledge of automating ETL processes using AWS Step Functions.
Competence in building data warehouses and loading data into them.

Responsibilities:

Understand business requirements and translate them into design.
Assess AWS infrastructure needs for development work.
Develop ETL jobs using Python/PySpark to meet requirements.
Implement AWS Lambda for event-based tasks.
Automate ETL processes using AWS Step Functions.
Build data warehouses and manage data loading.
Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.

Data Engineer

consulting & implementation services in the area of Oil & Gas, Mining and Manufacturing Industry

Agency job

via Jobdost by Sathish Kumar

Ahmedabad, Hyderabad, Pune, Delhi

5 - 7 yrs

₹18L - ₹25L / yr

AWS Lambda

AWS Simple Notification Service (SNS)

AWS Simple Queuing Service (SQS)

Python

PySpark

+9 more

Data Engineer

Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON

Mandatory Requirements 

Experience in AWS Glue
Experience in Apache Parquet 
Proficient in AWS S3 and data lake 
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting languages: Python and PySpark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS 
Ingest data from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data from various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies.
Process and transform data using various technologies such as Spark and cloud services. You will need to understand your part of the business logic and implement it using the language supported by the base data platform.
Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations.
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
Identify and interpret trends and patterns from complex data sets 
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders. 
Key participant in regular Scrum ceremonies with the agile teams  
Proficient at developing queries, writing reports and presenting findings 
Mentor junior members and bring industry best practices

QUALIFICATIONS

5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge of at least one of these languages: Java, Scala, Python, C#
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
Proficient with:
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools. 
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines. 
Good written and oral communication skills and ability to present results to non-technical audiences 
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus: 

AWS certification
Spark Streaming 
Kafka Streaming / Kafka Connect 
ELK Stack 
Cassandra / MongoDB 
CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools

Data Engineer

Consulting and Services company

Agency job

via Jobdost by Sathish Kumar

Hyderabad, Ahmedabad

5 - 10 yrs

₹5L - ₹30L / yr

Amazon Web Services (AWS)

Apache

Python

PySpark

Data Engineer

Mandatory Requirements 

Experience in AWS Glue
Experience in Apache Parquet
Proficient in AWS S3 and data lake
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting languages: Python and PySpark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS 
Ingest data from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data from various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies.
Process and transform data using various technologies such as Spark and cloud services. You will need to understand your part of the business logic and implement it using the language supported by the base data platform.
Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations.
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
Identify and interpret trends and patterns from complex data sets
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
Key participant in regular Scrum ceremonies with the agile teams
Proficient at developing queries, writing reports and presenting findings
Mentor junior members and bring industry best practices

QUALIFICATIONS

5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge of at least one of these languages: Java, Scala, Python, C#
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
Proficient with:
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools.
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
Good written and oral communication skills and ability to present results to non-technical audiences 
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus: 

AWS certification
Spark Streaming 
Kafka Streaming / Kafka Connect 
ELK Stack 
Cassandra / MongoDB
CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools
