PySpark Jobs in Bangalore (Bengaluru)


Apply to 50+ PySpark Jobs in Bangalore (Bengaluru) on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Mumbai, Pune, Chennai, Gurugram
5.6 - 7 yrs
₹10L - ₹28L / yr
Amazon Web Services (AWS)
Python
PySpark
SQL

Job Summary:

As an AWS Data Engineer, you will be responsible for designing, developing, and maintaining scalable, high-performance data pipelines using AWS services. With 6+ years of experience, you’ll collaborate closely with data architects, analysts, and business stakeholders to build reliable, secure, and cost-efficient data infrastructure across the organization.

Key Responsibilities:

  • Design, develop, and manage scalable data pipelines using AWS Glue, Lambda, and other serverless technologies
  • Implement ETL workflows and transformation logic using PySpark and Python on AWS Glue (see the sketch after this list)
  • Leverage AWS Redshift for warehousing, performance tuning, and large-scale data queries
  • Work with AWS DMS and RDS for database integration and migration
  • Optimize data flows and system performance for speed and cost-effectiveness
  • Deploy and manage infrastructure using AWS CloudFormation templates
  • Collaborate with cross-functional teams to gather requirements and build robust data solutions
  • Ensure data integrity, quality, and security across all systems and processes
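
An illustrative sketch of the Glue/PySpark transformation work described above: a minimal Glue job script in which the database, table and bucket names are hypothetical placeholders.

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Glue passes the job name as a runtime argument
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw orders from the Glue Data Catalog (hypothetical database/table)
orders = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
).toDF()

# Example transformation: keep completed orders and aggregate daily revenue
daily_revenue = (
    orders.filter(F.col("status") == "COMPLETED")
    .groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"))
)

# Write the curated output as partitioned Parquet to S3 (hypothetical bucket)
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/daily_revenue/"
)

job.commit()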

Required Skills & Experience:

  • 6+ years of experience in Data Engineering with strong AWS expertise
  • Proficient in Python and PySpark for data processing and ETL development
  • Hands-on experience with AWS Glue, Lambda, DMS, RDS, and Redshift
  • Strong SQL skills for building complex queries and performing data analysis
  • Familiarity with AWS CloudFormation and infrastructure as code principles
  • Good understanding of serverless architecture and cost-optimized design
  • Ability to write clean, modular, and maintainable code
  • Strong analytical thinking and problem-solving skills


Read more
Bengaluru (Bangalore), Pune, Chennai
5 - 12 yrs
₹5L - ₹25L / yr
PySpark
Automation
SQL

Skill Name: ETL Automation Testing

Location: Bangalore, Chennai and Pune

Experience: 5+ Years


Required:

Experience in ETL Automation Testing

Strong experience in PySpark.

Read more
Wissen Technology

at Wissen Technology

4 recruiters
Vishakha Walunj
Posted by Vishakha Walunj
Bengaluru (Bangalore), Pune, Mumbai
7 - 12 yrs
Best in industry
PySpark
databricks
SQL
Python

Required Skills:

  • Hands-on experience with Databricks, PySpark
  • Proficiency in SQL, Python, and Spark.
  • Understanding of data warehousing concepts and data modeling.
  • Experience with CI/CD pipelines and version control (e.g., Git).
  • Fundamental knowledge of any cloud services, preferably Azure or GCP.


Good to Have:

  • BigQuery
  • Experience with performance tuning and data governance.


Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Pune, Bengaluru (Bangalore), Gurugram, Chennai, Mumbai
5 - 7 yrs
₹6L - ₹20L / yr
Amazon Web Services (AWS)
Amazon Redshift
AWS Glue
Python
PySpark

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Bengaluru (Bangalore), Pune, Mumbai, Chennai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
Python
PySpark
Amazon Web Services (AWS)
aws
Amazon Redshift
+1 more

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Mokshada Solanki
Posted by Mokshada Solanki
Bengaluru (Bangalore), Mumbai, Pune, Gurugram
4 - 5 yrs
₹4L - ₹20L / yr
SQL
Amazon Web Services (AWS)
Migration
PySpark
ETL

Job Summary:

Seeking a seasoned SQL + ETL Developer with 4+ years of experience in managing large-scale datasets and cloud-based data pipelines. The ideal candidate is hands-on with MySQL, PySpark, AWS Glue, and ETL workflows, with proven expertise in AWS migration and performance optimization.


Key Responsibilities:

  • Develop and optimize complex SQL queries and stored procedures to handle large datasets (100+ million records).
  • Build and maintain scalable ETL pipelines using AWS Glue and PySpark (see the sketch after this list).
  • Work on data migration tasks in AWS environments.
  • Monitor and improve database performance; automate key performance indicators and reports.
  • Collaborate with cross-functional teams to support data integration and delivery requirements.
  • Write shell scripts for automation and manage ETL jobs efficiently.
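
To illustrate the kind of pipeline described above, here is a minimal PySpark sketch that reads a large MySQL table in parallel over JDBC and lands it in S3; the host, credentials, table and bucket names are hypothetical, and the MySQL JDBC driver is assumed to be available on the cluster.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mysql_to_s3_etl").getOrCreate()

# Read a large MySQL table in parallel by splitting on a numeric key
# (host, table and bucket names below are hypothetical).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://example-host:3306/sales")
    .option("dbtable", "orders")
    .option("user", "etl_user")
    .option("password", "********")
    .option("partitionColumn", "order_id")
    .option("lowerBound", 1)
    .option("upperBound", 100_000_000)
    .option("numPartitions", 64)
    .load()
)

# Land the data as partitioned Parquet in S3 for downstream Glue/Redshift loads
(
    orders.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-raw-bucket/sales/orders/")
)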


Required Skills:

  • Strong experience with MySQL, complex SQL queries, and stored procedures.
  • Hands-on experience with AWS Glue, PySpark, and ETL processes.
  • Good understanding of AWS ecosystem and migration strategies.
  • Proficiency in shell scripting.
  • Strong communication and collaboration skills.


Nice to Have:

  • Working knowledge of Python.
  • Experience with AWS RDS.



Read more
Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Bengaluru (Bangalore), Pune, Chennai, Mumbai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
Amazon Web Services (AWS)
Python
PySpark
SQL
redshift

Profile: AWS Data Engineer

Mode- Hybrid

Experience- 5-7 years

Locations - Bengaluru, Pune, Chennai, Mumbai, Gurugram


Roles and Responsibilities

  • Design and maintain ETL pipelines using AWS Glue and Python/PySpark
  • Optimize SQL queries for Redshift and Athena
  • Develop Lambda functions for serverless data processing (see the sketch after this list)
  • Configure AWS DMS for database migration and replication
  • Implement infrastructure as code with CloudFormation
  • Build optimized data models for performance
  • Manage RDS databases and AWS service integrations
  • Troubleshoot and improve data processing efficiency
  • Gather requirements from business stakeholders
  • Implement data quality checks and validation
  • Document data pipelines and architecture
  • Monitor workflows and implement alerting
  • Keep current with AWS services and best practices
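
As an illustration of the serverless processing mentioned above, a minimal S3-triggered Lambda handler might look like the following sketch (the bucket contents and the processing step are placeholders):

import json
import urllib.parse

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    """Minimal S3-triggered Lambda: log the size of each newly arrived object.

    A real pipeline might validate the file or start a Glue job here; this is
    only a placeholder processing step.
    """
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        head = s3.head_object(Bucket=bucket, Key=key)
        print(json.dumps({"bucket": bucket, "key": key, "bytes": head["ContentLength"]}))
    return {"processed": len(event.get("Records", []))}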


Required Technical Expertise:

  • Python/PySpark for data processing
  • AWS Glue for ETL operations
  • Redshift and Athena for data querying
  • AWS Lambda and serverless architecture
  • AWS DMS and RDS management
  • CloudFormation for infrastructure
  • SQL optimization and performance tuning
Read more
Gruve
Reshika Mendiratta
Posted by Reshika Mendiratta
Bengaluru (Bangalore), Pune
5yrs+
Up to ₹50L / yr (varies)
Python
SQL
Data engineering
Apache Spark
PySpark
+6 more

About the Company:

Gruve is an innovative Software Services startup dedicated to empowering Enterprise Customers in managing their Data Life Cycle. We specialize in Cyber Security, Customer Experience, Infrastructure, and advanced technologies such as Machine Learning and Artificial Intelligence. Our mission is to assist our customers in their business strategies utilizing their data to make more intelligent decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.

 

Why Gruve:

At Gruve, we foster a culture of innovation, collaboration, and continuous learning. We are committed to building a diverse and inclusive workplace where everyone can thrive and contribute their best work. If you’re passionate about technology and eager to make an impact, we’d love to hear from you.

Gruve is an equal opportunity employer. We welcome applicants from all backgrounds and thank all who apply; however, only those selected for an interview will be contacted.

 

Position summary:

We are seeking a Senior Software Development Engineer – Data Engineering with 5-8 years of experience to design, develop, and optimize data pipelines and analytics workflows using Snowflake, Databricks, and Apache Spark. The ideal candidate will have a strong background in big data processing, cloud data platforms, and performance optimization to enable scalable data-driven solutions. 

Key Roles & Responsibilities:

  • Design, develop, and optimize ETL/ELT pipelines using Apache Spark, PySpark, Databricks, and Snowflake.
  • Implement real-time and batch data processing workflows in cloud environments (AWS, Azure, GCP).
  • Develop high-performance, scalable data pipelines for structured, semi-structured, and unstructured data.
  • Work with Delta Lake and Lakehouse architectures to improve data reliability and efficiency (see the sketch after this list).
  • Optimize Snowflake and Databricks performance, including query tuning, caching, partitioning, and cost optimization.
  • Implement data governance, security, and compliance best practices.
  • Build and maintain data models, transformations, and data marts for analytics and reporting.
  • Collaborate with data scientists, analysts, and business teams to define data engineering requirements.
  • Automate infrastructure and deployments using Terraform, Airflow, or dbt.
  • Monitor and troubleshoot data pipeline failures, performance issues, and bottlenecks.
  • Develop and enforce data quality and observability frameworks using Great Expectations, Monte Carlo, or similar tools.
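
A minimal sketch of the Delta Lake work referenced above: an idempotent upsert into a Silver table using the delta-spark API. The paths, table layout and the customer_id key are hypothetical, and the delta-spark package is assumed to be installed.

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta_upsert")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Incremental batch of customer updates (path is hypothetical)
updates = spark.read.parquet("s3://example-landing/customers/2024-06-01/")

# Upsert into the Silver Delta table so reruns stay idempotent
silver = DeltaTable.forPath(spark, "s3://example-lakehouse/silver/customers/")
(
    silver.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)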


Basic Qualifications:

  • Bachelor’s or Master’s Degree in Computer Science or Data Science.
  • 5–8 years of experience in data engineering, big data processing, and cloud-based data platforms.
  • Hands-on expertise in Apache Spark, PySpark, and distributed computing frameworks.
  • Strong experience with Snowflake (Warehouses, Streams, Tasks, Snowpipe, Query Optimization).
  • Experience in Databricks (Delta Lake, MLflow, SQL Analytics, Photon Engine).
  • Proficiency in SQL, Python, or Scala for data transformation and analytics.
  • Experience working with data lake architectures and storage formats (Parquet, Avro, ORC, Iceberg).
  • Hands-on experience with cloud data services (AWS Redshift, Azure Synapse, Google BigQuery).
  • Experience in workflow orchestration tools like Apache Airflow, Prefect, or Dagster.
  • Strong understanding of data governance, access control, and encryption strategies.
  • Experience with CI/CD for data pipelines using GitOps, Terraform, dbt, or similar technologies.


Preferred Qualifications:

  • Knowledge of streaming data processing (Apache Kafka, Flink, Kinesis, Pub/Sub).
  • Experience in BI and analytics tools (Tableau, Power BI, Looker).
  • Familiarity with data observability tools (Monte Carlo, Great Expectations).
  • Experience with machine learning feature engineering pipelines in Databricks.
  • Contributions to open-source data engineering projects.
Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Pune, Mumbai, Bengaluru (Bangalore), Chennai
4 - 7 yrs
₹5L - ₹15L / yr
Amazon Web Services (AWS)
Python
PySpark
Glue semantics
Amazon Redshift
+1 more

Job Overview:

We are seeking an experienced AWS Data Engineer to join our growing data team. The ideal candidate will have hands-on experience with AWS Glue, Redshift, PySpark, and other AWS services to build robust, scalable data pipelines. This role is perfect for someone passionate about data engineering, automation, and cloud-native development.

Key Responsibilities:

  • Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
  • Integrate data from diverse sources and ensure its quality, consistency, and reliability.
  • Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
  • Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
  • Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
  • Automate data validation, transformation, and loading processes to support real-time and batch data processing (see the sketch after this list).
  • Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.
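
As a simple illustration of the automated validation mentioned above, a PySpark check of this kind could gate a load before data is promoted downstream (the path and column names are hypothetical):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("validate_orders").getOrCreate()

# Hypothetical curated dataset to be checked before promotion
orders = spark.read.parquet("s3://example-curated-bucket/orders/")

# Basic quality checks on keys and amounts
null_keys = orders.filter(F.col("order_id").isNull()).count()
duplicate_keys = (
    orders.groupBy("order_id").count().filter(F.col("count") > 1).count()
)
negative_amounts = orders.filter(F.col("amount") < 0).count()

if null_keys or duplicate_keys or negative_amounts:
    raise ValueError(
        f"validation failed: {null_keys} null keys, "
        f"{duplicate_keys} duplicate keys, {negative_amounts} negative amounts"
    )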

Required Skills:

  • 5 to 7 years of hands-on experience in data engineering roles.
  • Strong proficiency in Python and PySpark for data transformation and scripting.
  • Deep understanding and practical experience with AWS Glue, AWS Redshift, S3, and other AWS data services.
  • Solid understanding of SQL and database optimization techniques.
  • Experience working with large-scale data pipelines and high-volume data environments.
  • Good knowledge of data modeling, warehousing, and performance tuning.

Preferred/Good to Have:

  • Experience with workflow orchestration tools like Airflow or Step Functions.
  • Familiarity with CI/CD for data pipelines.
  • Knowledge of data governance and security best practices on AWS.
Read more
Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Pune, Mumbai, Bengaluru (Bangalore), Gurugram
4 - 6 yrs
₹5L - ₹10L / yr
ETL
SQL
Amazon Web Services (AWS)
PySpark
KPI

Role - ETL Developer

Work Mode - Hybrid

Experience- 4+ years

Location - Pune, Gurgaon, Bengaluru, Mumbai

Required Skills - AWS, AWS Glue, PySpark, ETL, SQL

Required Skills:

  • 4+ years of hands-on experience in MySQL, including SQL queries and procedure development
  • Experience in PySpark, AWS, AWS Glue
  • Experience in AWS migration
  • Experience with automated scripting and tracking KPIs/metrics for database performance
  • Proficiency in shell scripting and ETL.
  • Strong communication skills and a collaborative team player
  • Knowledge of Python and AWS RDS is a plus


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Hanisha Pralayakaveri
Posted by Hanisha Pralayakaveri
Bengaluru (Bangalore), Mumbai
5 - 9 yrs
Best in industry
Python
Amazon Web Services (AWS)
PySpark
Data engineering

Job Description: Data Engineer 

Position Overview:

Role Overview

We are seeking a skilled Python Data Engineer with expertise in designing and implementing data solutions using the AWS cloud platform. The ideal candidate will be responsible for building and maintaining scalable, efficient, and secure data pipelines while leveraging Python and AWS services to enable robust data analytics and decision-making processes.

 

Key Responsibilities

· Design, develop, and optimize data pipelines using Python and AWS services such as Glue, Lambda, S3, EMR, Redshift, Athena, and Kinesis.

· Implement ETL/ELT processes to extract, transform, and load data from various sources into centralized repositories (e.g., data lakes or data warehouses).

· Collaborate with cross-functional teams to understand business requirements and translate them into scalable data solutions.

· Monitor, troubleshoot, and enhance data workflows for performance and cost optimization.

· Ensure data quality and consistency by implementing validation and governance practices.

· Work on data security best practices in compliance with organizational policies and regulations.

· Automate repetitive data engineering tasks using Python scripts and frameworks.

· Leverage CI/CD pipelines for deployment of data workflows on AWS.

Read more
ZeMoSo Technologies

at ZeMoSo Technologies

11 recruiters
Agency job
via TIGI HR Solution Pvt. Ltd. by Vaidehi Sarkar
Mumbai, Bengaluru (Bangalore), Hyderabad, Chennai, Pune
4 - 8 yrs
₹10L - ₹15L / yr
Data engineering
Python
SQL
Data Warehouse (DWH)
Amazon Web Services (AWS)
+3 more

Work Mode: Hybrid


Need B.Tech, BE, M.Tech, ME candidates - Mandatory



Must-Have Skills:

● Educational Qualification: B.Tech, BE, M.Tech, ME in any field.

● Minimum of 3 years of proven experience as a Data Engineer.

● Strong proficiency in Python programming language and SQL.

● Experience in DataBricks and setting up and managing data pipelines, data warehouses/lakes.

● Good comprehension and critical thinking skills.


● Kindly note that the salary bracket will vary according to the experience of the candidate -

- Experience from 4 yrs to 6 yrs - Salary upto 22 LPA

- Experience from 5 yrs to 8 yrs - Salary upto 30 LPA

- Experience more than 8 yrs - Salary upto 40 LPA

Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Mumbai, Pune, Hyderabad, Indore, Jaipur, Kolkata
4 - 5 yrs
₹2L - ₹18L / yr
Python
PySpark

We are looking for skilled and passionate Data Engineers with a strong foundation in Python programming and hands-on experience working with APIs, AWS cloud, and modern development practices. The ideal candidate will have a keen interest in building scalable backend systems and working with big data tools like PySpark.

Key Responsibilities:

  • Write clean, scalable, and efficient Python code.
  • Work with Python frameworks such as PySpark for data processing.
  • Design, develop, update, and maintain APIs (RESTful).
  • Deploy and manage code using GitHub CI/CD pipelines.
  • Collaborate with cross-functional teams to define, design, and ship new features.
  • Work on AWS cloud services for application deployment and infrastructure.
  • Basic database design and interaction with MySQL or DynamoDB.
  • Debugging and troubleshooting application issues and performance bottlenecks.

Required Skills & Qualifications:

  • 4+ years of hands-on experience with Python development.
  • Proficient in Python basics with a strong problem-solving approach.
  • Experience with AWS Cloud services (EC2, Lambda, S3, etc.).
  • Good understanding of API development and integration.
  • Knowledge of GitHub and CI/CD workflows.
  • Experience in working with PySpark or similar big data frameworks.
  • Basic knowledge of MySQL or DynamoDB.
  • Excellent communication skills and a team-oriented mindset.

Nice to Have:

  • Experience in containerization (Docker/Kubernetes).
  • Familiarity with Agile/Scrum methodologies.


Read more
Xebia IT Architects

at Xebia IT Architects

2 recruiters
Vijay S
Posted by Vijay S
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Chennai, Bhopal, Jaipur
10 - 15 yrs
₹30L - ₹40L / yr
Spark
Google Cloud Platform (GCP)
Python
Apache Airflow
PySpark
+1 more

We are looking for a Senior Data Engineer with strong expertise in GCP, Databricks, and Airflow to design and implement a GCP Cloud Native Data Processing Framework. The ideal candidate will work on building scalable data pipelines and help migrate existing workloads to a modern framework.


  • Shift: 2 PM – 11 PM
  • Work Mode: Hybrid (3 days a week) across Xebia locations
  • Notice Period: Immediate joiners or those with a notice period of up to 30 days


Key Responsibilities:

  • Design and implement a GCP Native Data Processing Framework leveraging Spark and GCP Cloud Services.
  • Develop and maintain data pipelines using Databricks and Airflow for transforming Raw → Silver → Gold data layers (see the sketch after this list).
  • Ensure data integrity, consistency, and availability across all systems.
  • Collaborate with data engineers, analysts, and stakeholders to optimize performance.
  • Document standards and best practices for data engineering workflows.
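
A minimal sketch of the orchestration described above, assuming Airflow 2.4+; the DAG name is hypothetical and each task would in practice trigger a Databricks or Spark job rather than just print.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def promote(layer: str) -> None:
    # Placeholder: a real task would submit a Databricks/Spark job for this layer
    print(f"promoting data into the {layer} layer")

with DAG(
    dag_id="raw_silver_gold_pipeline",   # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    raw_to_silver = PythonOperator(
        task_id="raw_to_silver", python_callable=promote, op_args=["silver"]
    )
    silver_to_gold = PythonOperator(
        task_id="silver_to_gold", python_callable=promote, op_args=["gold"]
    )
    raw_to_silver >> silver_to_gold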

Required Experience:


  • 7-8 years of experience in data engineering, architecture, and pipeline development.
  • Strong knowledge of GCP, Databricks, PySpark, and BigQuery.
  • Experience with Orchestration tools like Airflow, Dagster, or GCP equivalents.
  • Understanding of Data Lake table formats (Delta, Iceberg, etc.).
  • Proficiency in Python for scripting and automation.
  • Strong problem-solving skills and collaborative mindset.


⚠️ Please apply only if you have not applied recently or are not currently in the interview process for any open roles at Xebia.


Looking forward to your response!


Best regards,

Vijay S

Assistant Manager - TAG

https://www.linkedin.com/in/vijay-selvarajan/

Read more
Wissen Technology

at Wissen Technology

4 recruiters
Sukanya Mohan
Posted by Sukanya Mohan
Pune, Bengaluru (Bangalore)
5 - 10 yrs
Best in industry
Amazon Web Services (AWS)
EMR
Python
GLUE
SQL
+1 more

Greetings! Wissen Technology is hiring for the position of Data Engineer.

Please find the job description below for your reference:


JD

  • Design, develop, and maintain data pipelines on AWS EMR (Elastic MapReduce) to support data processing and analytics (see the sketch after this list).
  • Implement data ingestion processes from various sources including APIs, databases, and flat files.
  • Optimize and tune big data workflows for performance and scalability.
  • Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
  • Manage and monitor EMR clusters, ensuring high availability and reliability.
  • Develop ETL (Extract, Transform, Load) processes to cleanse, transform, and store data in data lakes and data warehouses.
  • Implement data security best practices to ensure data is protected and compliant with relevant regulations.
  • Create and maintain technical documentation related to data pipelines, workflows, and infrastructure.
  • Troubleshoot and resolve issues related to data processing and EMR cluster performance.
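
As an illustration of working with EMR programmatically, the following boto3 sketch submits a PySpark step to an existing cluster; the cluster ID, script path and bucket are hypothetical.

import boto3

emr = boto3.client("emr", region_name="us-east-1")

# Submit a PySpark step to an existing EMR cluster
# (cluster id, script path and bucket are hypothetical).
response = emr.add_job_flow_steps(
    JobFlowId="j-EXAMPLECLUSTER",
    Steps=[
        {
            "Name": "daily-ingest",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "spark-submit",
                    "--deploy-mode", "cluster",
                    "s3://example-artifacts/jobs/daily_ingest.py",
                ],
            },
        }
    ],
)
print(response["StepIds"])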

 

 

Qualifications:

 

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • 5+ years of experience in data engineering, with a focus on big data technologies.
  • Strong experience with AWS services, particularly EMR, S3, Redshift, Lambda, and Glue.
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Experience with big data frameworks and tools such as Hadoop, Spark, Hive, and Pig.
  • Solid understanding of data modeling, ETL processes, and data warehousing concepts.
  • Experience with SQL and NoSQL databases.
  • Familiarity with CI/CD pipelines and version control systems (e.g., Git).
  • Strong problem-solving skills and the ability to work independently and collaboratively in a team environment
Read more
codersbrain

at codersbrain

1 recruiter
Tanuj Uppal
Posted by Tanuj Uppal
Bengaluru (Bangalore)
10 - 15 yrs
₹10L - ₹15L / yr
Microsoft Windows Azure
Snowflake
Delivery Management
ETL
PySpark
+2 more
  • Sr. Solution Architect 
  • Job Location – Bangalore
  • Need candidates who can join in 15 days or less.
  • Overall, 12-15 years of experience.

 

Looking for this tech stack in a Sr. Solution Architect (who also has a Delivery Manager background). Someone who has heavy business and IT stakeholder collaboration and negotiation skills, someone who can provide thought leadership, collaborate in the development of Product roadmaps, influence decisions, negotiate effectively with business and IT stakeholders, etc.

 

  • Building data pipelines using Azure data tools and services (Azure Data Factory, Azure Databricks, Azure Functions, Spark, Azure Blob/ADLS, Azure SQL, Snowflake, etc.)
  • Administration of cloud infrastructure in public clouds such as Azure
  • Monitoring cloud infrastructure, applications, big data pipelines and ETL workflows
  • Managing outages, customer escalations, crisis management, and other similar circumstances.
  • Understanding of DevOps tools and environments like Azure DevOps, Jenkins, Git, Ansible, Terraform.
  • SQL, Spark SQL, Python, PySpark
  • Familiarity with agile software delivery methodologies
  • Proven experience collaborating with global Product Team members, including Business Stakeholders located in NA


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Sukanya Mohan
Posted by Sukanya Mohan
Bengaluru (Bangalore)
8 - 15 yrs
Best in industry
Snowflake schema
Python
PySpark
databricks

Responsibilities:

  • Lead the design, development, and implementation of scalable data architectures leveraging Snowflake, Python, PySpark, and Databricks (see the sketch after this list).
  • Collaborate with business stakeholders to understand requirements and translate them into technical specifications and data models.
  • Architect and optimize data pipelines for performance, reliability, and efficiency.
  • Ensure data quality, integrity, and security across all data processes and systems.
  • Provide technical leadership and mentorship to junior team members.
  • Stay abreast of industry trends and best practices in data architecture and analytics.
  • Drive innovation and continuous improvement in data management practices.
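
A minimal sketch of reading from Snowflake through Spark, assuming the spark-snowflake connector is available on the cluster; the account, credentials and table names are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("snowflake_read").getOrCreate()

# Snowflake connection options (account, credentials and objects are hypothetical;
# requires the spark-snowflake connector on the cluster).
sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",
    "sfUser": "etl_user",
    "sfPassword": "********",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ETL_WH",
}

# Push an aggregation down to Snowflake and bring back only the result
daily_sales = (
    spark.read.format("net.snowflake.spark.snowflake")
    .options(**sf_options)
    .option("query", "SELECT order_date, SUM(amount) AS revenue "
                     "FROM orders GROUP BY order_date")
    .load()
)
daily_sales.show()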

Requirements:

  • Bachelor's degree in Computer Science, Information Systems, or a related field. Master's degree preferred.
  • 5+ years of experience in data architecture, data engineering, or a related field.
  • Strong proficiency in Snowflake, including data modeling, performance tuning, and administration.
  • Expertise in Python and PySpark for data processing, manipulation, and analysis.
  • Hands-on experience with Databricks for building and managing data pipelines.
  • Proven leadership experience, with the ability to lead cross-functional teams and drive projects to successful completion.
  • Experience in the banking or insurance domain is highly desirable.
  • Excellent communication skills, with the ability to effectively collaborate with stakeholders at all levels of the organization.
  • Strong problem-solving and analytical skills, with a keen attention to detail.

Benefits:

  • Competitive salary and performance-based incentives.
  • Comprehensive benefits package, including health insurance, retirement plans, and wellness programs.
  • Flexible work arrangements, including remote options.
  • Opportunities for professional development and career advancement.
  • Dynamic and collaborative work environment with a focus on innovation and continuous learning.


Read more
Frisco Analytics Pvt Ltd
Cedrick Mariadas
Posted by Cedrick Mariadas
Bengaluru (Bangalore), Hyderabad
5 - 8 yrs
₹15L - ₹20L / yr
databricks
Apache Spark
Python
SQL
MySQL
+3 more

We are actively seeking a self-motivated Data Engineer with expertise in Azure cloud and Databricks, with a thorough understanding of Delta Lake and Lake-house Architecture. The ideal candidate should excel in developing scalable data solutions, crafting platform tools, and integrating systems, while demonstrating proficiency in cloud-native database solutions and distributed data processing.


Key Responsibilities:

  • Contribute to the development and upkeep of a scalable data platform, incorporating tools and frameworks that leverage Azure and Databricks capabilities.
  • Exhibit proficiency in various RDBMS databases such as MySQL and SQL-Server, emphasizing their integration in applications and pipeline development.
  • Design and maintain high-caliber code, including data pipelines and applications, utilizing Python, Scala, and PHP.
  • Implement effective data processing solutions via Apache Spark, optimizing Spark applications for large-scale data handling.
  • Optimize data storage using formats like Parquet and Delta Lake to ensure efficient data accessibility and reliable performance.
  • Demonstrate understanding of Hive Metastore, Unity Catalog Metastore, and the operational dynamics of external tables.
  • Collaborate with diverse teams to convert business requirements into precise technical specifications.

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or a related discipline.
  • Demonstrated hands-on experience with Azure cloud services and Databricks.
  • Proficient programming skills in Python, Scala, and PHP.
  • In-depth knowledge of SQL, NoSQL databases, and data warehousing principles.
  • Familiarity with distributed data processing and external table management.
  • Insight into enterprise data solutions for PIM, CDP, MDM, and ERP applications.
  • Exceptional problem-solving acumen and meticulous attention to detail.

Additional Qualifications :

  • Acquaintance with data security and privacy standards.
  • Experience in CI/CD pipelines and version control systems, notably Git.
  • Familiarity with Agile methodologies and DevOps practices.
  • Competence in technical writing for comprehensive documentation.


Read more
Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai
3 - 7 yrs
₹8L - ₹15L / yr
AWS Lambda
Amazon S3
Amazon VPC
Amazon EC2
Amazon Redshift
+3 more

Technical Skills:


  • Ability to understand and translate business requirements into design.
  • Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
  • Experience in creating ETL jobs using Python/PySpark.
  • Proficiency in creating AWS Lambda functions for event-based jobs.
  • Knowledge of automating ETL processes using AWS Step Functions.
  • Competence in building data warehouses and loading data into them.


Responsibilities:


  • Understand business requirements and translate them into design.
  • Assess AWS infrastructure needs for development work.
  • Develop ETL jobs using Python/PySpark to meet requirements.
  • Implement AWS Lambda for event-based tasks.
  • Automate ETL processes using AWS Step Functions (see the sketch after this list).
  • Build data warehouses and manage data loading.
  • Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.
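
As an illustration of the Step Functions automation mentioned above, a small boto3 sketch that starts a hypothetical ETL state machine for a given batch date:

import json

import boto3

sfn = boto3.client("stepfunctions", region_name="us-east-1")

# Kick off a (hypothetical) ETL state machine for a given batch date.
response = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:etl-pipeline",
    name="etl-2024-06-01",
    input=json.dumps({"batch_date": "2024-06-01"}),
)
print(response["executionArn"])
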
Read more
Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

As a Senior Associate in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As a Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for the data engineering solution. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration and wrangling, computation and analytics pipelines, and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.


Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 5+ years of IT experience with 3+ years in Data related technologies

2. Minimum 2.5 years of experience in Big Data technologies and working exposure to at least one cloud platform on related data services (AWS / Azure / GCP)

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines

4. Strong experience in at least one of the programming languages Java, Scala, Python; Java preferable

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery etc.

6. Well-versed and working knowledge of data platform related services on at least one cloud platform, IAM and data security


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Cloud data specialty and other related Big Data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


Read more
Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

As a Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As a Senior Associate L1 in Data Engineering, you will do technical design and implement components for the data engineering solution. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration and wrangling, computation and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is preferable.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 3.5+ years of IT experience with 1.5+ years in Data related technologies

2. Minimum 1.5 years of experience in Big Data technologies

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

4. Strong experience in at least one of the programming languages Java, Scala, Python; Java preferable

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery etc.


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Working knowledge of data platform related services on at least one cloud platform, IAM and data security

7. Cloud data specialty and other related Big Data technology certifications


Job Title: Senior Associate L1 – Data Engineering

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Read more
Arting Digital
Pragati Bhardwaj
Posted by Pragati Bhardwaj
Bengaluru (Bangalore)
10 - 16 yrs
₹10L - ₹15L / yr
databricks
Data modeling
SQL
Python
AWS Lambda
+2 more

Title: Lead Data Engineer


Experience: 10+ years

Budget: 32-36 LPA

Location: Bangalore

Work Mode: Work from office

Primary Skills: Databricks, Spark, PySpark, SQL, Python, AWS

Qualification: Any Engineering degree


Roles and Responsibilities:


• 8-10+ years' experience in developing scalable Big Data applications or solutions on distributed platforms.
• Able to partner with others in solving complex problems by taking a broad perspective to identify innovative solutions.
• Strong skills building positive relationships across Product and Engineering.
• Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders.
• Able to quickly pick up new programming languages, technologies, and frameworks.
• Experience working in Agile and Scrum development processes.
• Experience working in a fast-paced, results-oriented environment.
• Experience in Amazon Web Services (AWS), mainly S3, Managed Airflow, EMR/EC2, IAM etc.
• Experience working with Data Warehousing tools, including SQL database, Presto, and Snowflake.
• Experience architecting data products in Streaming, Serverless and Microservices architectures and platforms.
• Experience working with Data platforms, including EMR, Airflow, Databricks (Data Engineering & Delta Lake components, and Lakehouse Medallion architecture), etc.
• Experience with creating/configuring Jenkins pipelines for a smooth CI/CD process for Managed Spark jobs, building Docker images, etc.
• Experience working with distributed technology tools, including Spark, Python, Scala.
• Working knowledge of Data Warehousing, Data Modelling, Governance and Data Architecture.
• Working knowledge of Reporting & Analytical tools such as Tableau, QuickSight etc.
• Demonstrated experience in learning new technologies and skills.
• Bachelor's degree in Computer Science, Information Systems, Business, or other relevant subject area.

Read more
A LEADING US BASED MNC


Agency job
via Zeal Consultants by Zeal Consultants
Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
"DATA STREAMING"

Data Engineering : Senior Engineer / Manager


As a Senior Engineer / Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.


Must Have skills :


1. GCP


2. Spark streaming: Live data streaming experience is desired (see the sketch after this list).


3. Any one coding language: Java / Python / Scala
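
As a small illustration of the Spark streaming skill mentioned above, a minimal Structured Streaming sketch that reads from Kafka; the broker and topic are hypothetical and the spark-sql-kafka package is assumed to be available.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream_streaming").getOrCreate()

# Read a live event stream from Kafka (broker and topic are hypothetical)
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "clickstream")
    .load()
)

# Count events per minute and write the running aggregate to the console
counts = (
    events.selectExpr("CAST(value AS STRING) AS value", "timestamp")
    .withWatermark("timestamp", "5 minutes")
    .groupBy(F.window("timestamp", "1 minute"))
    .count()
)

query = counts.writeStream.outputMode("update").format("console").start()
query.awaitTermination()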



Skills & Experience :


- Overall experience of MINIMUM 5+ years with Minimum 4 years of relevant experience in Big Data technologies


- Hands-on experience with the Hadoop stack - HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.


- Strong experience in at least one of the programming languages Java, Scala, Python; Java preferable.


- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery etc.


- Well-versed and working knowledge with data platform related services on GCP


- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position


Your Impact :


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation

Read more
A fast growing Big Data company


Agency job
via Careerconnects by Kumar Narayanan
Noida, Bengaluru (Bangalore), Chennai, Hyderabad
6 - 8 yrs
₹10L - ₹15L / yr
AWS Glue
SQL
skill iconPython
PySpark
Data engineering
+6 more

AWS Glue Developer 

Work Experience: 6 to 8 Years

Work Location:  Noida, Bangalore, Chennai & Hyderabad

Must Have Skills: AWS Glue, DMS, SQL, Python, PySpark, Data Integration and DataOps

Job Reference ID: BT/F21/IND


Job Description:

Design, build and configure applications to meet business process and application requirements.


Responsibilities:

7 years of work experience with ETL, Data Modelling, and Data Architecture. Proficient in ETL optimization, designing, coding, and tuning big data processes using PySpark. Extensive experience building data platforms on AWS using core AWS services (Step Functions, EMR, Lambda, Glue, Athena, Redshift, Postgres, RDS, etc.) and designing/developing data engineering solutions. Orchestration using Airflow.


Technical Experience:

Hands-on experience in developing a data platform and its components: data lake, cloud data warehouse, APIs, and batch and streaming data pipelines. Experience with building data pipelines and applications to stream and process large datasets at low latencies.


➢ Enhancements, new development, defect resolution and production support of Big data ETL development using AWS native services.

➢ Create data pipeline architecture by designing and implementing data ingestion solutions.

➢ Integrate data sets using AWS services such as Glue, Lambda functions/ Airflow.

➢ Design and optimize data models on AWS Cloud using AWS data stores such as Redshift, RDS, S3, Athena.

➢ Author ETL processes using Python, PySpark.

➢ Build Redshift Spectrum direct transformations and data modelling using data in S3.

➢ ETL process monitoring using CloudWatch events.

➢ You will be working in collaboration with other teams; good communication is a must.

➢ Must have experience in using AWS service APIs, the AWS CLI and SDKs


Professional Attributes:

➢ Experience operating very large data warehouses or data lakes. Expert-level skills in writing and optimizing SQL. Extensive, real-world experience designing technology components for enterprise solutions and defining solution architectures and reference architectures with a focus on cloud technology.

➢ Must have 6+ years of big data ETL experience using Python, S3, Lambda, Dynamo DB, Athena, Glue in AWS environment.

➢ Expertise in S3, RDS, Redshift, Kinesis, EC2 clusters highly desired.


Qualification:

➢ Degree in Computer Science, Computer Engineering or equivalent.


Salary: Commensurate with experience and demonstrated competence

Read more
hopscotch
Bengaluru (Bangalore)
5 - 8 yrs
₹6L - ₹15L / yr
Python
Amazon Redshift
Amazon Web Services (AWS)
PySpark
Data engineering
+3 more

About the role:

Hopscotch is looking for a passionate Data Engineer to join our team. You will work closely with other teams like data analytics, marketing, data science and individual product teams to specify, validate, prototype, scale, and deploy data pipeline features and data architecture.


Here’s what will be expected out of you:

➢ Ability to work with a fast-paced startup mindset; should be able to manage all aspects of data extraction, transfer and load activities.

➢ Develop data pipelines that make data available across platforms.

➢ Should be comfortable in executing ETL (Extract, Transform and Load) processes which include data ingestion, data cleaning and curation into a data warehouse, database, or data platform.

➢ Work on various aspects of the AI/ML ecosystem – data modeling, data and ML pipelines.

➢ Work closely with Devops and senior Architect to come up with scalable system and model architectures for enabling real-time and batch services.


What we want:

➢ 5+ years of experience as a data engineer or data scientist with a focus on data engineering and ETL jobs.

➢ Well versed with the concept of Data warehousing, Data Modelling and/or Data Analysis.

➢ Experience using & building pipelines and performing ETL with industry-standard best practices on Redshift (more than 2+ years).

➢ Ability to troubleshoot and solve performance issues with data ingestion, data processing & query execution on Redshift.

➢ Good understanding of orchestration tools like Airflow.

 ➢ Strong Python and SQL coding skills.

➢ Strong Experience in distributed systems like spark.

➢ Experience with AWS Data and ML technologies (AWS Glue, MWAA, Data Pipeline, EMR, Athena, Redshift, Lambda, etc.).

➢ Solid hands-on experience with various data extraction techniques like CDC or time/batch-based extraction, and the related tools (Debezium, AWS DMS, Kafka Connect, etc.) for near-real-time and batch data extraction.


Note :

Experience with product-based or e-commerce companies is an added advantage.

Read more
Epik Solutions
Sakshi Sarraf
Posted by Sakshi Sarraf
Bengaluru (Bangalore), Noida
5 - 10 yrs
₹7L - ₹28L / yr
Python
SQL
databricks
Scala
Spark
+2 more

Job Description:


As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:


Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.
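
A minimal PySpark sketch of that ingest-transform-load pattern, assuming a Databricks cluster with access to ADLS; the storage account, container and table names are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Ingest raw CSV files from ADLS (abfss path is hypothetical)
raw = (
    spark.read.option("header", True)
    .csv("abfss://raw@examplestorage.dfs.core.windows.net/sales/")
)

# Transform: standardise column names and derive a load date
clean = (
    raw.withColumnRenamed("SalesAmount", "sales_amount")
    .withColumn("load_date", F.current_date())
)

# Load into a Delta table for downstream consumption
clean.write.format("delta").mode("append").saveAsTable("curated.sales")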


Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.


Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.


Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.


Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.


Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.


Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.


Skills and Qualifications:


Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.

Proficiency in designing and developing data pipelines and ETL processes.

Solid understanding of data modeling concepts and database design principles.

Familiarity with data integration and orchestration using Azure Data Factory.

Knowledge of data quality management and data governance practices.

Experience with performance tuning and optimization of data pipelines.

Strong problem-solving and troubleshooting skills related to data engineering.

Excellent collaboration and communication skills to work effectively in cross-functional teams.

Understanding of cloud computing principles and experience with Azure services.

Read more
Kloud9 Technologies
Bengaluru (Bangalore)
3 - 6 yrs
₹5L - ₹20L / yr
Amazon Web Services (AWS)
Amazon EMR
EMR
Spark
PySpark
+9 more

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.


What we are looking for:

● 3+ years’ experience developing Data & Analytic solutions

● Experience building data lake solutions leveraging one or more of the following: AWS EMR, S3, Hive & Spark

● Experience with relational SQL

● Experience with scripting languages such as Shell, Python

● Experience with source control tools such as GitHub and related dev process

● Experience with workflow scheduling tools such as Airflow

● In-depth knowledge of scalable cloud

● Has a passion for data solutions

● Strong understanding of data structures and algorithms

● Strong understanding of solution and technical design

● Has a strong problem-solving and analytical mindset

● Experience working with Agile Teams.

● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders

● Able to quickly pick up new programming languages, technologies, and frameworks

● Bachelor’s Degree in computer science


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates to deliver the best products and solutions to our customers.

Read more
Kloud9 Technologies
Bengaluru (Bangalore)
4 - 7 yrs
₹10L - ₹30L / yr
Google Cloud Platform (GCP)
PySpark
Python
Scala

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between e-commerce and the cloud. E-commerce in any industry is constrained by, and spends heavily on, physical data infrastructure.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to the retail industry, empowering our clients to take their business to the next level. Our team of proficient architects, engineers and developers has been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not only gives us a unique perspective on the cloud market but also ensures that we deliver the cloud solutions that best meet our clients' requirements.


●    Overall 8+ Years of Experience in Web Application development.

●    5+ Years of development experience with Java 8, Spring Boot, Microservices and middleware

●    3+ Years of Designing Middleware using Node JS platform.

●    Good to have: 2+ Years of Experience using NodeJS along with the AWS Serverless platform.

●    Good Experience with JavaScript/TypeScript, Event Loops, ExpressJS, GraphQL, SQL DB (MySQL), NoSQL DB (MongoDB) and YAML templates.

●    Good Experience with Test-Driven Development (TDD) and Automated Unit Testing.

●    Good Experience with exposing and consuming REST APIs on the Java 8 / Spring Boot platform and Swagger API contracts.

●    Good Experience in building NodeJS middleware performing Transformations, Routing, Aggregation, Orchestration and Authentication (JWT/OAuth).

●    Experience supporting and working with cross-functional teams in a dynamic environment.

●    Experience working in Agile Scrum Methodology.

●    Very good Problem-Solving Skills.

●    Very good learner and passion for technology.

●     Excellent verbal and written communication skills in English

●     Ability to communicate effectively with team members and business stakeholders


Secondary Skill Requirements:

 

● Experience working with any of Loopback, NestJS, Hapi.JS, Sails.JS, Passport.JS


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations across the US, London, Poland and Bengaluru, we help build your career path in cutting-edge technologies such as AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with its creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.

Tata Digital Pvt Ltd

Tata Digital Pvt Ltd

Agency job
via Seven N Half by Priya Singh
Bengaluru (Bangalore)
8 - 13 yrs
₹10L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

 

              Data Engineer

 

-          Highly skilled and proficient in the Azure data engineering tech stack (ADF, Databricks)

-          Well experienced in the design and development of big data integration platforms (Kafka, Hadoop).

-          Highly skilled and experienced in building medium-to-complex data integration pipelines for data at rest and streaming data using Spark (a minimal streaming sketch follows this list).

-          Strong knowledge in R/Python.

-          Advanced proficiency in solution design and implementation through Azure Data Lake, SQL and NoSQL Databases.

-          Strong in Data Warehousing concepts

-          Expertise in SQL, SQL tuning, Data Management (Data Security), schema design, Python and ETL processes

-          Highly Motivated, Self-Starter and quick learner

-          Must have good knowledge of data modelling and an understanding of data analytics

-          Exposure to Statistical procedures, Experiments and Machine Learning techniques is an added advantage.

-          Experience in leading a small team of 6–7 data engineers.

-          Excellent written and verbal communication skills
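As a rough illustration of the streaming requirement above, the sketch below reads a Kafka topic with Spark Structured Streaming and lands it in a data-lake path. Broker, topic and path names are placeholders, and the Kafka source assumes the spark-sql-kafka package is available on the cluster.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-stream-ingest").getOrCreate()

# Read a Kafka topic as a streaming DataFrame (broker and topic names are placeholders)
events = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "orders-events")
    .load())

# Kafka delivers key/value as binary; cast the value to string for downstream parsing
parsed = events.selectExpr("CAST(value AS STRING) AS json_value",
                           "timestamp AS event_ts")

# Write the stream to a data-lake path with checkpointing for fault tolerance
query = (parsed.writeStream
    .format("parquet")
    .option("path", "/mnt/datalake/orders_events/")
    .option("checkpointLocation", "/mnt/datalake/_checkpoints/orders_events/")
    .outputMode("append")
    .start())

query.awaitTermination()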

 

Top Management Consulting Company

Top Management Consulting Company

Agency job
Bengaluru (Bangalore), Gurugram
2 - 8 yrs
₹10L - ₹35L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
Python
+11 more
Greetings!!

We are looking for a Machine Learning engineer for one of our premium clients.
Experience: 2-9 years
Location: Gurgaon/Bangalore
Tech Stack:

Python, PySpark, and the Python scientific stack; MLflow, Grafana and Prometheus for machine learning pipeline management and monitoring; SQL, Airflow, Databricks, our own open-source data pipelining framework called Kedro, and Dask/RAPIDS; Django, GraphQL and ReactJS for horizontal product development; container technologies such as Docker and Kubernetes; CircleCI/Jenkins for CI/CD; and cloud solutions such as AWS, GCP and Azure, as well as Terraform and CloudFormation for deployment.
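Since the stack above includes MLflow for pipeline management and monitoring, here is a minimal, illustrative tracking sketch; the dataset, model and metric are placeholders rather than anything specific to the client.

import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real training set
X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="baseline-logreg"):
    model = LogisticRegression(max_iter=500)
    model.fit(X_train, y_train)

    acc = accuracy_score(y_test, model.predict(X_test))

    # Log parameters, metrics and the fitted model to the tracking server
    mlflow.log_param("max_iter", 500)
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")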
Aureus Tech Systems

at Aureus Tech Systems

3 recruiters
Naveen Yelleti
Posted by Naveen Yelleti
Kolkata, Hyderabad, Chennai, Bengaluru (Bangalore), Bhubaneswar, Visakhapatnam, Vijayawada, Trichur, Thiruvananthapuram, Mysore, Delhi, Noida, Gurugram, Nagpur
1 - 7 yrs
₹4L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Skills and requirements

  • Experience analyzing complex and varied data in a commercial or academic setting.
  • Desire to solve new and complex problems every day.
  • Excellent ability to communicate scientific results to both technical and non-technical team members.


Desirable

  • A degree in a numerically focused discipline such as Maths, Physics, Chemistry, Engineering or Biological Sciences.
  • Hands-on experience with Python, PySpark and SQL
  • Hands-on experience building end-to-end data pipelines.
  • Hands-on experience with Azure Data Factory, Azure Databricks and Data Lake is an added advantage.
  • Hands-on experience in building data pipelines.
  • Experience with big data tools: Hadoop, Hive, Sqoop, Spark, Spark SQL
  • Experience with SQL or NoSQL databases for the purposes of data retrieval and management.
  • Experience in data warehousing and business intelligence tools, techniques and technology, as well as experience in diving deep on data analysis or technical issues to come up with effective solutions.
  • BS degree in math, statistics, computer science or equivalent technical field.
  • Experience in data mining structured and unstructured data (SQL, ETL, data warehouse, Machine Learning etc.) in a business environment with large-scale, complex data sets.
  • Proven ability to look at solutions in unconventional ways. Sees opportunities to innovate and can lead the way.
  • Willing to learn and work on Data Science, ML, AI.
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 7.5 yrs
₹10L - ₹25L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Spark
Software deployment
+1 more
Job ID: ZS0701

Hi,

We are hiring for Data Scientist for Bangalore.

Req Skills:

  • NLP 
  • ML programming
  • Spark
  • Model Deployment
  • Experience processing unstructured data and building NLP models (a minimal Spark ML sketch follows this list)
  • Experience with big data tools such as PySpark
  • Pipeline orchestration using Airflow and model deployment experience is preferred
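As a small illustration of the NLP-on-Spark requirement noted in the list above, the sketch below builds a text-classification pipeline with pyspark.ml; the toy data and model path are illustrative only.

from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Tokenizer, HashingTF, IDF
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("nlp-pipeline").getOrCreate()

# Toy labelled text data for illustration
train = spark.createDataFrame(
    [("great product, works well", 1.0),
     ("terrible support, very slow", 0.0)],
    ["text", "label"],
)

# Tokenise -> hash term frequencies -> weight with IDF -> classify
tokenizer = Tokenizer(inputCol="text", outputCol="words")
tf = HashingTF(inputCol="words", outputCol="tf", numFeatures=1 << 16)
idf = IDF(inputCol="tf", outputCol="features")
lr = LogisticRegression(maxIter=20)

pipeline = Pipeline(stages=[tokenizer, tf, idf, lr])
model = pipeline.fit(train)

# The fitted PipelineModel can be persisted and later deployed
model.write().overwrite().save("/tmp/nlp_model")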
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 6 yrs
Best in industry
Python
PySpark
Data Science
Job ID: ZS070

Hi,

Enterprise Minds is looking for a Data Scientist.

Strong in Python and PySpark.

Immediate joiners preferred.
RedSeer Consulting

at RedSeer Consulting

2 recruiters
Raunak Swarnkar
Posted by Raunak Swarnkar
Bengaluru (Bangalore)
0 - 2 yrs
₹10L - ₹15L / yr
Python
PySpark
SQL
pandas
Cloud Computing
+2 more

BRIEF DESCRIPTION:

At least 1 year of Python, Spark, SQL and data engineering experience

Primary skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, Redshift/Snowflake

Relevant experience: legacy ETL job migration to AWS Glue using a Python & Spark combination

 

ROLE SCOPE:

Reverse engineer the existing/legacy ETL jobs

Create the workflow diagrams and review the logic diagrams with Tech Leads

Write equivalent logic in Python & Spark

Unit test the Glue jobs and certify the data loads before passing to system testing

Follow the best practices, enable appropriate audit & control mechanism

Be analytically skilled; identify root causes quickly and debug issues efficiently

Take ownership of the deliverables and support the deployments

 

REQUIREMENTS:

Create data pipelines for data integration into Cloud stacks eg. Azure Synapse

Code data processing jobs in Azure Synapse Analytics, Python, and Spark

Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.

Should be able to process .json, .parquet and .avro files (a minimal sketch follows)
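A minimal sketch of the file-format handling mentioned above: reading JSON and Parquet inputs and writing a consolidated Parquet output. The S3 paths are placeholders, and reading Avro additionally requires the external spark-avro package.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-handling").getOrCreate()

# Placeholder input locations
json_df = spark.read.json("s3://example-landing/events_json/")
parquet_df = spark.read.parquet("s3://example-landing/events_parquet/")
# Avro needs the external spark-avro package on the classpath:
# avro_df = spark.read.format("avro").load("s3://example-landing/events_avro/")

# Align schemas by column name and union the sources (Spark 3.1+)
combined = json_df.unionByName(parquet_df, allowMissingColumns=True)

combined.write.mode("overwrite").parquet("s3://example-curated/events/")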

 

PREFERRED BACKGROUND:

Tier 1/2 candidates from IITs/NITs/IIITs

However, relevant experience and a learning attitude take precedence

Top 3 Fintech Startup

Top 3 Fintech Startup

Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹16L - ₹24L / yr
SQL
Amazon Web Services (AWS)
Spark
PySpark
Apache Hive

We are looking for an exceptionally talented Lead Data Engineer with exposure to implementing AWS services to build data pipelines, API integrations and data warehouse designs. A candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor's degree in Science, Engineering or Applied Mathematics; a master's degree is preferred

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS etc

• Has extensive experience in the Spark ecosystem and has worked on both real-time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step Functions, Airflow, RDS, Aurora etc. (a minimal Airflow plus Glue sketch follows this list)

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team
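To illustrate the Glue-plus-Airflow combination referenced in the list above, here is a minimal sketch of an Airflow DAG that starts a Glue job through boto3. The job name, region and schedule are hypothetical.

from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator


def start_glue_job():
    # Kick off a (hypothetical) Glue job and return its run id
    glue = boto3.client("glue", region_name="ap-south-1")
    response = glue.start_job_run(JobName="example-orders-etl")
    return response["JobRunId"]


with DAG(
    dag_id="orders_etl_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_glue = PythonOperator(
        task_id="start_glue_job",
        python_callable=start_glue_job,
    )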

Fragma Data Systems

at Fragma Data Systems

8 recruiters

Vamsikrishna G
Posted by Vamsikrishna G
Bengaluru (Bangalore)
2 - 10 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+1 more
Job Description:

Must Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL (a minimal sketch follows this list)
• Good experience in SQL DBs – able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture: business-rules processing and data extraction from the data lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
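A minimal sketch of the PySpark DataFrame-plus-Spark-SQL style of ELT described above; the paths, table and column names are illustrative placeholders.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("elt-example").getOrCreate()

# Assume raw transactions are already loaded into the data lake (path is a placeholder)
txns = spark.read.parquet("/datalake/raw/transactions/")

# DataFrame API: apply business rules
valid = txns.filter(F.col("status") == "SETTLED").withColumn(
    "txn_month", F.date_format("txn_date", "yyyy-MM")
)

# Spark SQL: aggregate for downstream business consumption
valid.createOrReplaceTempView("valid_txns")
monthly = spark.sql("""
    SELECT txn_month, merchant_id, SUM(amount) AS total_amount
    FROM valid_txns
    GROUP BY txn_month, merchant_id
""")

monthly.write.mode("overwrite").parquet("/datalake/curated/monthly_merchant_totals/")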
Indium Software

at Indium Software

16 recruiters
Karunya P
Posted by Karunya P
Bengaluru (Bangalore), Hyderabad
1 - 9 yrs
₹1L - ₹15L / yr
SQL
Python
Hadoop
HiveQL
Spark
+1 more

Responsibilities:

 

* 3+ years of Data Engineering Experience - Design, develop, deliver and maintain data infrastructures.

SQL Specialist – strong knowledge of and seasoned experience with SQL queries

Languages: Python

* Good communicator, shows initiative, works well with stakeholders.

* Experience working closely with data analysts, providing the data they need and guiding them on issues.

* Solid ETL experience with Hadoop/Hive/PySpark/Presto/Spark SQL

* Solid communication and articulation skills

* Able to handle stakeholders independently with minimal intervention from the reporting manager.

* Develop strategies to solve problems in logical yet creative ways.

* Create custom reports and presentations accompanied by strong data visualization and storytelling

 

We would be excited if you have:

 

* Excellent communication and interpersonal skills

* Ability to meet deadlines and manage project delivery

* Excellent report-writing and presentation skills

* Critical thinking and problem-solving capabilities

Top 3 Fintech Startup

Top 3 Fintech Startup

Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹20L - ₹30L / yr
Amazon Web Services (AWS)
PySpark
SQL
Apache Spark
Python

We are looking for an exceptionally talented Lead Data Engineer with exposure to implementing AWS services to build data pipelines, API integrations and data warehouse designs. A candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor's degree in Science, Engineering or Applied Mathematics; a master's degree is preferred

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS etc

• Has extensive experience in the Spark ecosystem and has worked on both real-time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step functions, Airflow, RDS, Aurora etc.

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi (a rough Hudi write sketch follows this list)

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team
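For the S3/Apache Hudi data-lake point above, here is a rough sketch of an upsert-style write with Hudi. It assumes the Hudi Spark bundle is on the classpath, and the option keys are shown as commonly documented; treat both the options and the paths as illustrative rather than authoritative.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hudi-upsert").getOrCreate()

# Incremental changes staged elsewhere (placeholder path)
updates = spark.read.parquet("s3://example-staging/customers_delta/")

hudi_options = {
    "hoodie.table.name": "customers",
    "hoodie.datasource.write.recordkey.field": "customer_id",
    "hoodie.datasource.write.precombine.field": "updated_at",
    "hoodie.datasource.write.partitionpath.field": "country",
    "hoodie.datasource.write.operation": "upsert",
}

# Upsert the changes into the Hudi table on S3
(updates.write
    .format("hudi")
    .options(**hudi_options)
    .mode("append")
    .save("s3://example-lake/customers/"))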

Persistent System Ltd

Persistent System Ltd

Agency job
via Milestone Hr Consultancy by Haina khan
Pune, Bengaluru (Bangalore), Hyderabad
4 - 9 yrs
₹8L - ₹27L / yr
Python
PySpark
Amazon Web Services (AWS)
Spark
Scala
Greetings..

We have an urgent requirement for Data Engineer / Sr. Data Engineer roles at a reputed MNC.

Exp: 4-9yrs

Location: Pune/Bangalore/Hyderabad

Skills: We need candidates with either Python + AWS, PySpark + AWS, or Spark + Scala
Persistent Systems

at Persistent Systems

1 video
1 recruiter
Agency job
via Milestone Hr Consultancy by Haina khan
Pune, Bengaluru (Bangalore), Hyderabad, Nagpur
4 - 9 yrs
₹4L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+3 more
Greetings..

We have urgent requirements for Big Data Developer profiles at a reputed MNC.

Location: Pune/Bangalore/Hyderabad/Nagpur
Experience: 4-9yrs

Skills: PySpark + AWS, or Spark + Scala + AWS, or Python + AWS
Top 3 Fintech Startup

Top 3 Fintech Startup

Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
4 - 7 yrs
₹11L - ₹17L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Computer Vision
Python
+6 more
Responsible for leading a team of analysts to build and deploy predictive models that infuse core business functions with deep analytical insights. The Senior Data Scientist will also work closely with the Kinara management team to investigate strategically important business questions.

Lead a team through the entire analytical and machine learning model life cycle (a compact sketch follows the list):

 • Define the problem statement
 • Build and clean datasets
 • Exploratory data analysis
 • Feature engineering
 • Apply ML algorithms and assess the performance
 • Code for deployment
 • Code testing and troubleshooting
 • Communicate Analysis to Stakeholders
 • Manage Data Analysts and Data Scientists
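A compact, illustrative sketch of the life-cycle steps listed above (dataset preparation, feature engineering, model fitting and assessment) using scikit-learn; the data and features are synthetic placeholders.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Build and clean a small synthetic dataset (placeholder for real loan data)
df = pd.DataFrame({
    "loan_amount": [100, 250, 80, 300, 150, 90, 400, 120],
    "tenure_months": [12, 24, 6, 36, 12, 6, 48, 12],
    "on_time_ratio": [0.9, 0.6, 0.95, 0.4, 0.8, 0.99, 0.3, 0.85],
    "defaulted": [0, 1, 0, 1, 0, 0, 1, 0],
}).dropna()

# Simple feature engineering
df["amount_per_month"] = df["loan_amount"] / df["tenure_months"]

X = df[["loan_amount", "tenure_months", "on_time_ratio", "amount_per_month"]]
y = df["defaulted"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Apply an ML algorithm and assess its performance
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))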
Hiring for one of the MNC for India location

Hiring for one of the MNC for India location

Agency job
via Natalie Consultants by Rahul Kumar
Gurugram, Pune, Bengaluru (Bangalore), Delhi, Noida, Ghaziabad, Faridabad
2 - 9 yrs
₹8L - ₹20L / yr
Python
Hadoop
Big Data
Spark
Data engineering
+3 more

Key Responsibilities (Data Developer - Python, Spark)

Experience: 2 to 9 years

Development of data platforms, integration frameworks, processes, and code.

Develop and deliver APIs in Python or Scala for Business Intelligence applications built using a range of web languages

Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests (a minimal pytest sketch follows this list).

Elaborate stories in a collaborative agile environment (SCRUM or Kanban)

Familiarity with cloud platforms like GCP, AWS or Azure.

Experience with large data volumes.

Familiarity with writing rest-based services.

Experience with distributed processing and systems

Experience with Hadoop / Spark toolsets

Experience with relational database management systems (RDBMS)

Experience with Data Flow development

Knowledge of Agile and associated development techniques.
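To make the automated-testing expectation above concrete, here is a minimal, hypothetical pytest sketch for a small data-transformation function; the function and test names are invented for illustration.

# transform.py: a small, hypothetical transformation under test
def normalise_amounts(rows, fx_rate):
    """Convert each row's amount into the reporting currency."""
    return [{**row, "amount": round(row["amount"] * fx_rate, 2)} for row in rows]


# test_transform.py: pytest-style unit tests for the function above
import pytest


def test_normalise_amounts_applies_rate():
    rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 2.5}]
    result = normalise_amounts(rows, fx_rate=82.0)
    assert [r["amount"] for r in result] == pytest.approx([820.0, 205.0])


def test_normalise_amounts_handles_empty_input():
    assert normalise_amounts([], fx_rate=82.0) == []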

A global provider of Business Process Management company

A global provider of Business Process Management company

Agency job
via Jobdost by Saida Jabbar
Bengaluru (Bangalore), UK
5 - 10 yrs
₹15L - ₹25L / yr
Data Visualization
PowerBI
ADF
Business Intelligence (BI)
PySpark
+11 more

Power BI Developer

Senior visualization engineer with 5 years' experience in Power BI to develop and deliver solutions that enable delivery of information to audiences in support of key business processes. In addition, hands-on experience with Azure data services like ADF and Databricks is a must.

Ensure code and design quality through execution of test plans and assist in development of standards & guidelines working closely with internal and external design, business, and technical counterparts.

Candidates should have worked in agile development environments.

Desired Competencies:

  • Should have minimum of 3 years project experience using Power BI on Azure stack.
  • Should have good understanding and working knowledge of Data Warehouse and Data Modelling.
  • Good hands-on experience of Power BI
  • Hands-on experience with T-SQL / DAX / MDX / SSIS
  • Data Warehousing on SQL Server (preferably 2016)
  • Experience in Azure Data Services – ADF, Databricks & PySpark
  • Manage own workload with minimum supervision.
  • Take responsibility for projects or issues assigned to them
  • Be personable, flexible and a team player
  • Good written and verbal communications
  • Have a strong personality who will be able to operate directly with users
Greenway Health

at Greenway Health

2 recruiters
Agency job
via Vipsa Talent Solutions by Prashma S R
Bengaluru (Bangalore)
6 - 8 yrs
₹8L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+5 more
6–8 years of experience as a data engineer.
Skills: Spark, Hadoop, Big Data, data engineering, PySpark, Python, AWS Lambda, SQL, Kafka
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Harpreet kour
Posted by Harpreet kour
Bengaluru (Bangalore)
1 - 6 yrs
₹10L - ₹15L / yr
Data engineering
Big Data
PySpark
SQL
Python
Good experience in PySpark, including DataFrame core functions and Spark SQL
Good experience in SQL DBs – able to write queries of fair complexity.
Should have excellent experience in Big Data programming for data transformation and aggregations
Good at ELT architecture: business-rules processing and data extraction from the data lake into data streams for business consumption.
Good customer communication.
Good Analytical skills
Virtusa

at Virtusa

2 recruiters
Agency job
via Response Informatics by Anupama Lavanya Uppala
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr
PySpark
Python
  • Minimum 1 year of relevant experience in PySpark (mandatory)
  • Hands-on experience in developing, testing, deploying, maintaining and improving data integration pipelines in an AWS cloud environment is an added plus
  • Ability to play a lead role and independently manage a 3–5 member PySpark development team
  • EMR, Python and PySpark are mandatory.
  • Knowledge and awareness of working with AWS cloud technologies like Apache Spark, Glue, Kafka, Kinesis, and Lambda with S3, Redshift and RDS
UAE Client

UAE Client

Agency job
via Fragma Data Systems by Harpreet kour
Dubai, Bengaluru (Bangalore)
4 - 8 yrs
₹6L - ₹16L / yr
Data engineering
Data Engineer
Big Data
Big Data Engineer
Apache Spark
+3 more
• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance tuning with respect to executor sizing and other environment parameters, code optimization, partition tuning, etc. (a minimal tuning sketch follows at the end of this listing)
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.

Must Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience in SQL DBs – able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture: business-rules processing and data extraction from the data lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
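To make the performance-tuning responsibility above concrete, the sketch below shows executor sizing through Spark configuration and partition tuning on a DataFrame. The numbers and paths are illustrative and would normally be derived from the cluster size and data volume.

from pyspark.sql import SparkSession

# Illustrative executor sizing; real values depend on the cluster and workload
spark = (SparkSession.builder
    .appName("tuning-example")
    .config("spark.executor.instances", "10")
    .config("spark.executor.cores", "4")
    .config("spark.executor.memory", "8g")
    .config("spark.sql.shuffle.partitions", "200")
    .getOrCreate())

df = spark.read.parquet("/datalake/raw/events/")  # placeholder path

# Partition tuning: repartition before a wide transformation,
# then coalesce before writing to avoid many small output files
aggregated = (df.repartition(200, "customer_id")
                .groupBy("customer_id")
                .count())

aggregated.coalesce(20).write.mode("overwrite").parquet("/datalake/curated/event_counts/")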
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad
0 - 1 yrs
₹3L - ₹3.5L / yr
SQL
Data engineering
Data Engineer
Python
Big Data
+1 more
Strong Programmer with expertise in Python and SQL
 
● Hands-on work experience in SQL/PLSQL
● Expertise in at least one popular Python framework (like Django, Flask or Pyramid) (a minimal sketch follows this list)
● Knowledge of object-relational mapping (ORM)
● Familiarity with front-end technologies (like JavaScript and HTML5)
● Willingness to learn & upgrade to big data and cloud technologies like PySpark, Azure, etc.
● Team spirit
● Good problem-solving skills
● Write effective, scalable code
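Against the Python-framework and ORM points above, here is a minimal, illustrative Flask plus Flask-SQLAlchemy sketch; the model, route and database URI are hypothetical.

from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///example.db"
db = SQLAlchemy(app)


class Customer(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), nullable=False)


@app.route("/customers")
def list_customers():
    # ORM query instead of hand-written SQL
    customers = Customer.query.all()
    return jsonify([{"id": c.id, "name": c.name} for c in customers])


if __name__ == "__main__":
    with app.app_context():
        db.create_all()
    app.run(debug=True)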
Hammoq

at Hammoq

1 recruiter
Nikitha Muthuswamy
Posted by Nikitha Muthuswamy
Remote, Indore, Ujjain, Hyderabad, Bengaluru (Bangalore)
5 - 8 yrs
₹5L - ₹15L / yr
pandas
NumPy
Data engineering
Data Engineer
Apache Spark
+6 more
  • Does analytics to extract insights from the organization's raw historical data.
  • Generates usable training datasets for any/all MV projects with the help of annotators, if needed.
  • Analyses user trends and identifies their biggest bottlenecks in the Hammoq workflow.
  • Tests the short/long-term impact of productized MV models on those trends.
  • Skills - NumPy, Pandas, Spark, Apache Spark, PySpark, ETL mandatory.
Infogain
Agency job
via Technogen India PvtLtd by RAHUL BATTA
Bengaluru (Bangalore), Pune, Noida, NCR (Delhi | Gurgaon | Noida)
7 - 10 yrs
₹20L - ₹25L / yr
Data engineering
Python
SQL
Spark
PySpark
+10 more
  1. Sr. Data Engineer:

 Core Skills – Data Engineering, Big Data, PySpark, Spark SQL and Python

Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred

Major accountabilities:

  • Responsible for data engineering, Foundry data pipeline creation, Foundry analysis & reporting, Slate application development, re-usable code development & management, and integrating internal or external systems with Foundry for high-quality data ingestion.
  • Have a good understanding of the Foundry platform landscape and its capabilities
  • Performs data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
  • Defines company data assets (data models) and the PySpark / Spark SQL jobs that populate them (a generic sketch follows this list).
  • Designs data integrations and data quality framework.
  • Design & Implement integration with Internal, External Systems, F1 AWS platform using Foundry Data Connector or Magritte Agent
  • Collaboration with data scientists, data analysts and technology teams to document and leverage their understanding of the Foundry integration with different data sources; actively participate in agile work practices
  • Coordinating with the Quality Engineer to ensure that all quality controls, naming conventions & best practices have been followed
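To illustrate the kind of PySpark/Spark SQL job that populates a data model (see the accountability above), here is a generic sketch with placeholder dataset names; Foundry-specific APIs such as its transform decorators are deliberately omitted, so this is plain PySpark rather than Foundry code.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("populate-subject-model").getOrCreate()

# Placeholder inputs representing ingested source datasets
subjects = spark.read.parquet("/data/raw/subjects/")
visits = spark.read.parquet("/data/raw/visits/")

# Conform the raw data to the target data model: one row per subject
subject_model = (subjects.join(visits, "subject_id", "left")
    .groupBy("subject_id", "site_id")
    .agg(F.count("visit_id").alias("visit_count"),
         F.max("visit_date").alias("last_visit_date")))

subject_model.write.mode("overwrite").parquet("/data/model/subject_summary/")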

Desired Candidate Profile :

  • Strong data engineering background
  • Experience with Clinical Data Model is preferred
  • Experience in
    • SQL Server ,Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
    • Java and Groovy for our back-end applications and data integration tools
    • Python for data processing and analysis
    • Cloud infrastructure based on AWS EC2 and S3
  • 7+ years of IT experience, 2+ years' experience with the Palantir Foundry platform, 4+ years' experience with big data platforms
  • 5+ years of Python and Pyspark development experience
  • Strong troubleshooting and problem solving skills
  • BTech or master's degree in computer science or a related technical field
  • Experience designing, building, and maintaining big data pipelines systems
  • Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
  • Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
  • Hands-on in programming languages primarily Python, R, Java, Unix shell scripts
  • Hands-on experience with AWS / Azure cloud platforms and stacks
  • Strong in API based architecture and concept, able to do quick PoC using API integration and development
  • Knowledge of machine learning and AI
  • Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.

 Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision
