Cutshort logo
PySpark Jobs in Bangalore (Bengaluru)

50+ PySpark Jobs in Bangalore (Bengaluru) | PySpark Job openings in Bangalore (Bengaluru)

Apply to 50+ PySpark Jobs in Bangalore (Bengaluru) on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

icon
Fractal Analytics

at Fractal Analytics

5 recruiters
Eman Khan
Posted by Eman Khan
Bengaluru (Bangalore), Hyderabad, Gurugram, Noida, Mumbai, Pune, Coimbatore, Chennai
3 - 5 yrs
₹18L - ₹26L / yr
MLOps
MLFlow
kubeflow
skill iconMachine Learning (ML)
skill iconPython
+6 more

Building the machine learning production (or MLOps) is the biggest challenge most large companies currently have in making the transition to becoming an AI-driven organization. This position is an opportunity for an experienced, server-side developer to build expertise in this exciting new frontier. You will be part of a team deploying state-of-the-art AI solutions for Fractal clients.


Responsibilities

As MLOps Engineer, you will work collaboratively with Data Scientists and Data engineers to deploy and operate advanced analytics machine learning models. You’ll help automate and streamline Model development and Model operations. You’ll build and maintain tools for deployment, monitoring, and operations. You’ll also troubleshoot and resolve issues in development, testing, and production environments.

  • Enable Model tracking, model experimentation, Model automation
  • Develop ML pipelines to support
  • Develop MLOps components in Machine learning development life cycle using Model Repository (either of): MLFlow, Kubeflow Model Registry
  • Develop MLOps components in Machine learning development life cycle using Machine Learning Services (either of): Kubeflow, DataRobot, HopsWorks, Dataiku or any relevant ML E2E PaaS/SaaS
  • Work across all phases of Model development life cycle to build MLOPS components
  • Build the knowledge base required to deliver increasingly complex MLOPS projects on Azure
  • Be an integral part of client business development and delivery engagements across multiple domains


Required Qualifications

  • 3-5 years experience building production-quality software.
  • B.E/B.Tech/M.Tech in Computer Science or related technical degree OR Equivalent
  • Strong experience in System Integration, Application Development or Data Warehouse projects across technologies used in the enterprise space
  • Knowledge of MLOps, machine learning and docker
  • Object-oriented languages (e.g. Python, PySpark, Java, C#, C++)
  • CI/CD experience( i.e. Jenkins, Git hub action,
  • Database programming using any flavors of SQL
  • Knowledge of Git for Source code management
  • Ability to collaborate effectively with highly technical resources in a fast-paced environment
  • Ability to solve complex challenges/problems and rapidly deliver innovative solutions
  • Foundational Knowledge of Cloud Computing on Azure
  • Hunger and passion for learning new skills
Read more
Fractal Analytics

at Fractal Analytics

5 recruiters
Eman Khan
Posted by Eman Khan
Bengaluru (Bangalore), Gurugram, Mumbai, Hyderabad, Pune, Noida, Coimbatore, Chennai
5.5 - 9 yrs
₹25L - ₹38L / yr
MLOps
MLFlow
kubeflow
skill iconMachine Learning (ML)
skill iconPython
+6 more

Building the machine learning production System(or MLOps) is the biggest challenge most large companies currently have in making the transition to becoming an AI-driven organization. This position is an opportunity for an experienced, server-side developer to build expertise in this exciting new frontier. You will be part of a team deploying state-ofthe-art AI solutions for Fractal clients.


Responsibilities

As MLOps Engineer, you will work collaboratively with Data Scientists and Data engineers to deploy and operate advanced analytics machine learning models. You’ll help automate and streamline Model development and Model operations. You’ll build and maintain tools for deployment, monitoring, and operations. You’ll also troubleshoot and resolve issues in development, testing, and production environments.

  • Enable Model tracking, model experimentation, Model automation
  • Develop scalable ML pipelines
  • Develop MLOps components in Machine learning development life cycle using Model Repository (either of): MLFlow, Kubeflow Model Registry
  • Machine Learning Services (either of): Kubeflow, DataRobot, HopsWorks, Dataiku or any relevant ML E2E PaaS/SaaS
  • Work across all phases of Model development life cycle to build MLOPS components
  • Build the knowledge base required to deliver increasingly complex MLOPS projects on Azure
  • Be an integral part of client business development and delivery engagements across multiple domains


Required Qualifications

  • 5.5-9 years experience building production-quality software
  • B.E/B.Tech/M.Tech in Computer Science or related technical degree OR equivalent
  • Strong experience in System Integration, Application Development or Datawarehouse projects across technologies used in the enterprise space
  • Expertise in MLOps, machine learning and docker
  • Object-oriented languages (e.g. Python, PySpark, Java, C#, C++)
  • Experience developing CI/CD components for production ready ML pipeline.
  • Database programming using any flavors of SQL
  • Knowledge of Git for Source code management
  • Ability to collaborate effectively with highly technical resources in a fast-paced environment
  • Ability to solve complex challenges/problems and rapidly deliver innovative solutions
  • Team handling, problem solving, project management and communication skills & creative thinking
  • Foundational Knowledge of Cloud Computing on Azure
  • Hunger and passion for learning new skills
Read more
a leading data & analytics intelligence technology solutions provider to companies that value insights from information as a competitive advantage

a leading data & analytics intelligence technology solutions provider to companies that value insights from information as a competitive advantage

Agency job
via HyrHub by Neha Koshy
Bengaluru (Bangalore)
7 - 10 yrs
₹10L - ₹30L / yr
skill iconData Science
skill iconMachine Learning (ML)
MLOps
skill iconPython
PySpark

What are we looking for?

  • Bachelor’s degree in analytics related area (Data Science, Computer Engineering, Computer Science, Information Systems, Engineering, or a related discipline)
  • 7+ years of work experience in data science, analytics, engineering, or product management for a diverse range of projects
  • Hands on experience with python, deploying ML models
  • Hands on experience with time-wise tracking of model performance, and diagnosis of data/model drift
  • Familiarity with Dataiku, or other data-science-enabling tools (Sagemaker, etc)
  • Demonstrated familiarity with distributed computing frameworks (snowpark, pyspark)
  • Experience working with various types of data (structured / unstructured)
  • Deep understanding of all data science phases (eg. Data engineering, EDA, Machine Learning, MLOps, Serving)
  • Highly self-motivated to deliver both independently and with strong team collaboration
  • Ability to creatively take on new challenges and work outside comfort zone
  • Strong English communication skills (written & verbal)


Roles & Responsibilities:

  • Clearly articulates expectations, capabilities, and action plans; actively listens with others’ frame of reference in mind; appropriately shares information with team; favorably influences people without direct authority
  • Clearly articulates scope and deliverables of projects; breaks complex initiatives into detailed component parts and sequences actions appropriately; develops action plans and monitors progress independently; designs success criteria and uses them to track outcomes; drives implementation of recommendations when appropriate, engages with stakeholders throughout to ensure buy-in
  • Manages projects with and through others; shares responsibility and credit; develops self and others through teamwork; comfortable providing guidance and sharing expertise with others to help them develop their skills and perform at their best; helps others take appropriate risks; communicates frequently with team members earning respect and trust of the team
  • Experience in translating business priorities and vision into product/platform thinking, set clear directives to a group of team members with diverse skillsets, while providing functional & technical guidance and SME support
  • Demonstrated experience interfacing with internal and external teams to develop innovative data science solutions
  • Strong business analysis, product design, and product management skills
  • Ability to work in a collaborative environment - reviewing peers' code, contributing to problem- solving sessions, ability to communicate technical knowledge to a a variety of audience (such as management, brand teams, data engineering teams, etc.)
  • Ability to articulate model performance to a non-technical crowd, and ability to select appropriate evaluation criteria to evaluate hidden confounders and biases within a model
  • MLOps frameworks and their use in model tracking and deployment, and automating the model serving pipeline
  • Work with all sizes of ML model from linear/logistic regression to other sklearn-like models, to deep learning
  • Formulate training schema for unbiased model training e.g. K-fold cross-validation, leave-one- out cross-validation) for parameter searching and model tuning
  • Ability to work on machine learning work like recommender systems, end-to-end ML lifecycle)
  • Ability to manage ML on largely imbalanced training sets (<5% positive rate)
  • Ability to articulate model performance to a non-technical crowd, and ability to select appropriate evaluation criteria to evaluate hidden confounders and biases within a model
  • MLOps frameworks and their use in model tracking and deployment, and automating the model serving pipeline
Read more
Wissen Technology

at Wissen Technology

4 recruiters
Sukanya Mohan
Posted by Sukanya Mohan
Pune, Bengaluru (Bangalore)
5 - 10 yrs
Best in industry
skill iconAmazon Web Services (AWS)
EMR
skill iconPython
GLUE
SQL
+1 more

Greetings , Wissen Technology is Hiring for the position of Data Engineer

Please find the Job Description for your Reference:


JD

  • Design, develop, and maintain data pipelines on AWS EMR (Elastic MapReduce) to support data processing and analytics.
  • Implement data ingestion processes from various sources including APIs, databases, and flat files.
  • Optimize and tune big data workflows for performance and scalability.
  • Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
  • Manage and monitor EMR clusters, ensuring high availability and reliability.
  • Develop ETL (Extract, Transform, Load) processes to cleanse, transform, and store data in data lakes and data warehouses.
  • Implement data security best practices to ensure data is protected and compliant with relevant regulations.
  • Create and maintain technical documentation related to data pipelines, workflows, and infrastructure.
  • Troubleshoot and resolve issues related to data processing and EMR cluster performance.

 

 

Qualifications:

 

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • 5+ years of experience in data engineering, with a focus on big data technologies.
  • Strong experience with AWS services, particularly EMR, S3, Redshift, Lambda, and Glue.
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Experience with big data frameworks and tools such as Hadoop, Spark, Hive, and Pig.
  • Solid understanding of data modeling, ETL processes, and data warehousing concepts.
  • Experience with SQL and NoSQL databases.
  • Familiarity with CI/CD pipelines and version control systems (e.g., Git).
  • Strong problem-solving skills and the ability to work independently and collaboratively in a team environment
Read more
codersbrain

at codersbrain

1 recruiter
Tanuj Uppal
Posted by Tanuj Uppal
Bengaluru (Bangalore)
10 - 15 yrs
₹10L - ₹15L / yr
Microsoft Windows Azure
Snowflake
Delivery Management
ETL
PySpark
+2 more
  • Sr. Solution Architect 
  • Job Location – Bangalore
  • Need candidates who can join in 15 days or less.
  • Overall, 12-15 years of experience.

 

Looking for this tech stack in a Sr. Solution Architect (who also has a Delivery Manager background). Someone who has heavy business and IT stakeholder collaboration and negotiation skills, someone who can provide thought leadership, collaborate in the development of Product roadmaps, influence decisions, negotiate effectively with business and IT stakeholders, etc.

 

  • Building data pipelines using Azure data tools and services (Azure Data Factory, Azure Databricks, Azure Function, Spark, Azure Blob/ADLS, Azure SQL, Snowflake..)
  • Administration of cloud infrastructure in public clouds such as Azure
  • Monitoring cloud infrastructure, applications, big data pipelines and ETL workflows
  • Managing outages, customer escalations, crisis management, and other similar circumstances.
  • Understanding of DevOps tools and environments like Azure DevOps, Jenkins, Git, Ansible, Terraform.
  • SQL, Spark SQL, Python, PySpark
  • Familiarity with agile software delivery methodologies
  • Proven experience collaborating with global Product Team members, including Business Stakeholders located in NA


Read more
Wissen Technology

at Wissen Technology

4 recruiters
Sukanya Mohan
Posted by Sukanya Mohan
Bengaluru (Bangalore)
8 - 15 yrs
Best in industry
Snow flake schema
skill iconPython
PySpark
databricks

Responsibilities:

  • Lead the design, development, and implementation of scalable data architectures leveraging Snowflake, Python, PySpark, and Databricks.
  • Collaborate with business stakeholders to understand requirements and translate them into technical specifications and data models.
  • Architect and optimize data pipelines for performance, reliability, and efficiency.
  • Ensure data quality, integrity, and security across all data processes and systems.
  • Provide technical leadership and mentorship to junior team members.
  • Stay abreast of industry trends and best practices in data architecture and analytics.
  • Drive innovation and continuous improvement in data management practices.

Requirements:

  • Bachelor's degree in Computer Science, Information Systems, or a related field. Master's degree preferred.
  • 5+ years of experience in data architecture, data engineering, or a related field.
  • Strong proficiency in Snowflake, including data modeling, performance tuning, and administration.
  • Expertise in Python and PySpark for data processing, manipulation, and analysis.
  • Hands-on experience with Databricks for building and managing data pipelines.
  • Proven leadership experience, with the ability to lead cross-functional teams and drive projects to successful completion.
  • Experience in the banking or insurance domain is highly desirable.
  • Excellent communication skills, with the ability to effectively collaborate with stakeholders at all levels of the organization.
  • Strong problem-solving and analytical skills, with a keen attention to detail.

Benefits:

  • Competitive salary and performance-based incentives.
  • Comprehensive benefits package, including health insurance, retirement plans, and wellness programs.
  • Flexible work arrangements, including remote options.
  • Opportunities for professional development and career advancement.
  • Dynamic and collaborative work environment with a focus on innovation and continuous learning.


Read more
Frisco Analytics Pvt Ltd
Cedrick Mariadas
Posted by Cedrick Mariadas
Bengaluru (Bangalore), Hyderabad
5 - 8 yrs
₹15L - ₹20L / yr
databricks
Apache Spark
skill iconPython
SQL
MySQL
+3 more

We are actively seeking a self-motivated Data Engineer with expertise in Azure cloud and Databricks, with a thorough understanding of Delta Lake and Lake-house Architecture. The ideal candidate should excel in developing scalable data solutions, crafting platform tools, and integrating systems, while demonstrating proficiency in cloud-native database solutions and distributed data processing.


Key Responsibilities:

  • Contribute to the development and upkeep of a scalable data platform, incorporating tools and frameworks that leverage Azure and Databricks capabilities.
  • Exhibit proficiency in various RDBMS databases such as MySQL and SQL-Server, emphasizing their integration in applications and pipeline development.
  • Design and maintain high-caliber code, including data pipelines and applications, utilizing Python, Scala, and PHP.
  • Implement effective data processing solutions via Apache Spark, optimizing Spark applications for large-scale data handling.
  • Optimize data storage using formats like Parquet and Delta Lake to ensure efficient data accessibility and reliable performance.
  • Demonstrate understanding of Hive Metastore, Unity Catalog Metastore, and the operational dynamics of external tables.
  • Collaborate with diverse teams to convert business requirements into precise technical specifications.

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or a related discipline.
  • Demonstrated hands-on experience with Azure cloud services and Databricks.
  • Proficient programming skills in Python, Scala, and PHP.
  • In-depth knowledge of SQL, NoSQL databases, and data warehousing principles.
  • Familiarity with distributed data processing and external table management.
  • Insight into enterprise data solutions for PIM, CDP, MDM, and ERP applications.
  • Exceptional problem-solving acumen and meticulous attention to detail.

Additional Qualifications :

  • Acquaintance with data security and privacy standards.
  • Experience in CI/CD pipelines and version control systems, notably Git.
  • Familiarity with Agile methodologies and DevOps practices.
  • Competence in technical writing for comprehensive documentation.


Read more
Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai
3 - 7 yrs
₹8L - ₹15L / yr
AWS Lambda
Amazon S3
Amazon VPC
Amazon EC2
Amazon Redshift
+3 more

Technical Skills:


  • Ability to understand and translate business requirements into design.
  • Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
  • Experience in creating ETL jobs using Python/PySpark.
  • Proficiency in creating AWS Lambda functions for event-based jobs.
  • Knowledge of automating ETL processes using AWS Step Functions.
  • Competence in building data warehouses and loading data into them.


Responsibilities:


  • Understand business requirements and translate them into design.
  • Assess AWS infrastructure needs for development work.
  • Develop ETL jobs using Python/PySpark to meet requirements.
  • Implement AWS Lambda for event-based tasks.
  • Automate ETL processes using AWS Step Functions.
  • Build data warehouses and manage data loading.
  • Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.
Read more
Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

The Senior Associate People Senior Associate L1 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution 

.

Job Summary:

As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution

The role requires a hands-on technologist who has strong programming background like Java / Scala / Python, should have experience in Data Ingestion, Integration and data Wrangling, Computation, Analytics pipelines and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge on at least one of AWS, GCP, Azure cloud platforms.


Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 5+ years of IT experience with 3+ years in Data related technologies

2.Minimum 2.5 years of experience in Big Data technologies and working exposure in at least one cloud platform on related data services (AWS / Azure / GCP)

3.Hands-on experience with the Hadoop stack – HDFS, sqoop, kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, hive, oozie, airflow and other components required in building end to end data pipeline.

4.Strong experience in at least of the programming language Java, Scala, Python. Java preferable

5.Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc

6.Well-versed and working knowledge with data platform related services on at least 1 cloud platform, IAM and data security


Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge on distributed messaging frameworks like ActiveMQ / RabbiMQ / Solace, search & indexing and Micro services architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Cloud data specialty and other related Big data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


Read more
Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

The Senior Associate People Senior Associate L1 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution 

.

Job Summary:

As Senior Associate L1 in Data Engineering, you will do technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution

The role requires a hands-on technologist who has strong programming background like Java / Scala / Python, should have experience in Data Ingestion, Integration and data Wrangling, Computation, Analytics pipelines and exposure to Hadoop ecosystem components. Having hands-on knowledge on at least one of AWS, GCP, Azure cloud platforms will be preferable.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 3.5+ years of IT experience with 1.5+ years in Data related technologies

2.Minimum 1.5 years of experience in Big Data technologies

3.Hands-on experience with the Hadoop stack – HDFS, sqoop, kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, hive, oozie, airflow and other components required in building end to end data pipeline. Working knowledge on real-time data pipelines is added advantage.

4.Strong experience in at least of the programming language Java, Scala, Python. Java preferable

5.Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc


Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge on distributed messaging frameworks like ActiveMQ / RabbiMQ / Solace, search & indexing and Micro services architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Working knowledge with data platform related services on at least 1 cloud platform, IAM and data security

7.Cloud data specialty and other related Big data technology certifications


Job Title: Senior Associate L1 – Data Engineering

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Read more
Arting Digital
Pragati Bhardwaj
Posted by Pragati Bhardwaj
Bengaluru (Bangalore)
10 - 16 yrs
₹10L - ₹15L / yr
databricks
Data modeling
SQL
skill iconPython
AWS Lambda
+2 more

Title:- Lead Data Engineer 


Experience: 10+y

Budget: 32-36 LPA

Location: Bangalore 

Work of Mode: Work from office

Primary Skills: Data Bricks, Spark, Pyspark,Sql, Python, AWS

Qualification: Any Engineering degree


Roles and Responsibilities:


• 8 - 10+ years’ experience in developing scalable Big Data applications or solutions on

 distributed platforms.

• Able to partner with others in solving complex problems by taking a broad

 perspective to identify.

• innovative solutions.

• Strong skills building positive relationships across Product and Engineering.

• Able to influence and communicate effectively, both verbally and written, with team

  members and business stakeholders

• Able to quickly pick up new programming languages, technologies, and frameworks.

• Experience working in Agile and Scrum development process.

• Experience working in a fast-paced, results-oriented environment.

• Experience in Amazon Web Services (AWS) mainly S3, Managed Airflow, EMR/ EC2,

  IAM etc.

• Experience working with Data Warehousing tools, including SQL database, Presto,

  and Snowflake

• Experience architecting data product in Streaming, Serverless and Microservices

  Architecture and platform.

• Experience working with Data platforms, including EMR, Airflow, Databricks (Data

  Engineering &amp; Delta

• Lake components, and Lakehouse Medallion architecture), etc.

• Experience with creating/ configuring Jenkins pipeline for smooth CI/CD process for

  Managed Spark jobs, build Docker images, etc.

• Experience working with distributed technology tools, including Spark, Python, Scala

• Working knowledge of Data warehousing, Data modelling, Governance and Data

  Architecture

• Working knowledge of Reporting &amp; Analytical tools such as Tableau, Quicksite

  etc.

• Demonstrated experience in learning new technologies and skills.

• Bachelor’s degree in computer science, Information Systems, Business, or other

  relevant subject area

Read more
A LEADING US BASED MNC

A LEADING US BASED MNC

Agency job
via Zeal Consultants by Zeal Consultants
Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
"DATA STREAMING"

Data Engineering : Senior Engineer / Manager


As Senior Engineer/ Manager in Data Engineering, you will translate client requirements into technical design, and implement components for a data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution.


Must Have skills :


1. GCP


2. Spark streaming : Live data streaming experience is desired.


3. Any 1 coding language: Java/Pyhton /Scala



Skills & Experience :


- Overall experience of MINIMUM 5+ years with Minimum 4 years of relevant experience in Big Data technologies


- Hands-on experience with the Hadoop stack - HDFS, sqoop, kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, hive, oozie, airflow and other components required in building end to end data pipeline. Working knowledge on real-time data pipelines is added advantage.


- Strong experience in at least of the programming language Java, Scala, Python. Java preferable


- Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc.


- Well-versed and working knowledge with data platform related services on GCP


- Bachelor's degree and year of work experience of 6 to 12 years or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position


Your Impact :


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation

Read more
A fast growing Big Data company

A fast growing Big Data company

Agency job
via Careerconnects by Kumar Narayanan
Noida, Bengaluru (Bangalore), Chennai, Hyderabad
6 - 8 yrs
₹10L - ₹15L / yr
AWS Glue
SQL
skill iconPython
PySpark
Data engineering
+6 more

AWS Glue Developer 

Work Experience: 6 to 8 Years

Work Location:  Noida, Bangalore, Chennai & Hyderabad

Must Have Skills: AWS Glue, DMS, SQL, Python, PySpark, Data integrations and Data Ops, 

Job Reference ID:BT/F21/IND


Job Description:

Design, build and configure applications to meet business process and application requirements.


Responsibilities:

7 years of work experience with ETL, Data Modelling, and Data Architecture Proficient in ETL optimization, designing, coding, and tuning big data processes using Pyspark Extensive experience to build data platforms on AWS using core AWS services Step function, EMR, Lambda, Glue and Athena, Redshift, Postgres, RDS etc and design/develop data engineering solutions. Orchestrate using Airflow.


Technical Experience:

Hands-on experience on developing Data platform and its components Data Lake, cloud Datawarehouse, APIs, Batch and streaming data pipeline Experience with building data pipelines and applications to stream and process large datasets at low latencies.


➢ Enhancements, new development, defect resolution and production support of Big data ETL development using AWS native services.

➢ Create data pipeline architecture by designing and implementing data ingestion solutions.

➢ Integrate data sets using AWS services such as Glue, Lambda functions/ Airflow.

➢ Design and optimize data models on AWS Cloud using AWS data stores such as Redshift, RDS, S3, Athena.

➢ Author ETL processes using Python, Pyspark.

➢ Build Redshift Spectrum direct transformations and data modelling using data in S3.

➢ ETL process monitoring using CloudWatch events.

➢ You will be working in collaboration with other teams. Good communication must.

➢ Must have experience in using AWS services API, AWS CLI and SDK


Professional Attributes:

➢ Experience operating very large data warehouses or data lakes Expert-level skills in writing and optimizing SQL Extensive, real-world experience designing technology components for enterprise solutions and defining solution architectures and reference architectures with a focus on cloud technology.

➢ Must have 6+ years of big data ETL experience using Python, S3, Lambda, Dynamo DB, Athena, Glue in AWS environment.

➢ Expertise in S3, RDS, Redshift, Kinesis, EC2 clusters highly desired.


Qualification:

➢ Degree in Computer Science, Computer Engineering or equivalent.


Salary: Commensurate with experience and demonstrated competence

Read more
hopscotch
Bengaluru (Bangalore)
5 - 8 yrs
₹6L - ₹15L / yr
skill iconPython
Amazon Redshift
skill iconAmazon Web Services (AWS)
PySpark
Data engineering
+3 more

About the role:

 Hopscotch is looking for a passionate Data Engineer to join our team. You will work closely with other teams like data analytics, marketing, data science and individual product teams to specify, validate, prototype, scale, and deploy data pipelines features and data architecture.


Here’s what will be expected out of you:

➢ Ability to work in a fast-paced startup mindset. Should be able to manage all aspects of data extraction transfer and load activities.

➢ Develop data pipelines that make data available across platforms.

➢ Should be comfortable in executing ETL (Extract, Transform and Load) processes which include data ingestion, data cleaning and curation into a data warehouse, database, or data platform.

➢ Work on various aspects of the AI/ML ecosystem – data modeling, data and ML pipelines.

➢ Work closely with Devops and senior Architect to come up with scalable system and model architectures for enabling real-time and batch services.


What we want:

➢ 5+ years of experience as a data engineer or data scientist with a focus on data engineering and ETL jobs.

➢ Well versed with the concept of Data warehousing, Data Modelling and/or Data Analysis.

➢ Experience using & building pipelines and performing ETL with industry-standard best practices on Redshift (more than 2+ years).

➢ Ability to troubleshoot and solve performance issues with data ingestion, data processing & query execution on Redshift.

➢ Good understanding of orchestration tools like Airflow.

 ➢ Strong Python and SQL coding skills.

➢ Strong Experience in distributed systems like spark.

➢ Experience with AWS Data and ML Technologies (AWS Glue,MWAA, Data Pipeline,EMR,Athena, Redshift,Lambda etc).

➢ Solid hands on with various data extraction techniques like CDC or Time/batch based and the related tools (Debezium, AWS DMS, Kafka Connect, etc) for near real time and batch data extraction.


Note :

Product based companies, Ecommerce companies is added advantage

Read more
Epik Solutions
Sakshi Sarraf
Posted by Sakshi Sarraf
Bengaluru (Bangalore), Noida
5 - 10 yrs
₹7L - ₹28L / yr
skill iconPython
SQL
databricks
skill iconScala
Spark
+2 more

Job Description:


As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:


Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.


Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.


Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.


Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.


Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.


Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.


Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.


Skills and Qualifications:


Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.

Proficiency in designing and developing data pipelines and ETL processes.

Solid understanding of data modeling concepts and database design principles.

Familiarity with data integration and orchestration using Azure Data Factory.

Knowledge of data quality management and data governance practices.

Experience with performance tuning and optimization of data pipelines.

Strong problem-solving and troubleshooting skills related to data engineering.

Excellent collaboration and communication skills to work effectively in cross-functional teams.

Understanding of cloud computing principles and experience with Azure services.

Read more
Kloud9 Technologies
Bengaluru (Bangalore)
3 - 6 yrs
₹5L - ₹20L / yr
skill iconAmazon Web Services (AWS)
Amazon EMR
EMR
Spark
PySpark
+9 more

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.


What we are looking for:

● 3+ years’ experience developing Data &amp; Analytic solutions

● Experience building data lake solutions leveraging one or more of the following AWS, EMR, S3, Hive&amp; Spark

● Experience with relational SQL

● Experience with scripting languages such as Shell, Python

● Experience with source control tools such as GitHub and related dev process

● Experience with workflow scheduling tools such as Airflow

● In-depth knowledge of scalable cloud

● Has a passion for data solutions

● Strong understanding of data structures and algorithms

● Strong understanding of solution and technical design

● Has a strong problem-solving and analytical mindset

● Experience working with Agile Teams.

● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders

● Able to quickly pick up new programming languages, technologies, and frameworks

● Bachelor’s Degree in computer science


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates to deliver the best products and solutions to our customers.

Read more
Kloud9 Technologies
Bengaluru (Bangalore)
4 - 7 yrs
₹10L - ₹30L / yr
Google Cloud Platform (GCP)
PySpark
skill iconPython
skill iconScala

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.


●    Overall 8+ Years of Experience in Web Application development.

●    5+ Years of development experience with JAVA8 , Springboot, Microservices and middleware

●    3+ Years of Designing Middleware using Node JS platform.

●    good to have 2+ Years of Experience in using NodeJS along with AWS Serverless platform.

●    Good Experience with Javascript / TypeScript, Event Loops, ExpressJS, GraphQL, SQL DB (MySQLDB), NoSQL DB(MongoDB) and YAML templates.

●    Good Experience with TDD Driven Development and Automated Unit Testing.

●    Good Experience with exposing and consuming Rest APIs in Java 8, Springboot platform and Swagger API contracts.

●    Good Experience in building NodeJS middleware performing Transformations, Routing, Aggregation, Orchestration and Authentication(JWT/OAUTH).

●    Experience supporting and working with cross-functional teams in a dynamic environment.

●    Experience working in Agile Scrum Methodology.

●    Very good Problem-Solving Skills.

●    Very good learner and passion for technology.

●     Excellent verbal and written communication skills in English

●     Ability to communicate effectively with team members and business stakeholders


Secondary Skill Requirements:

 

● Experience working with any of Loopback, NestJS, Hapi.JS, Sails.JS, Passport.JS


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates to deliver the best products and solutions to our customers.

Read more
Tata Digital Pvt Ltd

Tata Digital Pvt Ltd

Agency job
via Seven N Half by Priya Singh
Bengaluru (Bangalore)
8 - 13 yrs
₹10L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

 

              Data Engineer

 

-          High Skilled and proficient on Azure Data Engineering Tech stacks (ADF, Databricks)

-          Should be well experienced in design and development of Big data integration platform (Kafka, Hadoop).

-          Highly skilled and experienced in building medium to complex data integration pipelines for Data at Rest and streaming data using Spark.

-          Strong knowledge in R/Python.

-          Advanced proficiency in solution design and implementation through Azure Data Lake, SQL and NoSQL Databases.

-          Strong in Data Warehousing concepts

-          Expertise in SQL, SQL tuning, Data Management (Data Security), schema design, Python and ETL processes

-          Highly Motivated, Self-Starter and quick learner

-          Must have Good knowledge on Data modelling and understating of Data analytics

-          Exposure to Statistical procedures, Experiments and Machine Learning techniques is an added advantage.

-          Experience in leading small team of 6/7 Data Engineers.

-          Excellent written and verbal communication skills

 

Read more
Top Management Consulting Company

Top Management Consulting Company

Agency job
Bengaluru (Bangalore), Gurugram
2 - 8 yrs
₹10L - ₹35L / yr
skill iconData Science
skill iconMachine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
skill iconPython
+11 more
Greetings!!

We are looking for a Machine Learning engineer for on of our premium client.
Experience: 2-9 years
Location: Gurgaon/Bangalore
Tech Stack:

Python, PySpark, the Python Scientific Stack; MLFlow, Grafana, Prometheus for machine learning pipeline management and monitoring; SQL, Airflow, Databricks, our own open-source data pipelining framework called Kedro, Dask/RAPIDS; Django, GraphQL and ReactJS for horizontal product development; container technologies such as Docker and Kubernetes, CircleCI/Jenkins for CI/CD, cloud solutions such as AWS, GCP, and Azure as well as Terraform and Cloudformation for deployment
Read more
Aureus Tech Systems

at Aureus Tech Systems

3 recruiters
Naveen Yelleti
Posted by Naveen Yelleti
Kolkata, Hyderabad, Chennai, Bengaluru (Bangalore), Bhubaneswar, Visakhapatnam, Vijayawada, Trichur, Thiruvananthapuram, Mysore, Delhi, Noida, Gurugram, Nagpur
1 - 7 yrs
₹4L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Skills and requirements

  • Experience analyzing complex and varied data in a commercial or academic setting.
  • Desire to solve new and complex problems every day.
  • Excellent ability to communicate scientific results to both technical and non-technical team members.


Desirable

  • A degree in a numerically focused discipline such as, Maths, Physics, Chemistry, Engineering or Biological Sciences..
  • Hands on experience on Python, Pyspark, SQL
  • Hands on experience on building End to End Data Pipelines.
  • Hands on Experience on Azure Data Factory, Azure Data Bricks, Data Lake - added advantage
  • Hands on Experience in building data pipelines.
  • Experience with Bigdata Tools, Hadoop, Hive, Sqoop, Spark, SparkSQL
  • Experience with SQL or NoSQL databases for the purposes of data retrieval and management.
  • Experience in data warehousing and business intelligence tools, techniques and technology, as well as experience in diving deep on data analysis or technical issues to come up with effective solutions.
  • BS degree in math, statistics, computer science or equivalent technical field.
  • Experience in data mining structured and unstructured data (SQL, ETL, data warehouse, Machine Learning etc.) in a business environment with large-scale, complex data sets.
  • Proven ability to look at solutions in unconventional ways. Sees opportunities to innovate and can lead the way.
  • Willing to learn and work on Data Science, ML, AI.
Read more
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 7.5 yrs
₹10L - ₹25L / yr
skill iconMachine Learning (ML)
skill iconData Science
Natural Language Processing (NLP)
Spark
Software deployment
+1 more
Job ID: ZS0701

Hi,

We are hiring for Data Scientist for Bangalore.

Req Skills:

  • NLP 
  • ML programming
  • Spark
  • Model Deployment
  • Experience processing unstructured data and building NLP models
  • Experience with big data tools pyspark
  • Pipeline orchestration using Airflow and model deployment experience is preferred
Read more
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 6 yrs
Best in industry
skill iconPython
PySpark
skill iconData Science
Job ID: ZS070

Hi,

Enterprise minds is looking for Data Scientist. 

Strong in Python,Pyspark.

Prefer immediate joiners
Read more
RedSeer Consulting

at RedSeer Consulting

2 recruiters
Raunak Swarnkar
Posted by Raunak Swarnkar
Bengaluru (Bangalore)
0 - 2 yrs
₹10L - ₹15L / yr
skill iconPython
PySpark
SQL
pandas
Cloud Computing
+2 more

BRIEF DESCRIPTION:

At-least 1 year of Python, Spark, SQL, data engineering experience

Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, RedShift/Snowflake

Relevant Experience: Legacy ETL job Migration to AWS Glue / Python & Spark combination

 

ROLE SCOPE:

Reverse engineer the existing/legacy ETL jobs

Create the workflow diagrams and review the logic diagrams with Tech Leads

Write equivalent logic in Python & Spark

Unit test the Glue jobs and certify the data loads before passing to system testing

Follow the best practices, enable appropriate audit & control mechanism

Analytically skillful, identify the root causes quickly and efficiently debug issues

Take ownership of the deliverables and support the deployments

 

REQUIREMENTS:

Create data pipelines for data integration into Cloud stacks eg. Azure Synapse

Code data processing jobs in Azure Synapse Analytics, Python, and Spark

Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.

Should be able to process .json, .parquet and .avro files

 

PREFERRED BACKGROUND:

Tier1/2 candidates from IIT/NIT/IIITs

However, relevant experience, learning attitude takes precedence

Read more
Top 3 Fintech Startup

Top 3 Fintech Startup

Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹16L - ₹24L / yr
SQL
skill iconAmazon Web Services (AWS)
Spark
PySpark
Apache Hive

We are looking for an exceptionally talented Lead data engineer who has exposure in implementing AWS services to build data pipelines, api integration and designing data warehouse. Candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor’s degree in Science, Engineering, Applied Mathematics. Preferred Masters degree

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS etc

• Has extensive experience in the real-timeSpark ecosystem and has worked on both real time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step functions, Airflow, RDS, Aurora etc.

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team

Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters

Vamsikrishna G
Posted by Vamsikrishna G
Bengaluru (Bangalore)
2 - 10 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+1 more
Job Description:

Must Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
Read more
Indium Software

at Indium Software

16 recruiters
Karunya P
Posted by Karunya P
Bengaluru (Bangalore), Hyderabad
1 - 9 yrs
₹1L - ₹15L / yr
SQL
skill iconPython
Hadoop
HiveQL
Spark
+1 more

Responsibilities:

 

* 3+ years of Data Engineering Experience - Design, develop, deliver and maintain data infrastructures.

SQL Specialist – Strong knowledge and Seasoned experience with SQL Queries

Languages: Python

* Good communicator, shows initiative, works well with stakeholders.

* Experience working closely with Data Analysts and provide the data they need and guide them on the issues.

* Solid ETL experience and Hadoop/Hive/Pyspark/Presto/ SparkSQL

* Solid communication and articulation skills

* Able to handle stakeholders independently with less interventions of reporting manager.

* Develop strategies to solve problems in logical yet creative ways.

* Create custom reports and presentations accompanied by strong data visualization and storytelling

 

We would be excited if you have:

 

* Excellent communication and interpersonal skills

* Ability to meet deadlines and manage project delivery

* Excellent report-writing and presentation skills

* Critical thinking and problem-solving capabilities

Read more
Top 3 Fintech Startup

Top 3 Fintech Startup

Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹20L - ₹30L / yr
skill iconAmazon Web Services (AWS)
PySpark
SQL
Apache Spark
skill iconPython

We are looking for an exceptionally talented Lead data engineer who has exposure in implementing AWS services to build data pipelines, api integration and designing data warehouse. Candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor’s degree in Science, Engineering, Applied Mathematics. Preferred Masters degree

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS etc

• Has extensive experience in Spark ecosystem and has worked on both real time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step functions, Airflow, RDS, Aurora etc.

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team

Read more
Persistent System Ltd

Persistent System Ltd

Agency job
via Milestone Hr Consultancy by Haina khan
Pune, Bengaluru (Bangalore), Hyderabad
4 - 9 yrs
₹8L - ₹27L / yr
skill iconPython
PySpark
skill iconAmazon Web Services (AWS)
Spark
skill iconScala
Greetings..

We have urgent requirement of Data Engineer/Sr Data Engineer for reputed MNC company.

Exp: 4-9yrs

Location: Pune/Bangalore/Hyderabad

Skills: We need candidate either Python AWS or Pyspark AWS or Spark Scala
Read more
Persistent Systems

at Persistent Systems

1 video
1 recruiter
Agency job
via Milestone Hr Consultancy by Haina khan
Pune, Bengaluru (Bangalore), Hyderabad, Nagpur
4 - 9 yrs
₹4L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+3 more
Greetings..

We have an urgent requirements of Big Data Developer profiles in our reputed MNC company.

Location: Pune/Bangalore/Hyderabad/Nagpur
Experience: 4-9yrs

Skills: Pyspark,AWS
or Spark,Scala,AWS
or Python Aws
Read more
Top 3 Fintech Startup

Top 3 Fintech Startup

Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
4 - 7 yrs
₹11L - ₹17L / yr
skill iconMachine Learning (ML)
skill iconData Science
Natural Language Processing (NLP)
Computer Vision
skill iconPython
+6 more
Responsible to lead a team of analysts to build and deploy predictive models to infuse core business functions with deep analytical insights. The Senior Data Scientist will also work
closely with the Kinara management team to investigate strategically important business
questions.

Lead a team through the entire analytical and machine learning model life cycle:

 Define the problem statement
 Build and clean datasets
 Exploratory data analysis
 Feature engineering
 Apply ML algorithms and assess the performance
 Code for deployment
 Code testing and troubleshooting
 Communicate Analysis to Stakeholders
 Manage Data Analysts and Data Scientists
Read more
Hiring for one of the MNC for India location

Hiring for one of the MNC for India location

Agency job
via Natalie Consultants by Rahul Kumar
Gurugram, Pune, Bengaluru (Bangalore), Delhi, Noida, Ghaziabad, Faridabad
2 - 9 yrs
₹8L - ₹20L / yr
skill iconPython
Hadoop
Big Data
Spark
Data engineering
+3 more

Key Responsibilities : ( Data Developer Python, Spark)

Exp : 2 to 9 Yrs 

Development of data platforms, integration frameworks, processes, and code.

Develop and deliver APIs in Python or Scala for Business Intelligence applications build using a range of web languages

Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.

Elaborate stories in a collaborative agile environment (SCRUM or Kanban)

Familiarity with cloud platforms like GCP, AWS or Azure.

Experience with large data volumes.

Familiarity with writing rest-based services.

Experience with distributed processing and systems

Experience with Hadoop / Spark toolsets

Experience with relational database management systems (RDBMS)

Experience with Data Flow development

Knowledge of Agile and associated development techniques including:

Read more
A global provider of Business Process Management company

A global provider of Business Process Management company

Agency job
via Jobdost by Saida Jabbar
Bengaluru (Bangalore), UK
5 - 10 yrs
₹15L - ₹25L / yr
Data Visualization
PowerBI
ADF
Business Intelligence (BI)
PySpark
+11 more

Power BI Developer

Senior visualization engineer with 5 years’ experience in Power BI to develop and deliver solutions that enable delivery of information to audiences in support of key business processes. In addition, Hands-on experience on Azure data services like ADF and databricks is a must.

Ensure code and design quality through execution of test plans and assist in development of standards & guidelines working closely with internal and external design, business, and technical counterparts.

Candidates should have worked in agile development environments.

Desired Competencies:

  • Should have minimum of 3 years project experience using Power BI on Azure stack.
  • Should have good understanding and working knowledge of Data Warehouse and Data Modelling.
  • Good hands-on experience of Power BI
  • Hands-on experience T-SQL/ DAX/ MDX/ SSIS
  • Data Warehousing on SQL Server (preferably 2016)
  • Experience in Azure Data Services – ADF, DataBricks & PySpark
  • Manage own workload with minimum supervision.
  • Take responsibility of projects or issues assigned to them
  • Be personable, flexible and a team player
  • Good written and verbal communications
  • Have a strong personality who will be able to operate directly with users
Read more
Greenway Health

at Greenway Health

2 recruiters
Agency job
via Vipsa Talent Solutions by Prashma S R
Bengaluru (Bangalore)
6 - 8 yrs
₹8L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+5 more
6-8years of experience in data engineer
Spark
Hadoop
Big Data
Data engineering
PySpark
Python
AWS Lambda
SQL
hadoop
kafka
Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Harpreet kour
Posted by Harpreet kour
Bengaluru (Bangalore)
1 - 6 yrs
₹10L - ₹15L / yr
Data engineering
Big Data
PySpark
SQL
skill iconPython
 Good experience in Pyspark - Including Dataframe core functions and Spark SQL
Good experience in SQL DBs - Be able to write queries including fair complexity.
Should have excellent experience in Big Data programming for data transformation and aggregations
Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
 Good customer communication.
 Good Analytical skills
Read more
Virtusa

at Virtusa

2 recruiters
Agency job
via Response Informatics by Anupama Lavanya Uppala
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr
PySpark
skill iconPython
  • Minimum 1 years of relevant experience, in PySpark (mandatory)
  • Hands on experience in development, test, deploy, maintain and improving data integration pipeline in AWS cloud environment is added plus 
  • Ability to play lead role and independently manage 3-5 member of Pyspark development team 
  • EMR ,Python and PYspark mandate.
  • Knowledge and awareness working with AWS Cloud technologies like Apache Spark, , Glue, Kafka, Kinesis, and Lambda in S3, Redshift, RDS
Read more
UAE Client

UAE Client

Agency job
via Fragma Data Systems by Harpreet kour
Dubai, Bengaluru (Bangalore)
4 - 8 yrs
₹6L - ₹16L / yr
Data engineering
Data Engineer
Big Data
Big Data Engineer
Apache Spark
+3 more
• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance Tuning wrt to executor sizing and other environmental parameters, code optimization, partitions tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.

Must Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad
0 - 1 yrs
₹3L - ₹3.5L / yr
SQL
Data engineering
Data Engineer
skill iconPython
Big Data
+1 more
Strong Programmer with expertise in Python and SQL
 
● Hands-on Work experience in SQL/PLSQL
● Expertise in at least one popular Python framework (like Django,
Flask or Pyramid)
● Knowledge of object-relational mapping (ORM)
● Familiarity with front-end technologies (like JavaScript and HTML5)
● Willingness to learn &amp; upgrade to Big data and cloud technologies
like Pyspark Azure etc.
● Team spirit
● Good problem-solving skills
● Write effective, scalable code
Read more
Hammoq

at Hammoq

1 recruiter
Nikitha Muthuswamy
Posted by Nikitha Muthuswamy
Remote, Indore, Ujjain, Hyderabad, Bengaluru (Bangalore)
5 - 8 yrs
₹5L - ₹15L / yr
pandas
NumPy
Data engineering
Data Engineer
Apache Spark
+6 more
  • Does analytics to extract insights from raw historical data of the organization. 
  • Generates usable training dataset for any/all MV projects with the help of Annotators, if needed.
  • Analyses user trends, and identifies their biggest bottlenecks in Hammoq Workflow.
  • Tests the short/long term impact of productized MV models on those trends.
  • Skills - Numpy, Pandas, SPARK, APACHE SPARK, PYSPARK, ETL mandatory. 
Read more
Infogain
Agency job
via Technogen India PvtLtd by RAHUL BATTA
Bengaluru (Bangalore), Pune, Noida, NCR (Delhi | Gurgaon | Noida)
7 - 10 yrs
₹20L - ₹25L / yr
Data engineering
skill iconPython
SQL
Spark
PySpark
+10 more
  1. Sr. Data Engineer:

 Core Skills – Data Engineering, Big Data, Pyspark, Spark SQL and Python

Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred

Major accountabilities:

  • Responsible for Data Engineering, Foundry Data Pipeline Creation, Foundry Analysis & Reporting, Slate Application development, re-usable code development & management and Integrating Internal or External System with Foundry for data ingestion with high quality.
  • Have good understanding on Foundry Platform landscape and it’s capabilities
  • Performs data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
  • Defines company data assets (data models), Pyspark, spark SQL, jobs to populate data models.
  • Designs data integrations and data quality framework.
  • Design & Implement integration with Internal, External Systems, F1 AWS platform using Foundry Data Connector or Magritte Agent
  • Collaboration with data scientists, data analyst and technology teams to document and leverage their understanding of the Foundry integration with different data sources - Actively participate in agile work practices
  • Coordinating with Quality Engineer to ensure the all quality controls, naming convention & best practices have been followed

Desired Candidate Profile :

  • Strong data engineering background
  • Experience with Clinical Data Model is preferred
  • Experience in
    • SQL Server ,Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
    • Java and Groovy for our back-end applications and data integration tools
    • Python for data processing and analysis
    • Cloud infrastructure based on AWS EC2 and S3
  • 7+ years IT experience, 2+ years’ experience in Palantir Foundry Platform, 4+ years’ experience in Big Data platform
  • 5+ years of Python and Pyspark development experience
  • Strong troubleshooting and problem solving skills
  • BTech or master's degree in computer science or a related technical field
  • Experience designing, building, and maintaining big data pipelines systems
  • Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
  • Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
  • Hands-on in programming languages primarily Python, R, Java, Unix shell scripts
  • Hand-on experience in AWS / Azure cloud platform and stack
  • Strong in API based architecture and concept, able to do quick PoC using API integration and development
  • Knowledge of machine learning and AI
  • Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.

 Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision

Read more
MNC

MNC

Agency job
via Fragma Data Systems by Harpreet kour
Bengaluru (Bangalore)
2 - 4 yrs
₹10L - ₹15L / yr
PySpark
SQL
• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance Tuning wrt to executor sizing and other environmental parameters, code optimization, partitions tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.
Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore)
3.5 - 8 yrs
₹5L - ₹18L / yr
PySpark
Data engineering
Data Warehouse (DWH)
SQL
Spark
+1 more
Must-Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skill
 
 
Technology Skills (Good to Have):
  • Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
  • Experience in migrating on-premise data warehouses to data platforms on AZURE cloud. 
  • Designing and implementing data engineering, ingestion, and transformation functions
  • Azure Synapse or Azure SQL data warehouse
  • Spark on Azure is available in HD insights and data bricks
 
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/Stream sets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
Read more
RentoMojo

at RentoMojo

1 video
5 recruiters
Anand Pandey
Posted by Anand Pandey
Bengaluru (Bangalore)
1 - 2 yrs
₹5L - ₹7L / yr
Business Analysis
Windows Azure
PySpark
SQL
Data Warehouse (DWH)
+4 more
RESPONSIBILITIES & OWNERSHIP: THINGS THE ROLE CAN'T MISS
  • Setting KPIs, monitoring key trends, and helping stakeholders by generating insights from the data delivered.
  • Understanding user behaviour and performing root-cause analysis of changes in data trends across different verticals.
  • Get answers to business questions, identify areas of improvement, and identify opportunities for growth.
  • Work on ad-hoc requests for data and analysis.
  • Work with Cross functional Teams as when required to automate reports and create informative dashboards based on problem statements.

WHO COULD BE A GREAT FIT:
Functional Experience
  • 1-2 years of experience working in Analytics as a Business or Data Analyst.
  • Analytical mind with a problem-solving aptitude.
  • Familiarity with Microsoft Azure & AWS PySpark, Python, Data Bricks, Metabase, Understanding of APIs, data warehouse and ETL etc.
  • Proficient in writing Complex Queries in SQL.
  • Experience in Performing hands-on analysis on data and across multiple datasets and databases primarily using Excel, Google Sheets and R.
  • Ability to work across cross-functional teams proactively.
Read more
Curl Analytics

Curl Analytics

Agency job
via wrackle by Naveen Taalanki
Bengaluru (Bangalore)
5 - 10 yrs
₹15L - ₹30L / yr
ETL
Big Data
Data engineering
Apache Kafka
PySpark
+11 more
What you will do
  • Bring in industry best practices around creating and maintaining robust data pipelines for complex data projects with/without AI component
    • programmatically ingesting data from several static and real-time sources (incl. web scraping)
    • rendering results through dynamic interfaces incl. web / mobile / dashboard with the ability to log usage and granular user feedbacks
    • performance tuning and optimal implementation of complex Python scripts (using SPARK), SQL (using stored procedures, HIVE), and NoSQL queries in a production environment
  • Industrialize ML / DL solutions and deploy and manage production services; proactively handle data issues arising on live apps
  • Perform ETL on large and complex datasets for AI applications - work closely with data scientists on performance optimization of large-scale ML/DL model training
  • Build data tools to facilitate fast data cleaning and statistical analysis
  • Ensure data architecture is secure and compliant
  • Resolve issues escalated from Business and Functional areas on data quality, accuracy, and availability
  • Work closely with APAC CDO and coordinate with a fully decentralized team across different locations in APAC and global HQ (Paris).

You should be

  •  Expert in structured and unstructured data in traditional and Big data environments – Oracle / SQLserver, MongoDB, Hive / Pig, BigQuery, and Spark
  • Have excellent knowledge of Python programming both in traditional and distributed models (PySpark)
  • Expert in shell scripting and writing schedulers
  • Hands-on experience with Cloud - deploying complex data solutions in hybrid cloud / on-premise environment both for data extraction/storage and computation
  • Hands-on experience in deploying production apps using large volumes of data with state-of-the-art technologies like Dockers, Kubernetes, and Kafka
  • Strong knowledge of data security best practices
  • 5+ years experience in a data engineering role
  • Science / Engineering graduate from a Tier-1 university in the country
  • And most importantly, you must be a passionate coder who really cares about building apps that can help people do things better, smarter, and faster even when they sleep
Read more
IT solutions specialized in Apps Lifecycle management. (MG1)

IT solutions specialized in Apps Lifecycle management. (MG1)

Agency job
via Multi Recruit by Ayub Pasha
Bengaluru (Bangalore)
5 - 6 yrs
₹10L - ₹12L / yr
Mlops
skill iconKubernetes
skill iconDocker
Ansible
PySpark
+3 more
  • Automate and maintain ML and Data pipelines at scale
  • Collaborate with Data Scientists and Data Engineers on feature development teams to containerize and build out deployment pipelines for new modules
  • Maintain and expand our on-prem deployments with spark clusters
  • Design, build and optimize applications containerization and orchestration with Docker and Kubernetes and AWS or Azure
Skills:
  • 5 years of IT experience in data-driven or AI technology products
  • Understanding of ML Model Deployment and Lifecycle
  • Extensive experience in Apache airflow for MLOps workflow automation
  • Experience is building and automating data pipelines
  • Experience in working on Spark Cluster architecture
  • Extensive experience with Unix/Linux environments
  • Experience with standard concepts and technologies used in CI/CD build, deployment pipelines using Jenkins
  • Strong experience in Python and PySpark and building required automation (using standard technologies such as Docker, Jenkins, and Ansible).
  • Experience with Kubernetes or Docker Swarm
  • Working technical knowledge of current systems software, protocols, and standards, including firewalls, Active Directory, etc.
  • Basic knowledge of Multi-tier architectures: load balancers, caching, web servers, application servers, and databases.
  • Experience with various virtualization technologies and multi-tenant, private and hybrid cloud environments.
  • Hands-on software and hardware troubleshooting experience.
  • Experience documenting and maintaining configuration and process information.
  • Basic Knowledge of machine learning frameworks: Tensorflow, Caffe/Caffe2, Pytorch
Read more
Marktine

at Marktine

1 recruiter
Vishal Sharma
Posted by Vishal Sharma
Remote, Bengaluru (Bangalore)
3 - 6 yrs
₹10L - ₹20L / yr
Big Data
Spark
PySpark
Data engineering
Data Warehouse (DWH)
+5 more

Azure – Data Engineer

  • At least 2 years hands on experience working with an Agile data engineering team working on big data pipelines using Azure in a commercial environment.
  • Dealing with senior stakeholders/leadership
  • Understanding of Azure data security and encryption best practices. [ADFS/ACLs]

Data Bricks –experience writing in and using data bricks Using Python to transform, manipulate data.

Data Factory – experience using data factory in an enterprise solution to build data pipelines. Experience calling rest APIs.

Synapse/data warehouse – experience using synapse/data warehouse to present data securely and to build & manage data models.

Microsoft SQL server – We’d expect the candidate to have come from a SQL/Data background and progressed into Azure

PowerBI – Experience with this is preferred

Additionally

  • Experience using GIT as a source control system
  • Understanding of DevOps concepts and application
  • Understanding of Azure Cloud costs/management and running platforms efficiently
Read more
MNC

MNC

Agency job
via Fragma Data Systems by Priyanka U
Remote, Bengaluru (Bangalore)
5 - 8 yrs
₹12L - ₹20L / yr
PySpark
SQL
Data Warehouse (DWH)
ETL
SQL Developer with Relevant experience of 7 Yrs with Strong Communication Skills.
 
Key responsibilities:
 
  • Creating, designing and developing data models
  • Prepare plans for all ETL (Extract/Transformation/Load) procedures and architectures
  • Validating results and creating business reports
  • Monitoring and tuning data loads and queries
  • Develop and prepare a schedule for a new data warehouse
  • Analyze large databases and recommend appropriate optimization for the same
  • Administer all requirements and design various functional specifications for data
  • Provide support to the Software Development Life cycle
  • Prepare various code designs and ensure efficient implementation of the same
  • Evaluate all codes and ensure the quality of all project deliverables
  • Monitor data warehouse work and provide subject matter expertise
  • Hands-on BI practices, data structures, data modeling, SQL skills
  • Minimum 1 year experience in Pyspark
Read more
MNC

MNC

Agency job
via Fragma Data Systems by Priyanka U
Remote, Bengaluru (Bangalore)
2 - 6 yrs
₹6L - ₹15L / yr
Spark
Apache Kafka
PySpark
Internet of Things (IOT)
Real time media streaming

JD for IOT DE:

 

The role requires experience in Azure core technologies – IoT Hub/ Event Hub, Stream Analytics, IoT Central, Azure Data Lake Storage, Azure Cosmos, Azure Data Factory, Azure SQL Database, Azure HDInsight / Databricks, SQL data warehouse.

 

You Have:

  • Minimum 2 years of software development experience
  • Minimum 2 years of experience in IoT/streaming data pipelines solution development
  • Bachelor's and/or Master’s degree in computer science
  • Strong Consulting skills in data management including data governance, data quality, security, data integration, processing, and provisioning
  • Delivered data management projects with real-time/near real-time data insights delivery on Azure Cloud
  • Translated complex analytical requirements into the technical design including data models, ETLs, and Dashboards / Reports
  • Experience deploying dashboards and self-service analytics solutions on both relational and non-relational databases
  • Experience with different computing paradigms in databases such as In-Memory, Distributed, Massively Parallel Processing
  • Successfully delivered large scale IOT data management initiatives covering Plan, Design, Build and Deploy phases leveraging different delivery methodologies including Agile
  • Experience in handling telemetry data with Spark Streaming, Kafka, Flink, Scala, Pyspark, Spark SQL.
  • Hands-on experience on containers and Dockers
  • Exposure to streaming protocols like MQTT and AMQP
  • Knowledge of OT network protocols like OPC UA, CAN Bus, and similar protocols
  • Strong knowledge of continuous integration, static code analysis, and test-driven development
  • Experience in delivering projects in a highly collaborative delivery model with teams at onsite and offshore
  • Must have excellent analytical and problem-solving skills
  • Delivered change management initiatives focused on driving data platforms adoption across the enterprise
  • Strong verbal and written communications skills are a must, as well as the ability to work effectively across internal and external organizations
     

Roles & Responsibilities
 

You Will:

  • Translate functional requirements into technical design
  • Interact with clients and internal stakeholders to understand the data and platform requirements in detail and determine core Azure services needed to fulfill the technical design
  • Design, Develop and Deliver data integration interfaces in ADF and Azure Databricks
  • Design, Develop and Deliver data provisioning interfaces to fulfill consumption needs
  • Deliver data models on Azure platform, it could be on Azure Cosmos, SQL DW / Synapse, or SQL
  • Advise clients on ML Engineering and deploying ML Ops at Scale on AKS
  • Automate core activities to minimize the delivery lead times and improve the overall quality
  • Optimize platform cost by selecting the right platform services and architecting the solution in a cost-effective manner
  • Deploy Azure DevOps and CI CD processes
  • Deploy logging and monitoring across the different integration points for critical alerts

 

Read more
Marktine

at Marktine

1 recruiter
Vishal Sharma
Posted by Vishal Sharma
Remote, Bengaluru (Bangalore)
3 - 10 yrs
₹5L - ₹15L / yr
Big Data
ETL
PySpark
SSIS
Microsoft Windows Azure
+4 more

Must Have Skills:

- Solid Knowledge on DWH, ETL and Big Data Concepts

- Excellent SQL Skills (With knowledge of SQL Analytics Functions)

- Working Experience on any ETL tool i.e. SSIS / Informatica

- Working Experience on any Azure or AWS Big Data Tools.

- Experience on Implementing Data Jobs (Batch / Real time Streaming)

- Excellent written and verbal communication skills in English, Self-motivated with strong sense of ownership and Ready to learn new tools and technologies

Preferred Skills:

- Experience on Py-Spark / Spark SQL

- AWS Data Tools (AWS Glue, AWS Athena)

- Azure Data Tools (Azure Databricks, Azure Data Factory)

Other Skills:

- Knowledge about Azure Blob, Azure File Storage, AWS S3, Elastic Search / Redis Search

- Knowledge on domain/function (across pricing, promotions and assortment).

- Implementation Experience on Schema and Data Validator framework (Python / Java / SQL),

- Knowledge on DQS and MDM.

Key Responsibilities:

- Independently work on ETL / DWH / Big data Projects

- Gather and process raw data at scale.

- Design and develop data applications using selected tools and frameworks as required and requested.

- Read, extract, transform, stage and load data to selected tools and frameworks as required and requested.

- Perform tasks such as writing scripts, web scraping, calling APIs, write SQL queries, etc.

- Work closely with the engineering team to integrate your work into our production systems.

- Process unstructured data into a form suitable for analysis.

- Analyse processed data.

- Support business decisions with ad hoc analysis as needed.

- Monitoring data performance and modifying infrastructure as needed.

Responsibility: Smart Resource, having excellent communication skills

 

 
Read more
Marktine

at Marktine

1 recruiter
Vishal Sharma
Posted by Vishal Sharma
Remote, Bengaluru (Bangalore)
3 - 7 yrs
₹5L - ₹10L / yr
Data Warehouse (DWH)
Spark
Data engineering
skill iconPython
PySpark
+5 more

Basic Qualifications

- Need to have a working knowledge of AWS Redshift.

- Minimum 1 year of designing and implementing a fully operational production-grade large-scale data solution on Snowflake Data Warehouse.

- 3 years of hands-on experience with building productized data ingestion and processing pipelines using Spark, Scala, Python

- 2 years of hands-on experience designing and implementing production-grade data warehousing solutions

- Expertise and excellent understanding of Snowflake Internals and integration of Snowflake with other data processing and reporting technologies

- Excellent presentation and communication skills, both written and verbal

- Ability to problem-solve and architect in an environment with unclear requirements

Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore)
1 - 5 yrs
₹5L - ₹15L / yr
Spark
PySpark
Big Data
skill iconPython
SQL
+1 more
Must-Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skill
 
 
Technology Skills (Good to Have):
  • Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
  • Experience in migrating on-premise data warehouses to data platforms on AZURE cloud. 
  • Designing and implementing data engineering, ingestion, and transformation functions
  • Azure Synapse or Azure SQL data warehouse
  • Spark on Azure is available in HD insights and data bricks
Read more
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.
Find more jobs
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort