Spark Jobs in Mumbai

Apply to 27+ Spark Jobs in Mumbai on CutShort.io. Explore the latest Spark Job opportunities across top companies like Google, Amazon & Adobe.

Sr. Data Scientist (AI/ML, Deep learning)

AI Industry

Agency job

via Peak Hire Solutions by Dhara Thakkar

Mumbai, Bengaluru (Bangalore), Hyderabad, Gurugram

5 - 12 yrs

₹20L - ₹46L / yr

Data Science

Artificial Intelligence (AI)

Machine Learning (ML)

Generative AI

Deep Learning

+14 more

Review Criteria

Strong Senior Data Scientist (AI/ML/GenAI) Profile
5+ years of experience in designing, developing, and deploying Machine Learning / Deep Learning (ML/DL) systems in production
Must have strong hands-on experience in Python and deep learning frameworks such as PyTorch, TensorFlow, or JAX.
1+ years of experience in fine-tuning Large Language Models (LLMs) using techniques like LoRA/QLoRA, and building RAG (Retrieval-Augmented Generation) pipelines.
Must have experience with MLOps and production-grade systems including Docker, Kubernetes, Spark, model registries, and CI/CD workflows

Preferred

Prior experience in open-source GenAI contributions, applied LLM/GenAI research, or large-scale production AI systems
Preferred (Education) – B.S./M.S./Ph.D. in Computer Science, Data Science, Machine Learning, or a related field.

Job Specific Criteria

CV Attachment is mandatory
Which is your preferred job location (Mumbai / Bengaluru / Hyderabad / Gurgaon)?
Are you okay with 3 Days WFO?
Virtual Interview requires video to be on, are you okay with it?

Role & Responsibilities

Company is hiring a Senior Data Scientist with strong expertise in AI, machine learning engineering (MLE), and generative AI. You will play a leading role in designing, deploying, and scaling production-grade ML systems — including large language model (LLM)-based pipelines, AI copilots, and agentic workflows. This role is ideal for someone who thrives on balancing cutting-edge research with production rigor and loves mentoring while building impact-first AI applications.

Responsibilities:

Own the full ML lifecycle: model design, training, evaluation, deployment
Design production-ready ML pipelines with CI/CD, testing, monitoring, and drift detection
Fine-tune LLMs and implement retrieval-augmented generation (RAG) pipelines
Build agentic workflows for reasoning, planning, and decision-making
Develop both real-time and batch inference systems using Docker, Kubernetes, and Spark
Leverage state-of-the-art architectures: transformers, diffusion models, RLHF, and multimodal pipelines
Collaborate with product and engineering teams to integrate AI models into business applications
Mentor junior team members and promote MLOps, scalable architecture, and responsible AI best practices

Ideal Candidate

5+ years of experience in designing, deploying, and scaling ML/DL systems in production
Proficient in Python and deep learning frameworks such as PyTorch, TensorFlow, or JAX
Experience with LLM fine-tuning, LoRA/QLoRA, vector search (Weaviate/PGVector), and RAG pipelines
Familiarity with agent-based development (e.g., ReAct agents, function-calling, orchestration)
Solid understanding of MLOps: Docker, Kubernetes, Spark, model registries, and deployment workflows
Strong software engineering background with experience in testing, version control, and APIs
Proven ability to balance innovation with scalable deployment
B.S./M.S./Ph.D. in Computer Science, Data Science, or a related field
Bonus: Open-source contributions, GenAI research, or applied systems at scale

Data Engineer

at Pluginlive

1 recruiter

Posted by Harsha Saggi

Chennai, Mumbai

4 - 6 yrs

₹10L - ₹20L / yr

Python

SQL

NOSQL Databases

Data architecture

Data modeling

+7 more

Role Overview:

We are seeking a talented and experienced Data Architect with strong data visualization capabilities to join our dynamic team in Mumbai. As a Data Architect, you will be responsible for designing, building, and managing our data infrastructure, ensuring its reliability, scalability, and performance. You will also play a crucial role in transforming complex data into insightful visualizations that drive business decisions. This role requires a deep understanding of data modeling, database technologies (particularly Oracle Cloud), data warehousing principles, and proficiency in data manipulation and visualization tools, including Python and SQL.

Responsibilities:

Design and implement robust and scalable data architectures, including data warehouses, data lakes, and operational data stores, primarily leveraging Oracle Cloud services.
Develop and maintain data models (conceptual, logical, and physical) that align with business requirements and ensure data integrity and consistency.
Define data governance policies and procedures to ensure data quality, security, and compliance.
Collaborate with data engineers to build and optimize ETL/ELT pipelines for efficient data ingestion, transformation, and loading.
Develop and execute data migration strategies to Oracle Cloud.
Utilize strong SQL skills to query, manipulate, and analyze large datasets from various sources.
Leverage Python and relevant libraries (e.g., Pandas, NumPy) for data cleaning, transformation, and analysis.
Design and develop interactive and insightful data visualizations using tools such as Tableau, Power BI, Matplotlib, Seaborn, or Plotly to communicate data-driven insights to both technical and non-technical stakeholders.
Work closely with business analysts and stakeholders to understand their data needs and translate them into effective data models and visualizations.
Ensure the performance and reliability of data visualization dashboards and reports.
Stay up-to-date with the latest trends and technologies in data architecture, cloud computing (especially Oracle Cloud), and data visualization.
Troubleshoot data-related issues and provide timely resolutions.
Document data architectures, data flows, and data visualization solutions.
Participate in the evaluation and selection of new data technologies and tools.

Qualifications:

Bachelor's or Master's degree in Computer Science, Data Science, Information Systems, or a related field.
Proven experience (typically 5+ years) as a Data Architect, Data Modeler, or similar role.
Deep understanding of data warehousing concepts, dimensional modeling (e.g., star schema, snowflake schema), and ETL/ELT processes.
Extensive experience working with relational databases, particularly Oracle, and proficiency in SQL.
Hands-on experience with Oracle Cloud data services (e.g., Autonomous Data Warehouse, Object Storage, Data Integration).
Strong programming skills in Python and experience with data manipulation and analysis libraries (e.g., Pandas, NumPy).
Demonstrated ability to create compelling and effective data visualizations using industry-standard tools (e.g., Tableau, Power BI, Matplotlib, Seaborn, Plotly).
Excellent analytical and problem-solving skills with the ability to interpret complex data and translate it into actionable insights.
Strong communication and presentation skills, with the ability to effectively communicate technical concepts to non-technical audiences.
Experience with data governance and data quality principles.
Familiarity with agile development methodologies.
Ability to work independently and collaboratively within a team environment.

Application Link- https://forms.gle/km7n2WipJhC2Lj2r5

Azure Data Engineer

at Intellikart Ventures LLP

Posted by ramandeep intellikart

Mumbai

3 - 5 yrs

₹18L - ₹19L / yr

SQL

Spark

Data modeling

Windows Azure

Data Analytics

+1 more

Location: Mumbai

Job Type: Full-Time (Hybrid – 3 days in office, 2 days WFH)

Job Overview:

We are looking for a skilled Azure Data Engineer with strong experience in data modeling, pipeline development, and SQL/Spark expertise. The ideal candidate will work closely with the Data Analytics & BI teams to implement robust data solutions on Azure Synapse and ensure seamless data integration with third-party applications.

Key Responsibilities:

Design, develop, and maintain Azure data pipelines using Azure Synapse (SQL dedicated pools or Apache Spark pools).
Implement data models in collaboration with the Data Analytics and BI teams.
Optimize and manage large-scale SQL and Spark-based data processing solutions.
Ensure data availability and reliability for third-party application consumption.
Collaborate with cross-functional teams to translate business requirements into scalable data solutions.

Required Skills & Experience:

3–5 years of hands-on experience in:

Azure data services
Data Modeling
SQL development and tuning
Apache Spark
Strong knowledge of Azure Synapse Analytics.
Experience in designing data pipelines and ETL/ELT processes.
Ability to troubleshoot and optimize complex data workflows.

Preferred Qualifications:

Experience with data governance, security, and data quality practices.
Familiarity with DevOps practices in a data engineering context.
Effective communication skills and the ability to work in a collaborative team environment.

Data Engineers

at Nielsen

Posted by Dheeraj Sidana

Gurugram, Bengaluru (Bangalore), Mumbai

1 - 20 yrs

Best in industry

PySpark

Data engineering

Big Data

Hadoop

Spark

Nielsen, a global company specialising in audience measurement and analytics, is currently seeking a proficient leader in data engineering to join their team in Bangalore, Gurgaon, or Mumbai.

This is a manager-of-managers role that involves managing multiple scrum teams and overseeing an advanced data platform that analyses audience consumption patterns across various channels like OTT, TV, Radio, and Social Media worldwide. You will be responsible for building and supervising a top-performing data engineering team that delivers data for targeted campaigns. Moreover, you will work with AWS services (S3, Lambda, Kinesis) and other data engineering technologies such as Spark, Scala/Python, Kafka, etc. There may also be opportunities to establish deep integrations with OTT platforms like Netflix, Prime Video, and others.

Data Engineer

at Scremer

Posted by Sathish Dhawan

Pune, Mumbai

6 - 11 yrs

₹15L - ₹15L / yr

Amazon Web Services (AWS)

Python

Java

Spark

Primary Skills

DynamoDB, Java, Kafka, Spark, Amazon Redshift, AWS Lake Formation, AWS Glue, Python

Skills:

Good work experience showing growth as a Data Engineer.

Hands-on programming experience

Implementation Experience on Kafka, Kinesis, Spark, AWS Glue, AWS Lake Formation.

Excellent knowledge in: Python, Scala/Java, Spark, AWS (Lambda, Step Functions, DynamoDB, EMR), Terraform, UI (Angular), Git, Maven

Experience of performance optimization in Batch and Real time processing applications

Expertise in Data Governance and Data Security Implementation

Good hands-on design and programming skills building reusable tools and products. Experience developing in AWS or similar cloud platforms. Preferred: ECS, EKS, S3, EMR, DynamoDB, Aurora, Redshift, QuickSight or similar.

Familiarity with systems with very high volume of transactions, micro service design, or data processing pipelines (Spark).

Knowledge of and hands-on experience with serverless technologies such as Lambda, MSK, MWAA, and Kinesis Analytics is a plus.

Expertise in practices like Agile, Peer reviews, Continuous Integration

Roles and responsibilities:

Determining project requirements and developing work schedules for the team.

Delegating tasks and achieving daily, weekly, and monthly goals.

Responsible for designing, building, testing, and deploying the software releases.

Salary: 25LPA-40LPA

Sr. Data Engineer DataBricks

at Exponentia.ai

1 product

1 recruiter

Posted by Vipul Tiwari

Mumbai

4 - 6 yrs

₹12L - ₹19L / yr

ETL

Informatica

Data Warehouse (DWH)

databricks

Amazon Web Services (AWS)

+6 more

Job Description

Position: Sr Data Engineer – Databricks & AWS

Experience: 4 - 5 Years

Company Profile:

Exponentia.ai is an AI tech organization with a presence across India, Singapore, the Middle East, and the UK. We are an innovative and disruptive organization, working on cutting-edge technology to help our clients transform into the enterprises of the future. We provide artificial intelligence-based products/platforms capable of automated cognitive decision-making to improve productivity, quality, and economics of the underlying business processes. Currently, we are transforming ourselves and rapidly expanding our business.

Exponentia.ai has developed long-term relationships with world-class clients such as PayPal, PayU, SBI Group, HDFC Life, Kotak Securities, Wockhardt and Adani Group amongst others.

One of the top partners of Cloudera (leading analytics player) and Qlik (leader in BI technologies), Exponentia.ai has recently been awarded the ‘Innovation Partner Award’ by Qlik in 2017.

Get to know more about us on our website: http://www.exponentia.ai/ and Life @Exponentia.

Role Overview:

· A Data Engineer understands the client requirements and develops and delivers the data engineering solutions as per the scope.

· The role requires good skills in the development of solutions using various services required for data architecture on Databricks Delta Lake, streaming, AWS, ETL Development, and data modeling.

Job Responsibilities

• Design of data solutions on Databricks including delta lake, data warehouse, data marts and other data solutions to support the analytics needs of the organization.

• Apply best practices during design in data modeling (logical, physical) and ETL pipelines (streaming and batch) using cloud-based services.

• Design, develop and manage the pipelining (collection, storage, access), data engineering (data quality, ETL, Data Modelling) and understanding (documentation, exploration) of the data.

• Interact with stakeholders to understand the data landscape, conduct discovery exercises, develop proofs of concept and demonstrate them to stakeholders.

Technical Skills

• More than 2 years of experience in developing data lakes and data marts on the Databricks platform.

• Proven skills in AWS Data Lake services such as AWS Glue, S3, Lambda, SNS, and IAM, and skills in Spark, Python, and SQL.

• Experience in Pentaho

• Good understanding of developing a data warehouse, data marts etc.

• Has a good understanding of system architectures, and design patterns and should be able to design and develop applications using these principles.

Personality Traits

• Good collaboration and communication skills

• Excellent problem-solving skills to be able to structure the right analytical solutions.

• Strong sense of teamwork, ownership, and accountability

• Analytical and conceptual thinking

• Ability to work in a fast-paced environment with tight schedules.

• Good presentation skills with the ability to convey complex ideas to peers and management.

Education:

BE / ME / MS/MCA.

Python Developer

Consulting

Agency job

via Michael Page by Pratanu Chakraborty

Pune, Mumbai

6 - 8 yrs

₹5L - ₹20L / yr

Python

Spark

SQL

6-8 years of hands-on development experience using core Python
Hands-on experience with Spark and SQL
Java knowledge is good to have

Data Engineer – AWS

Encubate Tech Private Ltd

Agency job

via staff hire solutions by Purvaja Patidar

Mumbai

5 - 6 yrs

₹15L - ₹20L / yr

Amazon Web Services (AWS)

Amazon Redshift

Data modeling

ITL

Agile/Scrum

+7 more

Roles and Responsibilities

Seeking AWS Cloud Engineer / Data Warehouse Developer for our Data CoE team to help us configure and develop new AWS environments for our Enterprise Data Lake and migrate the on-premise traditional workloads to cloud. Must have a sound understanding of BI best practices, relational structures, dimensional data modelling, structured query language (SQL) skills, data warehouse and reporting techniques.

• Extensive experience in providing AWS Cloud solutions to various business use cases.

• Creating star schema data models, performing ETLs and validating results with business representatives

• Supporting implemented BI solutions by: monitoring and tuning queries and data loads, addressing user questions concerning data integrity, monitoring performance and communicating functional and technical issues.

Job Description:

This position is responsible for the successful delivery of business intelligence information to the entire organization and is experienced in BI development and implementations, data architecture and data warehousing.

Requisite Qualification

Essential: AWS Certified Database Specialty or AWS Certified Data Analytics

Preferred: Any other Data Engineer Certification

Requisite Experience

Essential: 4-7 yrs of experience

Preferred: 2+ yrs of experience in ETL & data pipelines

Skills Required

Special Skills Required

• AWS: S3, DMS, Redshift, EC2, VPC, Lambda, Delta Lake, CloudWatch etc.

• Bigdata: Databricks, Spark, Glue and Athena

• Expertise in Lake Formation, Python programming, Spark, Shell scripting

• Minimum Bachelor’s degree with 5+ years of experience in designing, building, and maintaining AWS data components

• 3+ years of experience in data component configuration, related roles and access setup

• Expertise in Python programming

• Knowledge in all aspects of DevOps (source control, continuous integration, deployments, etc.)

• Comfortable working with DevOps: Jenkins, Bitbucket, CI/CD

• Hands on ETL development experience, preferably using or SSIS

• SQL Server experience required

• Strong analytical skills to solve and model complex business requirements

• Sound understanding of BI Best Practices/Methodologies, relational structures, dimensional data modelling, structured query language (SQL) skills, data warehouse and reporting techniques

Preferred Skills

Required

• Experience working in the SCRUM Environment.

• Experience in Administration (Windows/Unix/Network/Database/Hadoop) is a plus.

• Experience in SQL Server, SSIS, SSAS, SSRS

• Comfortable with creating data models and visualization using Power BI

• Hands on experience in relational and multi-dimensional data modelling, including multiple source systems from databases and flat files, and the use of standard data modelling tools

• Ability to collaborate on a team with infrastructure, BI report development and business analyst resources, and clearly communicate solutions to both technical and non-technical team members

Roles and

Responsibilities

Seeking AWS Cloud Engineer /Data Warehouse Developer for our Data CoE team to

help us in configure and develop new AWS environments for our Enterprise Data Lake,

migrate the on-premise traditional workloads to cloud. Must have a sound

understanding of BI best practices, relational structures, dimensional data modelling,

structured query language (SQL) skills, data warehouse and reporting techniques.

 Extensive experience in providing AWS Cloud solutions to various business

use cases.

 Creating star schema data models, performing ETLs and validating results with

business representatives

 Supporting implemented BI solutions by: monitoring and tuning queries and

data loads, addressing user questions concerning data integrity, monitoring

performance and communicating functional and technical issues.

Job Description: -

This position is responsible for the successful delivery of business intelligence

information to the entire organization and is experienced in BI development and

implementations, data architecture and data warehousing.

Requisite Qualification

Essential

AWS Certified Database Specialty or -

AWS Certified Data Analytics

Preferred

Any other Data Engineer Certification

Requisite Experience

Essential 4 -7 yrs of experience

Preferred 2+ yrs of experience in ETL & data pipelines

Skills Required

Special Skills Required

• AWS: S3, DMS, Redshift, EC2, VPC, Lambda, Delta Lake, CloudWatch etc.

• Big Data: Databricks, Spark, Glue and Athena

• Expertise in Lake Formation, Python programming, Spark, Shell scripting

• Minimum Bachelor’s degree with 5+ years of experience in designing, building, and maintaining AWS data components

• 3+ years of experience in data component configuration, related roles and access setup

• Expertise in Python programming

• Knowledge in all aspects of DevOps (source control, continuous integration, deployments, etc.)

• Comfortable working with DevOps: Jenkins, Bitbucket, CI/CD

• Hands on ETL development experience, preferably using or SSIS

• SQL Server experience required

• Strong analytical skills to solve and model complex business requirements

• Sound understanding of BI Best Practices/Methodologies, relational structures, dimensional data modelling, structured query language (SQL) skills, data warehouse and reporting techniques

Preferred Skills

Required

• Experience working in the SCRUM Environment.

• Experience in Administration (Windows/Unix/Network/Database/Hadoop) is a plus.

• Experience in SQL Server, SSIS, SSAS, SSRS

• Comfortable with creating data models and visualization using Power BI

• Hands on experience in relational and multi-dimensional data modelling, including multiple source systems from databases and flat files, and the use of standard data modelling tools

• Ability to collaborate on a team with infrastructure, BI report development and business analyst resources, and clearly communicate solutions to both technical and non-technical team members

Bigdata Professional

at HCL Technologies

3 recruiters

Agency job

via Saiva System by Sunny Kumar

Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Bengaluru (Bangalore), Hyderabad, Chennai, Pune, Mumbai, Kolkata

5 - 10 yrs

₹5L - ₹20L / yr

PySpark

Data engineering

Big Data

Hadoop

Spark

+2 more

Experience: 5+ years
Skills: Spark and Scala, along with Azure
Location: Pan India

Looking for someone with Big Data experience along with Azure.

Software Engineer - Data Science

at LogiNext

1 video

7 recruiters

Posted by Rakhi Daga

Mumbai

2 - 3 yrs

₹8L - ₹12L / yr

Java

C++

Scala

Spark

LogiNext is looking for a technically savvy and passionate Software Engineer - Data Science to analyze large amounts of raw information to find patterns that will help improve our company. We will rely on you to build data products to extract valuable business insights.

In this role, you should be highly analytical with a knack for analysis, math and statistics. Critical thinking and problem-solving skills are essential for interpreting data. We also want to see a passion for machine-learning and research.

Your goal will be to help our company analyze trends to make better decisions. Data scientists who do not understand how the software works may struggle in this role. Apart from experience developing in R and Python, you must know modern approaches to software development and their impact. DevOps practices such as continuous integration and deployment, along with experience in cloud computing, are everyday skills for managing and processing data.

Responsibilities:

Identify valuable data sources and automate collection processes
Undertake preprocessing of structured and unstructured data
Analyze large amounts of information to discover trends and patterns
Build predictive models and machine-learning algorithms
Combine models through ensemble modeling
Present information using data visualization techniques
Propose solutions and strategies to business challenges
Collaborate with engineering and product development teams

Requirements:

Bachelor's degree or higher in Computer Science, Information Technology, Information Systems, Statistics, Mathematics, Commerce, Engineering, Business Management, Marketing or related field from a top-tier school
2 to 3 years of experience in data mining, data modeling, and reporting
Understanding of SaaS-based products and services
Understanding of machine learning and operations research
Experience with R, SQL and Python; familiarity with Scala, Java or C++ is an asset
Experience using business intelligence tools (e.g. Tableau) and data frameworks (e.g. Hadoop)
Analytical mind, business acumen and problem-solving aptitude
Excellent communication and presentation skills
Proficiency in Excel for data management and manipulation
Experience in statistical modeling techniques and data wrangling
Able to work independently and set goals keeping business objectives in mind

Senior Software Engineer - Data Science

at LogiNext

1 video

7 recruiters

Posted by Rakhi Daga

Mumbai

4 - 7 yrs

₹12L - ₹19L / yr

Machine Learning (ML)

Data Science

PHP

Java

Spark

+1 more

LogiNext is looking for a technically savvy and passionate Senior Software Engineer - Data Science to analyze large amounts of raw information to find patterns that will help improve our company. We will rely on you to build data products to extract valuable business insights.

Responsibilities :

Adapt and enhance machine learning techniques based on physical intuition about the domain
Design sampling methodology; prepare data, including data cleaning, univariate analysis and missing value imputation; identify appropriate analytic and statistical methodology; develop predictive models; and document process and results
Lead projects both as a principal investigator and project manager, responsible for meeting project requirements on schedule and on budget
Coordinate and lead efforts to innovate by deriving insights from heterogeneous sets of data generated by our suite of Aerospace products
Support and mentor data scientists
Maintain and work with our data pipeline that transfers and processes several terabytes of data using Spark, Scala, Python, Apache Kafka, Pig/Hive & Impala
Work directly with application teams/partners (internal clients such as Xbox, Skype, Office) to understand their offerings/domain and help them become successful with data so they can run controlled experiments (A/B testing)
Understand the data generated by experiments and produce actionable, trustworthy conclusions from them
Apply data analysis, data mining and data processing to present data clearly and develop experiments (A/B testing)
Work with the development team to build tools for data logging and repeatable data tasks to accelerate and automate data scientist duties

Requirements:

Bachelor’s or Master’s degree in Computer Science, Math, Physics, Engineering, Statistics or other technical field; PhD preferred
4 to 7 years of experience in data mining, data modeling, and reporting
3+ years of experience working with large data sets or doing large-scale quantitative analysis
Expert SQL scripting required
Development experience in one of the following: Scala, Java, Python, Perl, PHP, C++ or C#
Experience working with Hadoop, Pig/Hive, Spark, MapReduce
Ability to drive projects
Basic understanding of statistics: hypothesis testing, p-values, confidence intervals, regression, classification, and optimization are core lingo
Analysis: should be able to perform Exploratory Data Analysis and get actionable insights from the data, with impressive visualization
Modeling: should be familiar with ML concepts and algorithms; understanding of the internals and pros/cons of models is required
Strong algorithmic problem-solving skills
Experience manipulating large data sets through statistical software (e.g. R, SAS) or other methods
Superior verbal, visual and written communication skills to educate and work with cross-functional teams on controlled experiments
Experimentation design or A/B testing experience is preferred
Experience in team management

Software developer

Tier 1 MNC

Agency job

via People First Consultants by Jayaraj E

Chennai, Pune, Bengaluru (Bangalore), Noida, Gurugram, Kochi (Cochin), Coimbatore, Hyderabad, Mumbai, Navi Mumbai

3 - 12 yrs

₹3L - ₹15L / yr

Spark

Hadoop

Big Data

Data engineering

PySpark

+1 more

Greetings,
We are hiring a software developer with good knowledge of Spark, Hadoop and Scala for a Tier 1 MNC.

Data Engineer

at Nascentvision

1 recruiter

Posted by Shanu Mohan

Gurugram, Mumbai, Bengaluru (Bangalore)

2 - 4 yrs

₹10L - ₹17L / yr

Python

PySpark

Amazon Web Services (AWS)

Spark

Scala

+2 more

· Hands-on experience in any Cloud Platform
· Versed in Spark, Scala/Python, SQL
· Microsoft Azure experience
· Experience working on real-time data processing pipelines

Java Developer

AI-powered Growth Marketing platform

Agency job

via Jobdost by Sathish Kumar

Mumbai, Bengaluru (Bangalore)

2 - 7 yrs

₹8L - ₹25L / yr

Java

NOSQL Databases

MongoDB

Cassandra

Apache

+3 more

The Impact You Will Create

Build campaign generation services that can send app notifications at a rate of 10 million per minute
Build dashboards that show real-time key performance indicators to clients
Develop complex user segmentation engines that create segments on terabytes of data within a few seconds
Build highly available and horizontally scalable platform services for ever-growing data
Use cloud-based services like AWS Lambda for blazing fast throughput and auto-scalability
Work on complex analytics on terabytes of data, such as building cohorts, funnels, user path analysis, and Recency, Frequency & Monetary analysis, at blazing speed
You will build backend services and APIs to create scalable engineering systems.
As an individual contributor, you will tackle some of our broadest technical challenges, which require deep technical knowledge, hands-on software development and seamless collaboration with all functions.
You will envision and develop features that are highly reliable and fault tolerant to deliver a superior customer experience.
Collaborate with various highly functional teams in the company to meet deliverables throughout the software development lifecycle.
Identify areas of improvement and act on them through data insights and research.

What we look for?

2-5 years of experience in backend development, including work with Java/shell/Perl/Python scripting.
Solid understanding of engineering best practices, continuous integration, and incremental delivery.
Strong analytical, debugging and troubleshooting skills, and product line analysis.
Follower of agile methodology (sprint planning, working on JIRA, retrospectives, etc.).
Proficiency with tools like Docker, Maven and Jenkins, and knowledge of Java frameworks like Spring, Spring Boot, Hibernate and JPA.
Ability to design application modules using various concepts like object orientation, multi-threading, synchronization, caching, fault tolerance, sockets, various IPCs, database interfaces, etc.
Hands-on experience with Redis, MySQL, streaming technologies like Kafka producers and consumers, and NoSQL databases like MongoDB/Cassandra.
Knowledge of version control like Git and deployment processes like CI/CD.

Data Warehouse Architect

Agency job

via The Hub by Sridevi Viswanathan

Mumbai

8 - 10 yrs

₹20L - ₹23L / yr

Data Warehouse (DWH)

ETL

Hadoop

Apache Spark

Spark

+4 more

• You will work alongside the Project Management to ensure alignment of plans with what is being delivered.
• You will utilize your configuration management and software release experience, as well as change management concepts, to drive the success of the projects.
• You will partner with senior leaders to understand and communicate the business needs to translate them into IT requirements. Consult with Customer’s Business Analysts on their Data warehouse requirements.
• You will assist the technical team in identification and resolution of Data Quality issues.
• You will manage small to medium-sized projects relating to the delivery of applications or application changes.
• You will use Managed Services or 3rd party resources to meet application support requirements.
• You will interface daily with multi-functional team members within the EDW team and across the enterprise to resolve issues.
• Recommend and advocate different approaches and designs to the requirements
• Write technical design docs
• Execute Data modelling
• Solution inputs for the presentation layer
• You will craft and generate summary, statistical, and presentation reports, as well as provide reporting and metrics for strategic initiatives.
• Performs miscellaneous job-related duties as assigned

Preferred Qualifications

• Strong interpersonal, teamwork, organizational and workload planning skills
• Strong analytical, evaluative, and problem-solving abilities as well as exceptional customer service orientation
• Ability to drive clarity of purpose and goals during release and planning activities
• Excellent organizational skills including ability to prioritize tasks efficiently with high level of attention to detail
• Excited by the opportunity to continually improve processes within a large company
• Healthcare background/ Automobile background.
• Familiarity with major big data solutions and products available in the market.
• Proven ability to drive continuous

Data Engineer_1

SAP company

Agency job

via Mgneto Resource Management by Sonali Kamani

Mumbai, Navi Mumbai

3 - 8 yrs

₹7L - ₹13L / yr

Data engineering

Apache Kafka

Apache Spark

Hadoop

apache flink

+7 more

Build data systems and pipelines using Apache Flink (or similar).
Understand various raw data input formats, build consumers on Kafka/ksqlDB for them and ingest large amounts of raw data into Flink and Spark.
Conduct complex data analysis and report on results.
Build various aggregation streams for data and convert raw data into various logical processing streams.
Build algorithms to integrate multiple sources of data and create a unified data model from all the sources.
Build a unified data model on both SQL and NoSQL databases to act as data sink.
Communicate the designs effectively with the fullstack engineering team for development.
Explore machine learning models that can be fitted on top of the data pipelines.

Mandatory Qualifications Skills:

Deep knowledge of Scala and Java programming languages is mandatory
Strong background in streaming data frameworks (Apache Flink, Apache Spark) is mandatory
Good understanding and hands on skills on streaming messaging platforms such as Kafka
Familiarity with R, C and Python is an asset
Analytical mind and business acumen with strong math skills (e.g. statistics, algebra)
Problem-solving aptitude
Excellent communication and presentation skills

Data Scientist

Innovative Brand Design Studio

Agency job

via Unnati by Astha Bharadwaj

Mumbai

2 - 5 yrs

₹8L - ₹15L / yr

Data Science

Data Scientist

Python

Tableau

R Programming

+7 more

Come work with a growing consumer market research team that is currently serving one of the biggest FMCG companies in the world.

Our client works with global brands and creates projects that are user-centric. They build cost-effective and compelling product stories that help their clients gain a competitive edge and growth in their brand image. Their team of academicians, designers, startup specialists and other experts works for clients across 12 countries, targeting new markets and solutions with an excellent understanding of end-users.

They work with global brands from the FMCG, Beauty and Hospitality sectors, namely Unilever, Lipton, Lakme, Loreal, AXE etc., who have chosen them for a long-term relationship because of their insights, consumer research, storytelling and content experience. The founder is a design and product activation expert with over 10 years of impact and over 300 completed projects in India, UK, South Asia and USA.

As a Data Scientist, you will help to deliver quantitative consumer primary market research through Survey.

What you will do:

Handling the survey scripting process through the use of survey software platforms such as Toluna, QuestionPro and Decipher.
Mining large & complex data sets using SQL, Hadoop, NoSQL or Spark.
Delivering complex consumer data analysis through the use of software such as R, Python and Excel.
Working on basic statistical analysis such as t-tests and correlation.
Performing more complex data analysis through machine learning techniques such as:

Classification
Regression
Clustering
Text Analysis
Neural Networks

Creating interactive dashboards through the use of software like Tableau or any other software you are able to use.
Working on statistical and mathematical modelling, and application of ML and AI algorithms.

What you need to have:

Bachelor's or Master's degree in a highly quantitative field (CS, machine learning, mathematics, statistics, economics) or equivalent experience.
An opportunity for anyone eager to prove their data analytical skills with one of the biggest FMCG market players.

Azure Data Engineer

at Fragma Data Systems

8 recruiters

Posted by Evelyn Charles

Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune

8 - 15 yrs

₹16L - ₹28L / yr

PySpark

SQL Azure

azure synapse

Windows Azure

Azure Data Engineer

+3 more

Technology Skills:

Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs/IoT Hub.
Experience in migrating on-premise data warehouses to data platforms on the Azure cloud.
Designing and implementing data engineering, ingestion, and transformation functions.

Good to Have:

Experience with Azure Analysis Services
Experience in Power BI
Experience with third-party solutions like Attunity/StreamSets, Informatica
Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
Capacity Planning and Performance Tuning on Azure Stack and Spark.

Data Engineer

at PAGO Analytics India Pvt Ltd

2 recruiters

Posted by Vijay Cheripally

Remote, Bengaluru (Bangalore), Mumbai, NCR (Delhi | Gurgaon | Noida)

2 - 8 yrs

₹8L - ₹15L / yr

Python

PySpark

Microsoft Windows Azure

SQL Azure

Data Analytics

+6 more

Be an integral part of large scale client business development and delivery engagements

Develop the software and systems needed for end-to-end execution on large projects

Work across all phases of SDLC, and use Software Engineering principles to build scaled solutions

Build the knowledge base required to deliver increasingly complex technology projects

Object-oriented languages (e.g. Python, PySpark, Java, C#, C++) and frameworks (e.g. J2EE or .NET)

Database programming using any flavour of SQL

Expertise in relational and dimensional modelling, including big data technologies

Exposure to all SDLC processes, including testing and deployment

Expertise in Microsoft Azure is mandatory, including components like Azure Data Factory, Azure Data Lake Storage, Azure SQL, Azure Databricks, HDInsight, ML Service etc.

Good knowledge of Python and Spark is required

Good understanding of how to enable analytics using cloud technology and ML Ops

Experience in Azure Infrastructure and Azure Dev Ops will be a strong plus

Data Engineer

at Crisp Analytics

8 recruiters

Posted by Seema Pahwa

Mumbai

2 - 6 yrs

₹6L - ₹15L / yr

Big Data

Spark

Scala

Amazon Web Services (AWS)

Apache Kafka

The Data Engineering team is one of the core technology teams of Lumiq.ai and is responsible for creating all the Data related products and platforms which scale for any amount of data, users, and processing. The team also interacts with our customers to work out solutions, create technical architectures and deliver the products and solutions.

If you are someone who is always pondering how to make things better, how technologies can interact, how various tools, technologies, and concepts can help a customer or how a customer can use our products, then Lumiq is the place of opportunities.

Who are you?

Enthusiast is your middle name. You know what’s new in Big Data technologies and how things are moving
Apache is your toolbox and you have been a contributor to open source projects or have discussed the problems with the community on several occasions
You use cloud for more than just provisioning a Virtual Machine
Vim is friendly to you and you know how to exit Nano
You check logs before screaming about an error
You are a solid engineer who writes modular code and commits to Git
You are a doer who doesn’t say “no” without first understanding
You understand the value of documentation of your work
You are familiar with Machine Learning Ecosystem and how you can help your fellow Data Scientists to explore data and create production-ready ML pipelines

Eligibility

Experience

At least 2 years of Data Engineering Experience
Have interacted with Customers

Must Have Skills

Amazon Web Services (AWS) - EMR, Glue, S3, RDS, EC2, Lambda, SQS, SES
Apache Spark
Python
Scala
PostgreSQL
Git
Linux

Good to have Skills

Apache NiFi
Apache Kafka
Apache Hive
Docker
Amazon Certification

Data Engineer

at Codalyze Technologies

4 recruiters

Posted by Aishwarya Hire

Mumbai

3 - 9 yrs

₹5L - ₹12L / yr

Apache Hive

Hadoop

Scala

Spark

Amazon Web Services (AWS)

+2 more

Job Overview :

Your mission is to help lead the team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission-critical problems, internally and externally. Your quest to embrace leading-edge technologies and methodologies inspires your team to follow suit.

Responsibilities and Duties :

- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data: structured, semi-structured & unstructured. Big data knowledge, especially in Spark & Hive, is highly preferred.

- Work in a team and provide proactive technical oversight; advise development teams, fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions

Education level :

- Bachelor's degree in Computer Science or equivalent

Experience :

- Minimum 3+ years of relevant experience working on production-grade projects, with hands-on, end-to-end software development experience

- Expertise in application, data and infrastructure architecture disciplines

- Expertise in designing data integrations using ETL and other data integration patterns

- Advanced knowledge of architecture, design and business processes

Proficiency in :

- Modern programming languages like Java, Python, Scala

- Big Data technologies Hadoop, Spark, HIVE, Kafka

- Writing decently optimized SQL queries

- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)

- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions

- Knowledge of system development lifecycle methodologies, such as waterfall and AGILE.

- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional modeling, and metadata modeling practices.

- Experience generating physical data models and the associated DDL from logical data models.

- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping, and data rationalization artifacts.

- Experience enforcing data modeling standards and procedures.

- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.

- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals

Skills :

Must Know :

- Core big-data concepts

- Spark - PySpark/Scala

- Data integration tool like Pentaho, Nifi, SSIS, etc (at least 1)

- Handling of various file formats

- Cloud platform - AWS/Azure/GCP

- Orchestration tool - Airflow

Data Engineer

at Codalyze Technologies

4 recruiters

Posted by Aishwarya Hire

Mumbai

3 - 7 yrs

₹7L - ₹20L / yr

Hadoop

Big Data

Scala

Spark

Amazon Web Services (AWS)

+3 more

Job Overview :

Your mission is to help lead the team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission-critical problems, internally and externally. Your quest to embrace leading-edge technologies and methodologies inspires your team to follow suit.

Responsibilities and Duties :

- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data: structured, semi-structured & unstructured. Big data knowledge, especially in Spark & Hive, is highly preferred.

- Work in a team and provide proactive technical oversight; advise development teams, fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions

Education level :

- Bachelor's degree in Computer Science or equivalent

Experience :

- Minimum 5+ years of relevant experience working on production-grade projects, with hands-on, end-to-end software development experience

- Expertise in application, data and infrastructure architecture disciplines

- Expertise in designing data integrations using ETL and other data integration patterns

- Advanced knowledge of architecture, design and business processes

Proficiency in :

- Modern programming languages like Java, Python, Scala

- Big Data technologies Hadoop, Spark, HIVE, Kafka

- Writing decently optimized SQL queries

- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)

- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions

- Knowledge of system development lifecycle methodologies, such as Waterfall and Agile.

- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional modeling, and metadata modeling practices.

- Experience generating physical data models and the associated DDL from logical data models.

- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping, and data rationalization artifacts.

- Experience enforcing data modeling standards and procedures.

- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.

- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals

Skills :

Must Know :

- Core big-data concepts

- Spark - PySpark/Scala

- Data integration tool like Pentaho, Nifi, SSIS, etc (at least 1)

- Handling of various file formats

- Cloud platform - AWS/Azure/GCP

- Orchestration tool - Airflow

Core Java Developer

at Globant

2 recruiters

Posted by Risha P

Mumbai

5 - 10 yrs

₹12L - ₹20L / yr

Object Oriented Programming (OOPs)

Shell Scripting

Java

SOAP

JSON

+8 more

We are looking for a Java developer for one of our major investment banking clients who can take ownership of the whole end-to-end delivery, performing analysis, design, coding, testing and maintenance of large-scale and distributed applications. Please find the JD for your reference.

Job Profile : Java Developer

Location : Mumbai

Description :

A core Java developer is required for a Tier 1 investment bank supporting the Delta One Structured Products IT group. This is a global front-office team that supports the global OTC Equity Swap Portfolio, Single Name, and Index derivative businesses. We are designing a complete restructure of the Equity Swaps trading platform, and this particular role is within the core cash flow and valuations area. The role will require the candidate to work closely with the cash flow engines team to solve problems that combine both finance and technology.

This is an exciting hands-on role for a self-starter who has a thirst for new challenges as well as new technologies. The candidate should possess good analytical skills, strong software engineering skills and a logical approach to problem-solving, be able to work in a fast-paced environment liaising with demanding stakeholders to understand complex requirements, and be able to prioritize work under pressure with minimal supervision. The candidate should be a problem solver and bring positivity and enthusiasm to thinking about and offering potential solutions for architectural considerations.

Position Profile :

We are looking for someone to help own problems and be able to demonstrate leadership and responsibility for the delivery of new features. As part of the development cycle, you would be expected to write quality unit tests, supply documentation if relevant for new feature build-outs, and be involved in the test cycle (UAT, integration, regression) for the delivery and fixing of bugs for your new features.

Although the role is predominantly Java, we require someone who is flexible with the development environment, as some days you might be writing Java, and other days you might be fixing stored procedures or Perl scripts. You would be expected to get involved in the Level 3 production support rota, which is shared between our developers on a monthly cycle, and to occasionally help with weekend deployment activities to deploy and verify any code changes you have been involved in.

Team Profile :

The team and role are ideal for someone looking for a strong career development path with many opportunities to grow, learn and develop. The role requires someone who is flexible and able to respond to a dynamic business environment. The candidate must be adaptable to work across multiple technologies and disciplines, with a focus on delivering quality solutions for the business in a timely fashion. This role suits people experienced in complex data domains.

Required Skills :

- Experience of agile and scrum methodologies.

- Core Java.

- Unix shell scripting.

- SQL and relational databases such as DB2.

- Integration technologies - MQ/XML/SOAP/JSON/Protocol Buffers/Spring.

- Enterprise Architecture Patterns, GoF design

- Build & agile - Ant, Gradle/Maven, Sonar, Jenkins/Hudson, Git/Perforce.

- Sound understanding of Object Oriented Analysis, Design and Programming.

- Strong communication and stakeholder management skills

- Scala/Spark or big data will be an added advantage

- Candidate must have good experience in databases.

- Excellent communication and problem-solving skills.

Desired Skills :

- Experience in banking and regulatory reporting (SFTR, MAS/ASIC etc.)

- Knowledge of OTC, listed and cash products

- Domain driven design and micro-services

Hadoop Developer

at Pion Global Solutions LTD

2 recruiters

Posted by Sheela P

Mumbai

3 - 100 yrs

₹4L - ₹15L / yr

Spark

Big Data

Hadoop

HDFS

Apache Sqoop

+2 more

Looking for Big Data developers in Mumbai.

Technical Architect/CTO

at Arque Capital

2 recruiters

Posted by Hrishabh Sanghvi

Mumbai

5 - 11 yrs

₹15L - ₹30L / yr

C++

Big Data

Technical Architecture

Cloud Computing

Python

+4 more

ABOUT US:

Arque Capital is a FinTech startup working with AI in Finance in domains like Asset Management (Hedge Funds, ETFs and Structured Products), Robo Advisory, Bespoke Research, Alternate Brokerage, and other applications of Technology & Quantitative methods in Big Finance.

PROFILE DESCRIPTION:

1. Get the "Tech" in order for the Hedge Fund - Help answer fundamental questions about the technology blocks to be used and the choice of one platform/tech over another, and help the team visualize the product with the available resources and assets

2. Build, manage, and validate a Tech Roadmap for our Products

3. Architecture Practices - At startups, the dynamics change very fast. Making sure that best practices are defined and followed by the team is very important. The CTO may have to play the garbage guy and clean up the code from time to time. Reviewing code quality is an important activity that the CTO should carry out.

4. Build a progressive learning culture and establish a predictable model of envisioning, designing and developing products

5. Product Innovation through Research and continuous improvement

6. Build out the Technological Infrastructure for the Hedge Fund

7. Hiring and building out the Technology team

8. Setting up and managing the entire IT infrastructure - Hardware as well as Cloud

9. Ensure company-wide security and IP protection

REQUIREMENTS:

- Computer Science Engineer from Tier-I colleges only (IIT, IIIT, NIT, BITS, DHU, Anna University, MU)

- 5-10 years of relevant Technology experience (no infra or database persons)

- Expertise in Python and C++ (3+ years minimum)

- 2+ years of experience building and managing Big Data projects

- Experience with technical design & architecture (1+ years minimum)

- Experience with High performance computing - OPTIONAL

- Experience as a Tech Lead, IT Manager, Director, VP, or CTO

- 1+ year of experience managing Cloud computing infrastructure (Amazon AWS preferred) - OPTIONAL

- Ability to work in an unstructured environment

- Looking to work in a small, start-up type environment based out of Mumbai

COMPENSATION:

Co-Founder status and Equity partnership

Hadoop Developer

at Accion Labs

14 recruiters

Posted by Neha Mayekar

Mumbai

5 - 14 yrs

₹8L - ₹18L / yr

HDFS

Hbase

Spark

Flume

hive

+2 more

US-based multinational company. Hands-on Hadoop.

Product Tech Lead

at Ixsight Technologies Pvt Ltd

2 recruiters

Posted by Uma Venkataraman

Pune, Mumbai

3 - 9 yrs

₹5L - ₹14L / yr

C++

Architecture

Spark

Ixsight Technologies is an innovative IT company with strong Intellectual Property. Ixsight is focused on creating Customer Data Value through its solutions for Identity Management, Locational Analytics, Address Science and Customer Engagement. Ixsight is also adapting its solutions to Big Data and Cloud. We are in the process of creating new solutions across platforms. Ixsight has served 80+ clients in India for various end-user applications across the traditional BFSI and telecom sectors. Recently we have also been catering to new-generation verticals such as hospitality and e-commerce. Ixsight has been featured in Gartner's India Technology Hype Cycle and has been recognised by both clients and peers for pioneering and excellent solutions. If you wish to play a direct part in creating new products, building IP and being part of Product Creation, Ixsight is the place.
