Cutshort logo
Spark Jobs in Mumbai

27+ Spark Jobs in Mumbai | Spark Job openings in Mumbai

Apply to 27+ Spark Jobs in Mumbai on CutShort.io. Explore the latest Spark Job opportunities across top companies like Google, Amazon & Adobe.

icon
Mumbai
5 - 10 yrs
₹8L - ₹20L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm
+6 more


Data Scientist – Delivery & New Frontiers Manager 

Job Description:   

We are seeking highly skilled and motivated data scientist to join our Data Science team. The successful candidate will play a pivotal role in our data-driven initiatives and be responsible for designing, developing, and deploying data science solutions that drives business values for stakeholders. This role involves mapping business problems to a formal data science solution, working with wide range of structured and unstructured data, architecture design, creating sophisticated models, setting up operations for the data science product with the support from MLOps team and facilitating business workshops. In a nutshell, this person will represent data science and provide expertise in the full project cycle. Expectation of the successful candidate will be above that of a typical data scientist. Beyond technical expertise, problem solving in complex set-up will be key to the success for this role. 

Responsibilities: 

  • Collaborate with cross-functional teams, including software engineers, product managers, and business stakeholders, to understand business needs and identify data science opportunities. 
  • Map complex business problems to data science problem, design data science solution using GCP/Azure Databricks platform. 
  • Collect, clean, and preprocess large datasets from various internal and external sources.  
  • Streamlining data science process working with Data Engineering, and Technology teams. 
  • Managing multiple analytics projects within a Function to deliver end-to-end data science solutions, creation of insights and identify patterns.  
  • Develop and maintain data pipelines and infrastructure to support the data science projects 
  • Communicate findings and recommendations to stakeholders through data visualizations and presentations. 
  • Stay up to date with the latest data science trends and technologies, specifically for GCP companies 

 

Education / Certifications:  

Bachelor’s or Master’s in Computer Science, Engineering, Computational Statistics, Mathematics. 

Job specific requirements:  

  • Brings 5+ years of deep data science experience 

∙       Strong knowledge of machine learning and statistical modeling techniques in a in a clouds-based environment such as GCP, Azure, Amazon 

  • Experience with programming languages such as Python, R, Spark 
  • Experience with data visualization tools such as Tableau, Power BI, and D3.js 
  • Strong understanding of data structures, algorithms, and software design principles 
  • Experience with GCP platforms and services such as Big Query, Cloud ML Engine, and Cloud Storage 
  • Experience in configuring and setting up the version control on Code, Data, and Machine Learning Models using GitHub. 
  • Self-driven, be able to work with cross-functional teams in a fast-paced environment, adaptability to the changing business needs. 
  • Strong analytical and problem-solving skills 
  • Excellent verbal and written communication skills 
  • Working knowledge with application architecture, data security and compliance team. 


Read more
Databook

at Databook

5 candid answers
1 video
Nikhil Mohite
Posted by Nikhil Mohite
Mumbai
1 - 3 yrs
Upto ₹20L / yr (Varies
)
Data engineering
Python
Apache Kafka
Spark
Amazon Web Services (AWS)
+1 more

Lightning Job By Cutshort ⚡

 

As part of this feature, you can expect status updates about your application and replies within 72 hours (once the screening questions are answered)

 

 

About Databook:-

- Great salespeople let their customers’ strategies do the talking.

 

Databook’s award-winning Strategic Relationship Management (SRM) platform uses advanced AI and NLP to empower the world’s largest B2B sales teams to create, manage, and maintain strategic relationships at scale. The platform ingests and interprets billions of financial and market data signals to generate actionable sales strategies that connect the seller’s solutions to a buyer’s financial pain and urgency.

 

The Opportunity

We're seeking Junior Engineers to support and develop Databook’s capabilities. Working closely with our seasoned engineers, you'll contribute to crafting new features and ensuring our platform's reliability. If you're eager about playing a part in building the future of customer intelligence, with a keen eye towards quality, we'd love to meet you!

 

Specifically, you'll

- Participate in various stages of the engineering lifecycle alongside our experienced engineers.

- Assist in maintaining and enhancing features of the Databook platform.

- Collaborate with various teams to comprehend requirements and aid in implementing technology solutions.

 

Please note: As you progress and grow with us, you might be introduced to on-call rotations to handle any platform challenges.

 

Working Arrangements:

- This position offers a hybrid work mode, allowing employees to work both remotely and in-office as mutually agreed upon.

 

What we're looking for

- 1-2+ years experience as a Data Engineer

- Bachelor's degree in Engineering

- Willingness to work across different time zones

- Ability to work independently

- Knowledge of cloud (AWS or Azure)

- Exposure to distributed systems such as Spark, Flink or Kafka

- Fundamental knowledge of data modeling and optimizations

- Minimum of one year of experience using Python working as a Software Engineer

- Knowledge of SQL (Postgres) databases would be beneficial

- Experience with building analytics dashboard

- Familiarity with RESTful APIs and/or GraphQL is welcomed

- Hand-on experience with Numpy, Pandas, SpaCY would be a plus

- Exposure or working experience on GenAI (LLMs in general), LLMOps would be a plus

- Highly fluent in both spoken and written English language

 

Ideal candidates will also have:

- Self-motivated with great organizational skills.

- Ability to focus on small and subtle details.

- Are willing to learn and adapt in a rapidly changing environment.

- Excellent written and oral communication skills.

 

Join us and enjoy these perks!

- Competitive salary with bonus

- Medical insurance coverage

- 5 weeks leave plus public holidays

- Employee referral bonus program

- Annual learning stipend to spend on books, courses or other training materials that help you develop skills relevant to your role or professional development

- Complimentary subscription to Masterclass

Read more
Quinnox

at Quinnox

2 recruiters
MidhunKumar T
Posted by MidhunKumar T
Bengaluru (Bangalore), Mumbai
10 - 15 yrs
₹30L - ₹35L / yr
ADF
azure data lake services
SQL Azure
azure synapse
Spark
+4 more

Mandatory Skills: Azure Data Lake Storage, Azure SQL databases, Azure Synapse, Data Bricks (Pyspark/Spark), Python, SQL, Azure Data Factory.


Good to have: Power BI, Azure IAAS services, Azure Devops, Microsoft Fabric


Ø Very strong understanding on ETL and ELT

Ø Very strong understanding on Lakehouse architecture.

Ø Very strong knowledge in Pyspark and Spark architecture.

Ø Good knowledge in Azure data lake architecture and access controls

Ø Good knowledge in Microsoft Fabric architecture

Ø Good knowledge in Azure SQL databases

Ø Good knowledge in T-SQL

Ø Good knowledge in CI /CD process using Azure devops

Ø Power BI

Read more
Neysa Networks Pvt Ltd

at Neysa Networks Pvt Ltd

2 candid answers
Swapna Uchil
Posted by Swapna Uchil
Mumbai
7 - 10 yrs
Best in industry
TensorFlow
PyTorch
Python
Hadoop
R Programming
+9 more

Day in the life...


As a machine learning engineer at Neysa, you would be required to

- Collaborate with network engineers and IT teams to identify network-related challenges and areas where ML can provide solutions. Understand the specific network remediation problems that need to be addressed.

- Develop ML-based models and algorithms specific to issues that affect computer networks, for example, congestion, security threats and human errors.

- Be comfortable handling multiple types and data sources and then pre-process and clean them for modelling purposes. 

- Develop machine learning models that analyse networking data to predict (and possibly prevent) issues, detect anomalies or optimise performance. Choose the right approach for each, such as deep learning, reinforcement learning, or traditional statistical methods. 

- Train and evaluate the efficacy of your models, create performance metrics to assess robustness and effectiveness, and use feedback loops to make course corrections.

- Design solutions that can scale to handle large network environments efficiently. This typically means optimising execution latency and resource usage.  

- Integrate your model to work with running and maintaining a network. These could be with other machines, human operators, or both. 

- Document your design, architecture and your thought process. Work with the technical writers to make sure your message gets through.

- Stay updated with all that is changing in AI and ML.

 

Must have skills

 

- You should have expertise in machine learning algorithms, data processing, and model development. It would be best to have proficiency in associated frameworks such as TensorFlow, PyTorch or scikit-learn.

- You should understand how computer networks work, how they are used, how they fail, and what happens if they fail.

- You should be proficient in programming languages like Python, R, Go, LISP, etc. It would help if you also were very useful with old-school shell scripting. 

- Experience with data processing tools like Hadoop, Spark or Kafka. 

- An above-average understanding of Linux and operating systems in general.

 

What separates the best from the rest

 

ML and AI are continuously evolving, and so is understanding how these technologies can be applied to solve real-world problems. To do your best, you may also need to

- Conceptualise ways to map new problems to existing methodologies or create new ones.

- Be prepared to iterate, reiterate and then iterate your approach.

- Be able to interact with subject matter experts in multiple fields to identify potential use cases for machine learning.

 

What you can expect

 

An environment where you can do your best work...

- The best equipment that complements your talents

- The best tools in the business for you to bring your creations to life

- A great environment

- Flexible work hours and flexible work locations

- The opportunity to make your mark and shape the future

- And have fun...

Read more
Exponentia.ai

at Exponentia.ai

1 product
1 recruiter
Vipul Tiwari
Posted by Vipul Tiwari
Mumbai
4 - 6 yrs
₹12L - ₹19L / yr
ETL
Informatica
Data Warehouse (DWH)
databricks
Amazon Web Services (AWS)
+6 more

 Job DescriptionPosition: Sr Data Engineer – Databricks & AWS

Experience: 4 - 5 Years

 

Company Profile:


Exponentia.ai is an AI tech organization with a presence across India, Singapore, the Middle East, and the UK. We are an innovative and disruptive organization, working on cutting-edge technology to help our clients transform into the enterprises of the future. We provide artificial intelligence-based products/platforms capable of automated cognitive decision-making to improve productivity, quality, and economics of the underlying business processes. Currently, we are transforming ourselves and rapidly expanding our business.

Exponentia.ai has developed long-term relationships with world-class clients such as PayPal, PayU, SBI Group, HDFC Life, Kotak Securities, Wockhardt and Adani Group amongst others.

One of the top partners of Cloudera (leading analytics player) and Qlik (leader in BI technologies), Exponentia.ai has recently been awarded the ‘Innovation Partner Award’ by Qlik in 2017.

Get to know more about us on our website: http://www.exponentia.ai/ and Life @Exponentia.

 

​Role Overview: 


·         A Data Engineer understands the client requirements and develops and delivers the data engineering solutions as per the scope.

·         The role requires good skills in the development of solutions using various services required for data architecture on Databricks Delta Lake, streaming, AWS, ETL Development, and data modeling.

 

Job Responsibilities


•         Design of data solutions on Databricks including delta lake, data warehouse, data marts and other data solutions to support the analytics needs of the organization.

•         Apply best practices during design in data modeling (logical, physical) and ETL pipelines (streaming and batch) using cloud-based services.

•         Design, develop and manage the pipelining (collection, storage, access), data engineering (data quality, ETL, Data Modelling) and understanding (documentation, exploration) of the data.

•         Interact with stakeholders regarding data landscape understanding, conducting discovery exercises, developing proof of concepts and demonstrating it to stakeholders.

 

Technical Skills 


•         Has more than 2 Years of experience in developing data lakes, and datamarts on the Databricks platform.

•         Proven skill sets in AWS Data Lake services such as - AWS Glue, S3, Lambda, SNS, IAM, and skills in Spark, Python, and SQL.

•         Experience in Pentaho

•         Good understanding of developing a data warehouse, data marts etc.

•         Has a good understanding of system architectures, and design patterns and should be able to design and develop applications using these principles.

 

Personality Traits


•         Good collaboration and communication skills

•         Excellent problem-solving skills to be able to structure the right analytical solutions.

•         Strong sense of teamwork, ownership, and accountability

•         Analytical and conceptual thinking 

•         Ability to work in a fast-paced environment with tight schedules.

•         Good presentation skills with the ability to convey complex ideas to peers and management.

 

Education:

 

BE / ME / MS/MCA.

    



Read more
Consulting
Agency job
via Michael Page by Pratanu Chakraborty
Pune, Mumbai
6 - 8 yrs
₹5L - ₹20L / yr
Python
Spark
SQL
6-8 years of hands-on development experience using core Python
Hands-on experience with Spark and SQL
Good to have java knowledge
Read more
Encubate Tech Private Ltd
Mumbai
5 - 6 yrs
₹15L - ₹20L / yr
Amazon Web Services (AWS)
Amazon Redshift
Data modeling
ITL
Agile/Scrum
+7 more

Roles and

Responsibilities

Seeking AWS Cloud Engineer /Data Warehouse Developer for our Data CoE team to

help us in configure and develop new AWS environments for our Enterprise Data Lake,

migrate the on-premise traditional workloads to cloud. Must have a sound

understanding of BI best practices, relational structures, dimensional data modelling,

structured query language (SQL) skills, data warehouse and reporting techniques.

 Extensive experience in providing AWS Cloud solutions to various business

use cases.

 Creating star schema data models, performing ETLs and validating results with

business representatives

 Supporting implemented BI solutions by: monitoring and tuning queries and

data loads, addressing user questions concerning data integrity, monitoring

performance and communicating functional and technical issues.

Job Description: -

This position is responsible for the successful delivery of business intelligence

information to the entire organization and is experienced in BI development and

implementations, data architecture and data warehousing.

Requisite Qualification

Essential

-

AWS Certified Database Specialty or -

AWS Certified Data Analytics

Preferred

Any other Data Engineer Certification

Requisite Experience

Essential 4 -7 yrs of experience

Preferred 2+ yrs of experience in ETL & data pipelines

Skills Required

Special Skills Required

 AWS: S3, DMS, Redshift, EC2, VPC, Lambda, Delta Lake, CloudWatch etc.

 Bigdata: Databricks, Spark, Glue and Athena

 Expertise in Lake Formation, Python programming, Spark, Shell scripting

 Minimum Bachelor’s degree with 5+ years of experience in designing, building,

and maintaining AWS data components

 3+ years of experience in data component configuration, related roles and

access setup

 Expertise in Python programming

 Knowledge in all aspects of DevOps (source control, continuous integration,

deployments, etc.)

 Comfortable working with DevOps: Jenkins, Bitbucket, CI/CD

 Hands on ETL development experience, preferably using or SSIS

 SQL Server experience required

 Strong analytical skills to solve and model complex business requirements

 Sound understanding of BI Best Practices/Methodologies, relational structures,

dimensional data modelling, structured query language (SQL) skills, data

warehouse and reporting techniques

Preferred Skills

Required

 Experience working in the SCRUM Environment.

 Experience in Administration (Windows/Unix/Network/Database/Hadoop) is a

plus.

 Experience in SQL Server, SSIS, SSAS, SSRS

 Comfortable with creating data models and visualization using Power BI

 Hands on experience in relational and multi-dimensional data modelling,

including multiple source systems from databases and flat files, and the use of

standard data modelling tools

 Ability to collaborate on a team with infrastructure, BI report development and

business analyst resources, and clearly communicate solutions to both

technical and non-technical team members

Read more
HCL Technologies

at HCL Technologies

3 recruiters
Agency job
via Saiva System by Sunny Kumar
Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Bengaluru (Bangalore), Hyderabad, Chennai, Pune, Mumbai, Kolkata
5 - 10 yrs
₹5L - ₹20L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more
Exp- 5 + years
Skill- Spark and Scala along with Azure
Location - Pan India

Looking for someone Bigdata along with Azure
Read more
LogiNext

at LogiNext

1 video
7 recruiters
Rakhi Daga
Posted by Rakhi Daga
Mumbai
2 - 3 yrs
₹8L - ₹12L / yr
Java
C++
Scala
Spark

LogiNext is looking for a technically savvy and passionate Software Engineer - Data Science to analyze large amounts of raw information to find patterns that will help improve our company. We will rely on you to build data products to extract valuable business insights.

In this role, you should be highly analytical with a knack for analysis, math and statistics. Critical thinking and problem-solving skills are essential for interpreting data. We also want to see a passion for machine-learning and research.

Your goal will be to help our company analyze trends to make better decisions. Without knowledge of how the software works, data scientists might have difficulty in work. Apart from experience in developing R and Python, they must know modern approaches to software development and their impact. DevOps continuous integration and deployment, experience in cloud computing are everyday skills to manage and process data.

Responsibilities:

Identify valuable data sources and automate collection processes Undertake preprocessing of structured and unstructured data Analyze large amounts of information to discover trends and patterns Build predictive models and machine-learning algorithms Combine models through ensemble modeling Present information using data visualization techniques Propose solutions and strategies to business challenges Collaborate with engineering and product development teams


Requirements:

Bachelors degree or higher in Computer Science, Information Technology, Information Systems, Statistics, Mathematics, Commerce, Engineering, Business Management, Marketing or related field from top-tier school 2 to 3 year experince in in data mining, data modeling, and reporting. Understading of SaaS based products and services. Understanding of machine-learning and operations research Experience of R, SQL and Python; familiarity with Scala, Java or C++ is an asset Experience using business intelligence tools (e.g. Tableau) and data frameworks (e.g. Hadoop) Analytical mind and business acumen and problem-solving aptitude Excellent communication and presentation skills Proficiency in Excel for data management and manipulation Experience in statistical modeling techniques and data wrangling Able to work independently and set goals keeping business objectives in mind

Read more
LogiNext

at LogiNext

1 video
7 recruiters
Rakhi Daga
Posted by Rakhi Daga
Mumbai
4 - 7 yrs
₹12L - ₹19L / yr
Machine Learning (ML)
Data Science
PHP
Java
Spark
+1 more

LogiNext is looking for a technically savvy and passionate Senior Software Engineer - Data Science to analyze large amounts of raw information to find patterns that will help improve our company. We will rely on you to build data products to extract valuable business insights.

In this role, you should be highly analytical with a knack for analysis, math and statistics. Critical thinking and problem-solving skills are essential for interpreting data. We also want to see a passion for machine-learning and research.

Your goal will be to help our company analyze trends to make better decisions. Without knowledge of how the software works, data scientists might have difficulty in work. Apart from experience in developing R and Python, they must know modern approaches to software development and their impact. DevOps continuous integration and deployment, experience in cloud computing are everyday skills to manage and process data.

Responsibilities :

Adapting and enhancing machine learning techniques based on physical intuition about the domain Design sampling methodology, prepare data, including data cleaning, univariate analysis, missing value imputation, , identify appropriate analytic and statistical methodology, develop predictive models and document process and results Lead projects both as a principal investigator and project manager, responsible for meeting project requirements on schedule and on budget Coordinate and lead efforts to innovate by deriving insights from heterogeneous sets of data generated by our suite of Aerospace products Support and mentor data scientists Maintain and work with our data pipeline that transfers and processes several terabytes of data using Spark, Scala, Python, Apache Kafka, Pig/Hive & Impala Work directly with application teams/partners (internal clients such as Xbox, Skype, Office) to understand their offerings/domain and help them become successful with data so they can run controlled experiments (a/b testing) Understand the data generated by experiments, and producing actionable, trustworthy conclusions from them Apply data analysis, data mining and data processing to present data clearly and develop experiments (ab testing) Work with development team to build tools for data logging and repeatable data tasks tol accelerate and automate data scientist duties


Requirements:

Bachelor’s or Master’s degree in Computer Science, Math, Physics, Engineering, Statistics or other technical field. PhD preferred 4 to 7 years of experience in data mining, data modeling, and reporting 3+ years of experience working with large data sets or do large scale quantitative analysis Expert SQL scripting required Development experience in one of the following: Scala, Java, Python, Perl, PHP, C++ or C# Experience working with Hadoop, Pig/Hive, Spark, MapReduce Ability to drive projects Basic understanding of statistics – hypothesis testing, p-values, confidence intervals, regression, classification, and optimization are core lingo Analysis - Should be able to perform Exploratory Data Analysis and get actionable insights from the data, with impressive visualization. Modeling - Should be familiar with ML concepts and algorithms; understanding of the internals and pros/cons of models is required. Strong algorithmic problem-solving skills Experience manipulating large data sets through statistical software (ex. R, SAS) or other methods Superior verbal, visual and written communication skills to educate and work with cross functional teams on controlled experiments Experimentation design or A/B testing experience is preferred. Experince in team management.

Read more
Tier 1 MNC
Chennai, Pune, Bengaluru (Bangalore), Noida, Gurugram, Kochi (Cochin), Coimbatore, Hyderabad, Mumbai, Navi Mumbai
3 - 12 yrs
₹3L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+1 more
Greetings,
We are hiring for Tier 1 MNC for the software developer with good knowledge in Spark,Hadoop and Scala
Read more
Nascentvision

at Nascentvision

1 recruiter
Shanu Mohan
Posted by Shanu Mohan
Gurugram, Mumbai, Bengaluru (Bangalore)
2 - 4 yrs
₹10L - ₹17L / yr
Python
PySpark
Amazon Web Services (AWS)
Spark
Scala
+2 more
  • Hands-on experience in any Cloud Platform
· Versed in Spark, Scala/python, SQL
  • Microsoft Azure Experience
· Experience working on Real Time Data Processing Pipeline
Read more
AI-powered Growth Marketing platform
Mumbai, Bengaluru (Bangalore)
2 - 7 yrs
₹8L - ₹25L / yr
Java
NOSQL Databases
MongoDB
Cassandra
Apache
+3 more
The Impact You Will Create
  • Build campaign generation services which can send app notifications at a speed of 10 million a minute
  • Dashboards to show Real time key performance indicators to clients
  • Develop complex user segmentation engines which creates segments on Terabytes of data within few seconds
  • Building highly available & horizontally scalable platform services for ever growing data
  • Use cloud based services like AWS Lambda for blazing fast throughput & auto scalability
  • Work on complex analytics on terabytes of data like building Cohorts, Funnels, User path analysis, Recency Frequency & Monetary analysis at blazing speed
  • You will build backend services and APIs to create scalable engineering systems.
  • As an individual contributor, you will tackle some of our broadest technical challenges that requires deep technical knowledge, hands-on software development and seamless collaboration with all functions.
  • You will envision and develop features that are highly reliable and fault tolerant to deliver a superior customer experience.
  • Collaborating various highly-functional teams in the company to meet deliverables throughout the software development lifecycle.
  • Identify and improvise areas of improvement through data insights and research.
What we look for?
  • 2-5 years of experience in backend development and must have worked on Java/shell/Perl/python scripting.
  • Solid understanding of engineering best practices, continuous integration, and incremental delivery.
  • Strong analytical skills, debugging and troubleshooting skills, product line analysis.
  • Follower of agile methodology (Sprint planning, working on JIRA, retrospective etc).
  • Proficiency in usage of tools like Docker, Maven, Jenkins and knowledge on frameworks in Java like spring, spring boot, hibernate, JPA.
  • Ability to design application modules using various concepts like object oriented, multi-threading, synchronization, caching, fault tolerance, sockets, various IPCs, database interfaces etc.
  • Hands on experience on Redis, MySQL and streaming technologies like Kafka producer consumers and NoSQL databases like mongo dB/Cassandra.
  • Knowledge about versioning like Git and deployment processes like CICD.
Read more
Data Warehouse Architect
Agency job
via The Hub by Sridevi Viswanathan
Mumbai
8 - 10 yrs
₹20L - ₹23L / yr
Data Warehouse (DWH)
ETL
Hadoop
Apache Spark
Spark
+4 more
• You will work alongside the Project Management to ensure alignment of plans with what is being
delivered.
• You will utilize your configuration management and software release experience; as well as
change management concepts to drive the success of the projects.
• You will partner with senior leaders to understand and communicate the business needs to
translate them into IT requirements. Consult with Customer’s Business Analysts on their Data
warehouse requirements
• You will assist the technical team in identification and resolution of Data Quality issues.
• You will manage small to medium-sized projects relating to the delivery of applications or
application changes.
• You will use Managed Services or 3rd party resources to meet application support requirements.
• You will interface daily with multi-functional team members within the EDW team and across the
enterprise to resolve issues.
• Recommend and advocate different approaches and designs to the requirements
• Write technical design docs
• Execute Data modelling
• Solution inputs for the presentation layer
• You will craft and generate summary, statistical, and presentation reports; as well as provide reporting and metrics for strategic initiatives.
• Performs miscellaneous job-related duties as assigned

Preferred Qualifications

• Strong interpersonal, teamwork, organizational and workload planning skills
• Strong analytical, evaluative, and problem-solving abilities as well as exceptional customer service orientation
• Ability to drive clarity of purpose and goals during release and planning activities
• Excellent organizational skills including ability to prioritize tasks efficiently with high level of attention to detail
• Excited by the opportunity to continually improve processes within a large company
• Healthcare background/ Automobile background.
• Familiarity with major big data solutions and products available in the market.
• Proven ability to drive continuous
Read more
Oneture Technologies

at Oneture Technologies

1 recruiter
Ravi Mevcha
Posted by Ravi Mevcha
Mumbai, Navi Mumbai
2 - 4 yrs
₹8L - ₹12L / yr
Spark
Big Data
ETL
Data engineering
ADF
+4 more

Job Overview


We are looking for a Data Engineer to join our data team to solve data-driven critical

business problems. The hire will be responsible for expanding and optimizing the existing

end-to-end architecture including the data pipeline architecture. The Data Engineer will

collaborate with software developers, database architects, data analysts, data scientists and platform team on data initiatives and will ensure optimal data delivery architecture is

consistent throughout ongoing projects. The right candidate should have hands on in

developing a hybrid set of data-pipelines depending on the business requirements.

Responsibilities

  • Develop, construct, test and maintain existing and new data-driven architectures.
  • Align architecture with business requirements and provide solutions which fits best
  • to solve the business problems.
  • Build the infrastructure required for optimal extraction, transformation, and loading
  • of data from a wide variety of data sources using SQL and Azure ‘big data’
  • technologies.
  • Data acquisition from multiple sources across the organization.
  • Use programming language and tools efficiently to collate the data.
  • Identify ways to improve data reliability, efficiency and quality
  • Use data to discover tasks that can be automated.
  • Deliver updates to stakeholders based on analytics.
  • Set up practices on data reporting and continuous monitoring

Required Technical Skills

  • Graduate in Computer Science or in similar quantitative area
  • 1+ years of relevant work experience as a Data Engineer or in a similar role.
  • Advanced SQL knowledge, Data-Modelling and experience working with relational
  • databases, query authoring (SQL) as well as working familiarity with a variety of
  • databases.
  • Experience in developing and optimizing ETL pipelines, big data pipelines, and datadriven
  • architectures.
  • Must have strong big-data core knowledge & experience in programming using Spark - Python/Scala
  • Experience with orchestrating tool like Airflow or similar
  • Experience with Azure Data Factory is good to have
  • Build processes supporting data transformation, data structures, metadata,
  • dependency and workload management.
  • Experience supporting and working with cross-functional teams in a dynamic
  • environment.
  • Good understanding of Git workflow, Test-case driven development and using CICD
  • is good to have
  • Good to have some understanding of Delta tables It would be advantage if the candidate also have below mentioned experience using
  • the following software/tools:
  • Experience with big data tools: Hadoop, Spark, Hive, etc.
  • Experience with relational SQL and NoSQL databases
  • Experience with cloud data services
  • Experience with object-oriented/object function scripting languages: Python, Scala, etc.
Read more
SAP company
Mumbai, Navi Mumbai
3 - 8 yrs
₹7L - ₹13L / yr
Data engineering
Apache Kafka
Apache Spark
Hadoop
apache flink
+7 more
Build data systems and pipelines using Apache Flink (or similar) pipelines.
Understand various raw data input formats, build consumers on Kafka/ksqldb for them and ingest large amounts of raw data into Flink and Spark.
Conduct complex data analysis and report on results.
Build various aggregation streams for data and convert raw data into various logical processing streams.
Build algorithms to integrate multiple sources of data and create a unified data model from all the sources.
Build a unified data model on both SQL and NO-SQL databases to act as data sink.
Communicate the designs effectively with the fullstack engineering team for development.
Explore machine learning models that can be fitted on top of the data pipelines.

Mandatory Qualifications Skills:

Deep knowledge of Scala and Java programming languages is mandatory
Strong background in streaming data frameworks (Apache Flink, Apache Spark) is mandatory
Good understanding and hands on skills on streaming messaging platforms such as Kafka
Familiarity with R, C and Python is an asset
Analytical mind and business acumen with strong math skills (e.g. statistics, algebra)
Problem-solving aptitude
Excellent communication and presentation skills
Read more
Innovative Brand Design Studio
Agency job
via Unnati by Astha Bharadwaj
Mumbai
2 - 5 yrs
₹8L - ₹15L / yr
Data Science
Data Scientist
Python
Tableau
R Programming
+7 more
Come work with a growing consumer market research team that is currently serving one of the biggest FMCG companies in the world.
 
Our client works with global brands and creates projects that are user-centric. They build cost-effective and compelling product stories that help their clients gain a competitive edge and growth in their brand image. Their team of experts consists of academicians, designers, startup specialists and experts are working for clients across 12 countries targeting new markets and solutions with an excellent understanding of end-users.
 
They work with global brands from FMCG, Beauty and Hospitality sectors, namely Unilever, Lipton, Lakme, Loreal, AXE etc. who have chosen them for a long-term relationship, depending on their insights, consumer research, storytelling and contetnt experience. The founder is a design and product activation expert with over 10 years of impact and over 300 completed projects in India, UK, South Asia and USA.
 
As a Data Scientist, you will help to deliver quantitative consumer primary market research through Survey.
 
What you will do:
  • Handling Survey Scripting Process through the use of survey software platform such as Toluna, QuestionPro, Decipher.
  • Mining large & complex data sets using SQL, Hadoop, NoSQL or Spark.
  • Delivering complex consumer data analysis through the use of software like R, Python, Excel and etc such as
  • Working on Basic Statistical Analysis such as:T-Test &Correlation
  • Performing more complex data analysis processes through Machine Learning technique such as:
  1. Classification
  2. Regression
  3. Clustering
  4. Text
  5. Analysis
  6. Neural Networking
  • Creating an Interactive Dashboard Creation through the use of software like Tableau or any other software you are able to use.
  • Working on Statistical and mathematical modelling, application of ML and AI algorithms

 

What you need to have:
  • Bachelor or Master's degree in highly quantitative field (CS, machine learning, mathematics, statistics, economics) or equivalent experience.
  • An opportunity for one, who is eager of proving his or her data analytical skills with one of the Biggest FMCG market player.

 

Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune
8 - 15 yrs
₹16L - ₹28L / yr
PySpark
SQL Azure
azure synapse
Windows Azure
Azure Data Engineer
+3 more
Technology Skills:
  • Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
  • Experience in migrating on-premise data warehouses to data platforms on AZURE cloud. 
  • Designing and implementing data engineering, ingestion, and transformation functions
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/Stream sets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
Read more
PAGO Analytics India Pvt Ltd
Vijay Cheripally
Posted by Vijay Cheripally
Remote, Bengaluru (Bangalore), Mumbai, NCR (Delhi | Gurgaon | Noida)
2 - 8 yrs
₹8L - ₹15L / yr
Python
PySpark
Microsoft Windows Azure
SQL Azure
Data Analytics
+6 more
Be an integral part of large scale client business development and delivery engagements
Develop the software and systems needed for end-to-end execution on large projects
Work across all phases of SDLC, and use Software Engineering principles to build scaled solutions
Build the knowledge base required to deliver increasingly complex technology projects


Object-oriented languages (e.g. Python, PySpark, Java, C#, C++ ) and frameworks (e.g. J2EE or .NET)
Database programming using any flavours of SQL
Expertise in relational and dimensional modelling, including big data technologies
Exposure across all the SDLC process, including testing and deployment
Expertise in Microsoft Azure is mandatory including components like Azure Data Factory, Azure Data Lake Storage, Azure SQL, Azure DataBricks, HD Insights, ML Service etc.
Good knowledge of Python and Spark are required
Good understanding of how to enable analytics using cloud technology and ML Ops
Experience in Azure Infrastructure and Azure Dev Ops will be a strong plus
Read more
Crisp Analytics

at Crisp Analytics

8 recruiters
Seema Pahwa
Posted by Seema Pahwa
Mumbai
2 - 6 yrs
₹6L - ₹15L / yr
Big Data
Spark
Scala
Amazon Web Services (AWS)
Apache Kafka

 

The Data Engineering team is one of the core technology teams of Lumiq.ai and is responsible for creating all the Data related products and platforms which scale for any amount of data, users, and processing. The team also interacts with our customers to work out solutions, create technical architectures and deliver the products and solutions.

If you are someone who is always pondering how to make things better, how technologies can interact, how various tools, technologies, and concepts can help a customer or how a customer can use our products, then Lumiq is the place of opportunities.

 

Who are you?

  • Enthusiast is your middle name. You know what’s new in Big Data technologies and how things are moving
  • Apache is your toolbox and you have been a contributor to open source projects or have discussed the problems with the community on several occasions
  • You use cloud for more than just provisioning a Virtual Machine
  • Vim is friendly to you and you know how to exit Nano
  • You check logs before screaming about an error
  • You are a solid engineer who writes modular code and commits in GIT
  • You are a doer who doesn’t say “no” without first understanding
  • You understand the value of documentation of your work
  • You are familiar with Machine Learning Ecosystem and how you can help your fellow Data Scientists to explore data and create production-ready ML pipelines

 

Eligibility

Experience

  • At least 2 years of Data Engineering Experience
  • Have interacted with Customers


Must Have Skills

  • Amazon Web Services (AWS) - EMR, Glue, S3, RDS, EC2, Lambda, SQS, SES
  • Apache Spark
  • Python
  • Scala
  • PostgreSQL
  • Git
  • Linux


Good to have Skills

  • Apache NiFi
  • Apache Kafka
  • Apache Hive
  • Docker
  • Amazon Certification

 

 

Read more
Codalyze Technologies

at Codalyze Technologies

4 recruiters
Aishwarya Hire
Posted by Aishwarya Hire
Mumbai
3 - 9 yrs
₹5L - ₹12L / yr
Apache Hive
Hadoop
Scala
Spark
Amazon Web Services (AWS)
+2 more
Job Overview :

Your mission is to help lead team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission critical problems, internally and externally. Your quest to embracing leading-edge technologies and methodologies inspires your team to follow suit.

Responsibilities and Duties :

- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data like structured, semi-structured &
unstructured. Having big data knowledge specially in Spark & Hive is highly preferred.

- Work in team and provide proactive technical oversight, advice development teams fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions

Education level :

- Bachelor's degree in Computer Science or equivalent

Experience :

- Minimum 3+ years relevant experience working on production grade projects experience in hands on, end to end software development

- Expertise in application, data and infrastructure architecture disciplines

- Expert designing data integrations using ETL and other data integration patterns

- Advanced knowledge of architecture, design and business processes

Proficiency in :

- Modern programming languages like Java, Python, Scala

- Big Data technologies Hadoop, Spark, HIVE, Kafka

- Writing decently optimized SQL queries

- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)

- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions

- Knowledge of system development lifecycle methodologies, such as waterfall and AGILE.

- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional
modeling, and Meta data modeling practices.

- Experience generating physical data models and the associated DDL from logical data models.

- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping,
and data rationalization artifacts.

- Experience enforcing data modeling standards and procedures.

- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.

- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals

Skills :

Must Know :

- Core big-data concepts

- Spark - PySpark/Scala

- Data integration tool like Pentaho, Nifi, SSIS, etc (at least 1)

- Handling of various file formats

- Cloud platform - AWS/Azure/GCP

- Orchestration tool - Airflow
Read more
Codalyze Technologies

at Codalyze Technologies

4 recruiters
Aishwarya Hire
Posted by Aishwarya Hire
Mumbai
3 - 7 yrs
₹7L - ₹20L / yr
Hadoop
Big Data
Scala
Spark
Amazon Web Services (AWS)
+3 more
Job Overview :

Your mission is to help lead team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission critical problems, internally and externally. Your quest to embracing leading-edge technologies and methodologies inspires your team to follow suit.

Responsibilities and Duties :

- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data like structured, semi-structured &
unstructured. Having big data knowledge specially in Spark & Hive is highly preferred.

- Work in team and provide proactive technical oversight, advice development teams fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions

Education level :

- Bachelor's degree in Computer Science or equivalent

Experience :

- Minimum 5+ years relevant experience working on production grade projects experience in hands on, end to end software development

- Expertise in application, data and infrastructure architecture disciplines

- Expert designing data integrations using ETL and other data integration patterns

- Advanced knowledge of architecture, design and business processes

Proficiency in :

- Modern programming languages like Java, Python, Scala

- Big Data technologies Hadoop, Spark, HIVE, Kafka

- Writing decently optimized SQL queries

- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)

- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions

- Knowledge of system development lifecycle methodologies, such as waterfall and AGILE.

- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional
modeling, and Meta data modeling practices.

- Experience generating physical data models and the associated DDL from logical data models.

- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping,
and data rationalization artifacts.

- Experience enforcing data modeling standards and procedures.

- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.

- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals

Skills :

Must Know :

- Core big-data concepts

- Spark - PySpark/Scala

- Data integration tool like Pentaho, Nifi, SSIS, etc (at least 1)

- Handling of various file formats

- Cloud platform - AWS/Azure/GCP

- Orchestration tool - Airflow
Read more
Globant

at Globant

2 recruiters
Risha P
Posted by Risha P
Mumbai
5 - 10 yrs
₹12L - ₹20L / yr
Object Oriented Programming (OOPs)
Shell Scripting
Java
SOAP
JSON
+8 more
We are looking for a Java developer for one of our major investment banking client- who can take ownership for the whole end to end delivery, performing analysis, design, coding, testing and maintenance of large- scale and distributed applications. Please find JD for your reference . Job Profile : Java Developer : Location : Mumbai Description: A core Java developer is required for a Tier 1 investment bank supporting the Delta One Structured Products IT group. This is a global front-office team that supports the global OTC Equity Swap Portfolio, Single Name, and Index derivative businesses. We are designing a complete restructure of the Equity Swaps trading platform, and this particular role is within the core cash flow and valuations area. The role will require the candidate to work closely with the cash flow engines team to solve problems that combine both finance and technology. This is an exciting hands-on role for a self-starter who has a thirst for new challenges as well as new technologies. The candidate should possess good analytical skills, strong software engineering skills, a logical approach to problem-solving, be able to work in a fast paced environment liaising with demanding stakeholders to understand complex requirements and be able to prioritize work under pressure with minimal supervision. The candidate should be a problem solver, and be able to bring with them some positivity and enthusiasm in trying to think about and offer potential solutions for architectural considerations. Position Profile: We are looking for someone to help own problems and be able to demonstrate leadership and responsibility for the delivery of new features. As part of the development cycle, you would be expected to write quality unit tests, supply documentation if relevant for new feature build-outs, and be involved in the test cycle (UAT, integration, regression) for the delivery and fixing of bugs for your new features. Although the role is predominantly Java, we require someone who is flexible with the development environment, as some days you might be writing Java, and other days you might be fixing stored procedures or Perl scripts. You would be expected to get involved in the Level 3 production support rota which is shared between our developers on a monthly cycle, and to occasionally help with weekend deployment activities to deploy and verify any code changes you have been involved in. Team Profile: The team and role are ideal for someone looking for a strong career development path with many opportunities to grow, learn and develop. The role requires someone who is flexible and able to respond to a dynamic business environment. The candidate must be adaptable to work across multiple technologies and disciplines, with a focus on delivering quality solutions for the business in a timely fashion. This role suits people experienced in complex data domains. Required Skills: * Experience of agile and scrum methodologies. * Core Java. * Unix shell scripting. * SQL and Relational Databases such as DB2. * Integration technologies - MQ/Xml/SOAP/JSON/Protocol Buffers/Spring. * Enterprise Architecture Patterns, GoF design * Build & agile - Ant, Gradle/Maven, Sonar, Jenkins/Hudson, GIT/perforce. * Sound understanding of Object Oriented Analysis, Design and Programming. * Strong communication and stakeholder management skills * Scala / spark or bigdata will be an added advantage * Candidate must have good experience in database. * Excellent communication and problem solving skill. Desired Skills: * Experience in banking and regulatory reporting (SFTR, MAS/ASIC etc.) * Knowledge of OTC, listed and cash products * Domain driven design and micro-services
Read more
Pion Global Solutions LTD
Sheela P
Posted by Sheela P
Mumbai
3 - 100 yrs
₹4L - ₹15L / yr
Spark
Big Data
Hadoop
HDFS
Apache Sqoop
+2 more
Looking for Big data Developers in Mumbai Location
Read more
Arque Capital

at Arque Capital

2 recruiters
Hrishabh Sanghvi
Posted by Hrishabh Sanghvi
Mumbai
5 - 11 yrs
₹15L - ₹30L / yr
C++
Big Data
Technical Architecture
Cloud Computing
Python
+4 more
ABOUT US: Arque Capital is a FinTech startup working with AI in Finance in domains like Asset Management (Hedge Funds, ETFs and Structured Products), Robo Advisory, Bespoke Research, Alternate Brokerage, and other applications of Technology & Quantitative methods in Big Finance. PROFILE DESCRIPTION: 1. Get the "Tech" in order for the Hedge Fund - Help answer fundamentals of technology blocks to be used, choice of certain platform/tech over other, helping team visualize product with the available resources and assets 2. Build, manage, and validate a Tech Roadmap for our Products 3. Architecture Practices - At startups, the dynamics changes very fast. Making sure that best practices are defined and followed by team is very important. CTO’s may have to garbage guy and clean the code time to time. Making reviews on Code Quality is an important activity that CTO should follow. 4. Build progressive learning culture and establish predictable model of envisioning, designing and developing products 5. Product Innovation through Research and continuous improvement 6. Build out the Technological Infrastructure for the Hedge Fund 7. Hiring and building out the Technology team 8. Setting up and managing the entire IT infrastructure - Hardware as well as Cloud 9. Ensure company-wide security and IP protection REQUIREMENTS: Computer Science Engineer from Tier-I colleges only (IIT, IIIT, NIT, BITS, DHU, Anna University, MU) 5-10 years of relevant Technology experience (no infra or database persons) Expertise in Python and C++ (3+ years minimum) 2+ years experience of building and managing Big Data projects Experience with technical design & architecture (1+ years minimum) Experience with High performance computing - OPTIONAL Experience as a Tech Lead, IT Manager, Director, VP, or CTO 1+ year Experience managing Cloud computing infrastructure (Amazon AWS preferred) - OPTIONAL Ability to work in an unstructured environment Looking to work in a small, start-up type environment based out of Mumbai COMPENSATION: Co-Founder status and Equity partnership
Read more
Accion Labs

at Accion Labs

14 recruiters
Neha Mayekar
Posted by Neha Mayekar
Mumbai
5 - 14 yrs
₹8L - ₹18L / yr
HDFS
Hbase
Spark
Flume
hive
+2 more
US based Multinational Company Hands on Hadoop
Read more
Ixsight Technologies Pvt Ltd
Uma Venkataraman
Posted by Uma Venkataraman
Pune, Mumbai
3 - 9 yrs
₹5L - ₹14L / yr
C++
Architecture
C#
Spark
C
Ixsight Technologies is an innovative IT company with strong Intellectual Property. Ixsight is focused on creating Customer Data Value through its solutions for Identity Management, Locational Analytics, Address Science and Customer Engagement. Ixsight is also adapting its solutions to Big Data and Cloud. We are in the process of creating new solutions across platforms. Ixsight has served over 80+ clients in India – for various end user applications across traditional BFSI and telecom sector. In the recent past we are catering to the new generation verticals – Hospitality, ecommerce etc. Ixsight has been featured in the Gartner’s India Technology Hype Cycle and has been recognised by both clients and peers for pioneering and excellent solutions. If you wish to play a direct part in creating new products, building IP and being part of Product Creation - Ixsight is the place.
Read more
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.
Find more jobs
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort