Hadoop Jobs in Chennai

Apply to 18+ Hadoop Jobs in Chennai on CutShort.io. Explore the latest Hadoop Job opportunities across top companies like Google, Amazon & Adobe.

Data Engineer

at Pluginlive

1 recruiter

Posted by Harsha Saggi

Chennai, Mumbai

4 - 6 yrs

₹10L - ₹20L / yr

Python

SQL

NOSQL Databases

Data architecture

Data modeling

+7 more

Role Overview:

We are seeking a talented and experienced Data Architect with strong data visualization capabilities to join our dynamic team in Mumbai. As a Data Architect, you will be responsible for designing, building, and managing our data infrastructure, ensuring its reliability, scalability, and performance. You will also play a crucial role in transforming complex data into insightful visualizations that drive business decisions. This role requires a deep understanding of data modeling, database technologies (particularly Oracle Cloud), data warehousing principles, and proficiency in data manipulation and visualization tools, including Python and SQL.

Responsibilities:

Design and implement robust and scalable data architectures, including data warehouses, data lakes, and operational data stores, primarily leveraging Oracle Cloud services.
Develop and maintain data models (conceptual, logical, and physical) that align with business requirements and ensure data integrity and consistency.
Define data governance policies and procedures to ensure data quality, security, and compliance.
Collaborate with data engineers to build and optimize ETL/ELT pipelines for efficient data ingestion, transformation, and loading.
Develop and execute data migration strategies to Oracle Cloud.
Utilize strong SQL skills to query, manipulate, and analyze large datasets from various sources.
Leverage Python and relevant libraries (e.g., Pandas, NumPy) for data cleaning, transformation, and analysis.
Design and develop interactive and insightful data visualizations using tools such as Tableau, Power BI, Matplotlib, Seaborn, or Plotly to communicate data-driven insights to both technical and non-technical stakeholders.
Work closely with business analysts and stakeholders to understand their data needs and translate them into effective data models and visualizations.
Ensure the performance and reliability of data visualization dashboards and reports.
Stay up-to-date with the latest trends and technologies in data architecture, cloud computing (especially Oracle Cloud), and data visualization.
Troubleshoot data-related issues and provide timely resolutions.
Document data architectures, data flows, and data visualization solutions.
Participate in the evaluation and selection of new data technologies and tools.

Qualifications:

Bachelor's or Master's degree in Computer Science, Data Science, Information Systems, or a related field.
Proven experience (typically 5+ years) as a Data Architect, Data Modeler, or similar role.
Deep understanding of data warehousing concepts, dimensional modeling (e.g., star schema, snowflake schema), and ETL/ELT processes.
Extensive experience working with relational databases, particularly Oracle, and proficiency in SQL.
Hands-on experience with Oracle Cloud data services (e.g., Autonomous Data Warehouse, Object Storage, Data Integration).
Strong programming skills in Python and experience with data manipulation and analysis libraries (e.g., Pandas, NumPy).
Demonstrated ability to create compelling and effective data visualizations using industry-standard tools (e.g., Tableau, Power BI, Matplotlib, Seaborn, Plotly).
Excellent analytical and problem-solving skills with the ability to interpret complex data and translate it into actionable insights.
Strong communication and presentation skills, with the ability to effectively communicate technical concepts to non-technical audiences.
Experience with data governance and data quality principles.
Familiarity with agile development methodologies.
Ability to work independently and collaboratively within a team environment.

Application Link- https://forms.gle/km7n2WipJhC2Lj2r5

Data Engineer

at ZeMoSo Technologies

11 recruiters

Agency job

via TIGI HR Solution Pvt. Ltd. by Vaidehi Sarkar

Mumbai, Bengaluru (Bangalore), Hyderabad, Chennai, Pune

4 - 8 yrs

₹10L - ₹15L / yr

Data engineering

Python

SQL

Data Warehouse (DWH)

Amazon Web Services (AWS)

+3 more

Work Mode: Hybrid

Need B.Tech, BE, M.Tech, ME candidates - Mandatory

Must-Have Skills:

● Educational Qualification: B.Tech, BE, M.Tech, ME in any field.

● Minimum of 3 years of proven experience as a Data Engineer.

● Strong proficiency in the Python programming language and SQL.

● Experience in Databricks and in setting up and managing data pipelines and data warehouses/lakes.

● Good comprehension and critical thinking skills.

● Please note that the salary bracket will vary according to the candidate's experience:

- Experience from 4 yrs to 6 yrs - Salary up to 22 LPA

- Experience from 5 yrs to 8 yrs - Salary up to 30 LPA

- Experience more than 8 yrs - Salary up to 40 LPA

Kafka Developer

at iLink Systems

1 video

1 recruiter

Posted by Ganesh Sooriyamoorthu

Chennai, Pune, Noida, Bengaluru (Bangalore)

5 - 15 yrs

₹10L - ₹15L / yr

Apache Kafka

Big Data

Java

Spark

Hadoop

+1 more

KSQL
Data Engineering spectrum (Java/Spark)
Spark Scala / Kafka Streaming
Confluent Kafka components
Basic understanding of Hadoop

Software developer

Tier 1 MNC

Agency job

via People First Consultants by Jayaraj E

Chennai, Pune, Bengaluru (Bangalore), Noida, Gurugram, Kochi (Cochin), Coimbatore, Hyderabad, Mumbai, Navi Mumbai

3 - 12 yrs

₹3L - ₹15L / yr

Spark

Hadoop

Big Data

Data engineering

PySpark

+1 more

Greetings,
We are hiring a software developer with good knowledge of Spark, Hadoop, and Scala for a Tier 1 MNC.

Big Data Developer / Lead / Architect

Telecom Client

Agency job

via Eurka IT SOL by Srikanth a

Chennai

5 - 13 yrs

₹9L - ₹28L / yr

PySpark

Data engineering

Big Data

Hadoop

Spark

+6 more

Demonstrable experience owning and developing big data solutions, using Hadoop, Hive/Hbase, Spark, Databricks, ETL/ELT for 5+ years

· 10+ years of Information Technology experience, preferably with Telecom / wireless service providers.

· Experience in designing data solutions following Agile practices (SAFe methodology); designing for testability, deployability, and releasability; rapid prototyping, data modeling, and decentralized innovation

DataOps mindset: allowing the architecture of a system to evolve continuously over time, while simultaneously supporting the needs of current users
Create and maintain Architectural Runway, and Non-Functional Requirements.
Design for a continuous delivery pipeline (CI/CD data pipeline) and enable built-in quality and security from the start.

· Demonstrated understanding, and ideally use, of at least one recognised architecture framework or standard, e.g. TOGAF or the Zachman Architecture Framework

· The ability to apply data, research, and professional judgment and experience to ensure our products are making the biggest difference to consumers

· Demonstrated ability to work collaboratively

· Excellent written, verbal and social skills - You will be interacting with all types of people (user experience designers, developers, managers, marketers, etc.)

· Ability to work independently and with minimal supervision in a fast-paced, multi-project environment

· Technologies: .NET, AWS, Azure; Azure Synapse, NiFi, RDS, Apache Kafka, Azure Databricks, Azure Data Lake Storage, Power BI, Reporting Analytics, QlikView, SQL on-prem data warehouse; BSS, OSS & Enterprise Support Systems

Data Science

Leading Manufacturing Company

Agency job

via People First Consultants by Jayaraj E

Chennai

3 - 6 yrs

₹3L - ₹8L / yr

Machine Learning (ML)

Data Science

Natural Language Processing (NLP)

Data modeling

Data Analytics

+2 more

Location: Chennai
Education: BE/BTech
Experience: Minimum 3+ years of experience as a Data Scientist/Data Engineer

Domain knowledge: Data cleaning, modelling, analytics, statistics, machine learning, AI

Requirements:

To be part of Digital Manufacturing and Industrie 4.0 projects across client group of companies
Design and develop AI/ML models to be deployed across factories
Knowledge of Hadoop, Apache Spark, MapReduce, Scala, Python programming, SQL and NoSQL databases is required
Should be strong in statistics, data analysis, data modelling, machine learning techniques and Neural Networks
Prior experience in developing AI and ML models is required
Experience with data from the Manufacturing Industry would be a plus

Roles and Responsibilities:

Develop AI and ML models for the Manufacturing Industry with a focus on Energy, Asset Performance Optimization and Logistics
Multitasking, good communication necessary
Entrepreneurial attitude

Additional Information:

Travel: Must be willing to travel for short durations within India and abroad

Job Location: Chennai
Reporting to: Team Leader, Energy Management System

Python + Data scientist

A leading global information technology and business process

Agency job

via Jobdost by Mamatha A

Chennai

5 - 14 yrs

₹13L - ₹21L / yr

Python

Java

PySpark

Javascript

Hadoop

Python + Data scientist :
• Hands-on and sound knowledge of Python, PySpark, and JavaScript

• Build data-driven models to understand the characteristics of engineering systems

• Train, tune, validate, and monitor predictive models

• Sound knowledge of statistics

• Experience in developing data processing tasks using PySpark, such as reading, merging, enrichment, and loading of data from external systems to target data destinations

• Working knowledge of Big Data and/or Hadoop environments

• Experience creating CI/CD pipelines using Jenkins or similar tools

• Practiced in eXtreme Programming (XP) disciplines

QA Manager

at Amagi Media Labs

3 recruiters

Posted by Rajesh C

Chennai

12 - 15 yrs

₹20L - ₹30L / yr

Software Testing (QA)

Test Automation (QA)

Appium

Selenium

JMeter

+3 more

Job Title: QA Manager
Job Location: India
Job Summary
Condé Nast is looking for a talented Software Quality Assurance Manager. In this role you will be part of an IT organization and provide direction and leadership on all quality matters across the platform or domain. This is an opportunity to act as a key contributor in making sure that work packages transition seamlessly from development into live environments. Hone your knowledge of quality best practices and your collaboration and stakeholder management skills.
Objectives of the Role
● Lead a global team of QE engineers who are responsible for ensuring the end-to-end (E2E) quality of our products
● Provide high-level direction for test teams, delivering a clear and consistent vision
● Develop a roadmap for building the needed test expertise. Establish and evolve formal QA processes
● Define quality and automation KPI goals for the team and drive success towards achieving them
● Recommend, implement, and monitor preventative and corrective actions to ensure that quality assurance standards are achieved
● Participate in brainstorming sessions and cross-departmental meetings to ensure collaboration and cohesion
● Oversee all aspects of quality assurance, including establishing metrics, implementing best practices, and developing new tools and processes to ensure quality goals are met
● Act as a key point of contact for all QA aspects of releases, providing QA services and coordinating QA resources internally and externally
● Review test strategies and test plans and ensure that functional, performance, and scalability aspects are covered and tested
● Work cross-functionally with Product and Engineering to ensure smooth delivery of products that are on time, feature-rich, and of high quality
● Present the QA status of testing with state-of-the-art dashboards and highlight key blockers and where help is needed
● Ensure automated scripts are produced for E2E testing scenarios, including performance and scale testing
● Review user-found defects and continue to enhance and improve testing methodology and fill gaps in testing as needed
● Communicate product status, key issues, and insights to key constituents across the organization, including the executive team
● Develop a process for manual and automated testing strategy for business applications, web applications, and BI solutions
● Experience in establishing a testing strategy for quarterly releases of business applications
Required Skills
● 12+ years' experience in a product quality testing role and 2-3 years in a management position
● Seasoned QA Manager with a track record of delivering high-quality products in a fast-paced environment
● Must have prior experience in building a QA / data validation framework in big data for various BUs
● Experience in "closing the loop" and "continuous improvement" of QA strategies based on learnings from field issues
● Experience in business applications, web applications, and Business Intelligence applications
● Hands-on QA experience, including test automation. Deep knowledge of automation best practices and industry trends
● Good to have: knowledge of one or more test automation tools such as xUnit, Selenium, JMeter, or Ranorex
● Experience in automation testing using Selenium WebDriver and testing frameworks such as TestNG and NUnit
● Strong proficiency in SQL coding (T-SQL or PL/SQL)
● Familiarity with data warehousing practices, ETL processing, and denormalized data structures
● Background in programming and testing across platforms is a plus
● Proficiency with big data concepts and/or tools is a plus (Hadoop, Hive, Spark)
● Strong analytical and reasoning skills
● Strong verbal and written communication skills and strong interpersonal skills
About Condé Nast
CONDÉ NAST INDIA (DATA)
Over the years, Condé Nast successfully expanded and diversified into digital, TV, and social platforms - in other words, a staggering amount of user data. Condé Nast made the right move to invest heavily in understanding this data and formed a whole new Data team entirely dedicated to data processing, engineering, analytics, and visualization. This team helps drive engagement, fuel process innovation, further content enrichment, and increase market revenue. The Data team aimed to create a company culture where data was the common language and facilitate an environment where insights shared in real-time could improve performance.
The Global Data team operates out of Los Angeles, New York, Chennai, and London. The team at Condé Nast Chennai works extensively with data to amplify its brands' digital capabilities and boost online revenue. We are broadly divided into four groups, Data Intelligence, Data Engineering, Data Science, and Operations (including Product and Marketing Ops, Client Services) along with Data Strategy and monetization. The teams built capabilities and products to create data-driven solutions for better audience engagement.
What we look forward to:
We want to welcome bright, new minds into our midst and work together to create diverse forms of self-expression. At Condé Nast, we encourage the imaginative and celebrate the extraordinary. We are a media company for the future, with a remarkable past. We are Condé Nast, and It Starts Here.

Senior consultant

An IT Services Major, hiring for a leading insurance player.

Agency job

via Indventur Partner by Vanshika kaur

Chennai

3 - 5 yrs

₹5L - ₹10L / yr

Big Data

Hadoop

Apache Kafka

Apache Hive

Microsoft Windows Azure

+1 more

Client: An IT Services Major, hiring for a leading insurance player.

Position: SENIOR CONSULTANT

Job Description:

Azure admin: senior consultant with HDInsight (big data)

Skills and Experience

Microsoft Azure Administrator certification
Big data project experience on the Azure HDInsight stack, with big data processing frameworks such as Spark, Hadoop, Hive, Kafka, or HBase.
Preferred: Insurance or BFSI domain experience
5 to 5 years of experience is required.

Big Data Engineer

at netmedscom

3 recruiters

Posted by Vijay Hemnath

Chennai

2 - 5 yrs

₹6L - ₹25L / yr

Big Data

Hadoop

Apache Hive

Scala

Spark

+12 more

We are looking for an outstanding Big Data Engineer with experience setting up and maintaining data warehouses and data lakes for an organization. This role will collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.

Roles and Responsibilities:

Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
Develop programs in Scala and Python as part of data cleaning and processing.
Assemble large, complex data sets that meet functional and non-functional business requirements, and foster data-driven decision making across the organization.
Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
Implement processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available for the key stakeholders and business processes that depend on it.
Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Maintain operational excellence, ensuring high availability and platform stability.
Collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.

Skills:

Experience with Big Data pipeline, Big Data analytics, Data warehousing.
Experience with SQL/No-SQL, schema design and dimensional data modeling.
Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with a Big Data technology stack such as HBase, Hadoop, Hive, and MapReduce.
Experience in designing systems that process structured as well as unstructured data at large scale.
Experience in AWS/Spark/Java/Scala/Python development.
Strong skills in PySpark (Python and Spark): ability to create, manage, and manipulate Spark DataFrames, and expertise in Spark query tuning and performance optimization.
Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
Prior exposure to streaming data sources such as Kafka.
Knowledge of shell scripting and Python scripting.
High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
Experience with NoSQL databases such as Cassandra / MongoDB.
Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
Experience building and deploying applications on on-premise and cloud-based infrastructure.
A good understanding of the machine learning landscape and concepts.

Qualifications and Experience:

Engineering and post graduate candidates, preferably in Computer Science, from premier institutions with proven work experience as a Big Data Engineer or a similar role for 3-5 years.

Certifications:

Good to have at least one of the Certifications listed here:

AZ 900 - Azure Fundamentals

DP 200, DP 201, DP 203, AZ 204 - Data Engineering

AZ 400 - DevOps Certification

Data Engineer

at Bungee Tech India

Posted by Abigail David

Remote, NCR (Delhi | Gurgaon | Noida), Chennai

5 - 10 yrs

₹10L - ₹30L / yr

Big Data

Hadoop

Apache Hive

Spark

ETL

+3 more

Company Description

At Bungee Tech, we help retailers and brands meet customers everywhere and on every occasion they are in. We believe that accurate, high-quality data matched with compelling market insights empowers retailers and brands to keep their customers at the center of all the innovation and value they are delivering.

We provide a clear and complete omnichannel picture of their competitive landscape to retailers and brands. We collect billions of data points every day and multiple times in a day from publicly available sources. Using high-quality extraction, we uncover detailed information on products or services, which we automatically match, and then proactively track for price, promotion, and availability. Plus, anything we do not match helps to identify a new assortment opportunity.

Empowered with this unrivalled intelligence, we unlock compelling analytics and insights that, once blended with verified partner data from trusted sources such as Nielsen, paint a complete, consolidated picture of the competitive landscape.

We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.

You will also be responsible for integrating them with the architecture used in the company.

We're working on the future. If you are seeking an environment where you can drive innovation, if you want to apply state-of-the-art software technologies to solve real-world problems, and if you want the satisfaction of providing visible benefit to end users in an iterative, fast-paced environment, this is your opportunity.

Responsibilities

As an experienced member of the team, in this role, you will:

Contribute to evolving the technical direction of analytical systems and play a critical role in their design and development.

You will research, design and code, troubleshoot and support. What you create is also what you own.

Develop the next generation of automation tools for monitoring and measuring data quality, with associated user interfaces.

Be able to broaden your technical skills and work in an environment that thrives on creativity, efficient execution, and product innovation.

BASIC QUALIFICATIONS

Bachelor’s degree or higher in an analytical area such as Computer Science, Physics, Mathematics, Statistics, Engineering or similar.
5+ years relevant professional experience in Data Engineering and Business Intelligence
5+ years of experience with advanced SQL (analytical functions), ETL, and data warehousing.
Strong knowledge of data warehousing concepts, including data warehouse technical architectures, infrastructure components, ETL/ ELT and reporting/analytic tools and environments, data structures, data modeling and performance tuning.
Ability to effectively communicate with both business and technical teams.
Excellent coding skills in Java, Python, C++, or equivalent object-oriented programming language
Understanding of relational and non-relational databases and basic SQL
Proficiency with at least one of these scripting languages: Perl / Python / Ruby / shell script

PREFERRED QUALIFICATIONS

Experience with building data pipelines from application databases.
Experience with AWS services - S3, Redshift, Spectrum, EMR, Glue, Athena, ELK etc.
Experience working with Data Lakes.
Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
Sharp problem-solving skills and ability to resolve ambiguous requirements
Experience working with Big Data
Knowledge of and experience working with Hive and the Hadoop ecosystem
Knowledge of Spark
Experience working with Data Science teams

Big Data Engineer

at YourHRfolks

6 recruiters

Posted by Bharat Saxena

Remote, Jaipur, NCR (Delhi | Gurgaon | Noida), Chennai, Bangarmau

5 - 10 yrs

₹15L - ₹30L / yr

Big Data

Hadoop

Spark

Apache Kafka

Amazon Web Services (AWS)

+2 more

Position: Big Data Engineer

What You'll Do

Punchh is seeking to hire a Big Data Engineer at either a senior or tech lead level. Reporting to the Director of Big Data, he/she will play a critical role in leading Punchh’s big data innovations. By leveraging prior industry experience in big data, he/she will help create cutting-edge data and analytics products for Punchh’s business partners.

This role requires close collaboration with data, engineering, and product organizations. His/her job functions include:

Work with large data sets and implement sophisticated data pipelines with both structured and unstructured data.
Collaborate with stakeholders to design scalable solutions.
Manage and optimize our internal data pipeline that supports marketing, customer success and data science, to name a few.
Serve as a technical leader of Punchh’s big data platform that supports AI and BI products.
Work with infra and operations teams to monitor and optimize existing infrastructure.
Occasional business travel is required.

What You'll Need

5+ years of experience as a Big Data engineering professional, developing scalable big data solutions.
Advanced degree in computer science, engineering or other related fields.
Demonstrated strength in data modeling, data warehousing and SQL.
Extensive knowledge of cloud technologies, e.g. AWS and Azure.
Excellent software engineering background. High familiarity with the software development life cycle. Familiarity with GitHub/Airflow.
Advanced knowledge of big data technologies, such as programming languages (Python, Java), relational databases (Postgres, MySQL), NoSQL (MongoDB), Hadoop (EMR) and streaming (Kafka, Spark).
Strong problem solving skills with demonstrated rigor in building and maintaining a complex data pipeline.
Exceptional communication skills and ability to articulate a complex concept with thoughtful, actionable recommendations.

Hadoop Administrator

at Indium Software

16 recruiters

Posted by Ivarajneasan S K

Chennai

9 - 14 yrs

₹12L - ₹18L / yr

Apache Hadoop

Hadoop

Cloudera

HDFS

MapReduce

+2 more

Deploying and maintaining a Hadoop cluster, adding and removing nodes using cluster monitoring tools like Ganglia, Nagios or Cloudera Manager, configuring NameNode high availability, and keeping track of all running Hadoop jobs.

Good understanding of, or hands-on experience in, Kafka administration / Apache Kafka streaming.

Implementing, managing, and administering the overall Hadoop infrastructure.

Takes care of the day-to-day running of Hadoop clusters

A Hadoop administrator will have to work closely with the database team, network team, BI team, and application teams to make sure that all the big data applications are highly available and performing as expected.

When working with the open-source Apache distribution, Hadoop admins have to set up all the configuration files manually: core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml. When working with a popular Hadoop distribution such as Hortonworks, Cloudera or MapR, the configuration files are set up on startup and the Hadoop admin need not configure them manually.

The Hadoop admin is responsible for capacity planning and for estimating the requirements for decreasing or increasing the capacity of the Hadoop cluster.

The Hadoop admin is also responsible for deciding the size of the Hadoop cluster based on the data to be stored in HDFS.

Ensure that the Hadoop cluster is up and running all the time.

Monitoring the cluster connectivity and performance.

Manage and review Hadoop log files.

Backup and recovery tasks

Resource and security management

Troubleshooting application errors and ensuring that they do not occur again.

Big Data Developer

at Maveric Systems

3 recruiters

Posted by Rashmi Poovaiah

Bengaluru (Bangalore), Chennai, Pune

4 - 10 yrs

₹8L - ₹15L / yr

Big Data

Hadoop

Spark

Apache Kafka

HiveQL

+2 more

Role Summary/Purpose:

We are looking for Developers/Senior Developers to be part of building an advanced analytical platform that leverages Big Data technologies and transforms legacy systems. The role offers an exciting, fast-paced, constantly changing and challenging work environment, and will play an important part in resolving and influencing high-level decisions.

Requirements:

The candidate must be a self-starter who can work under general guidelines in a fast-paced environment.
Overall minimum of 4 to 8 years of software development experience and 2 years of Data Warehousing domain knowledge
Must have 3 years of hands-on working knowledge of Big Data technologies such as Hadoop, Hive, HBase, Spark, Kafka, Spark Streaming, Scala, etc.
Excellent knowledge of SQL & Linux shell scripting
Bachelor’s/Master’s/Engineering degree from a well-reputed university.
Strong communication, interpersonal, learning and organizing skills, matched with the ability to manage stress, time, and people effectively
Proven experience in co-ordination of many dependencies and multiple demanding stakeholders in a complex, large-scale deployment environment
Ability to manage a diverse and challenging stakeholder community
Diverse knowledge and experience of working on Agile Deliveries and Scrum teams.

Responsibilities

Should work as a senior developer/individual contributor depending on the situation
Should take part in Scrum discussions and gather requirements
Adhere to SCRUM timeline and deliver accordingly
Participate in a team environment for the design, development and implementation
Should take L3 activities on need basis
Prepare Unit/SIT/UAT test cases and log the results
Co-ordinate SIT and UAT testing. Take feedback and provide necessary remediation/recommendations in time.
Quality delivery and automation should be a top priority
Co-ordinate change and deployment in time
Should create healthy harmony within the team
Own interaction points with members of the core team (e.g. BA team, testing and business teams) and any other relevant stakeholders

Data Engineer

at Mobile Programming LLC

1 video

34 recruiters

Posted by vandana chauhan

Remote, Chennai

3 - 7 yrs

₹12L - ₹18L / yr

Big Data

Amazon Web Services (AWS)

Hadoop

SQL

Python

+5 more

Position: Data Engineer
Location: Chennai- Guindy Industrial Estate
Duration: Full time role
Company: Mobile Programming (https://www.mobileprogramming.com/)
Client Name: Samsung

We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be
responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing
data flow and collection for cross functional teams. The ideal candidate is an experienced data pipeline
builder and data wrangler who enjoys optimizing data systems and building them from the ground up.
The Data Engineer will support our software developers, database architects, data analysts and data
scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout
ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple
teams, systems and products.

Responsibilities for Data Engineer
• Create and maintain optimal data pipeline architecture.
• Assemble large, complex data sets that meet functional / non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
• Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
• Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
• Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
• Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer
• Experience building and optimizing big data ETL pipelines, architectures and data sets.
• Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
• Strong analytic skills related to working with unstructured datasets.
• Build processes supporting data transformation, data structures, metadata, dependency and workload management.
• A successful history of manipulating, processing and extracting value from large disconnected datasets.
• Working knowledge of message queuing, stream processing and highly scalable ‘big data’ data stores.
• Strong project management and organizational skills.
• Experience supporting and working with cross-functional teams in a dynamic environment.

We are looking for a candidate with 3-6 years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
• Experience with big data tools: Spark, Kafka, HBase, Hive etc.
• Experience with relational SQL and NoSQL databases
• Experience with AWS cloud services: EC2, EMR, RDS, Redshift
• Experience with stream-processing systems: Storm, Spark-Streaming, etc.
• Experience with object-oriented/object function scripting languages: Python, Java, Scala, etc.

Skills: Big Data, AWS, Hive, Spark, Python, SQL

Data Scientist

at TVS Credit Services

2 recruiters

Posted by Vinodhkumar Panneerselvam

Chennai

4 - 10 yrs

₹10L - ₹20L / yr

Data Science

R Programming

Python

Machine Learning (ML)

Hadoop

+3 more

Job Description:

Be responsible for scaling our analytics capability across all internal disciplines and guide our strategic direction with regard to analytics
Organize and analyze large, diverse data sets across multiple platforms
Identify key insights and leverage them to inform and influence product strategy
Handle technical interactions with vendors or partners on scope, approach and deliverables
Develop proofs of concept to prove or disprove the validity of a concept
Work with all parts of the business to identify analytical requirements and formalize an approach for reliable, relevant, accurate, efficient reporting on those requirements
Design and implement advanced statistical testing for customized problem solving
Deliver concise verbal and written explanations of analyses to senior management that elevate findings into strategic recommendations

Desired Candidate Profile:

MTech / BE / BTech / MSc in CS, Stats, Maths, Operations Research, Econometrics or any quantitative field
Experience in using Python, R, SAS
Experience in working with large data sets and big data systems (SQL, Hadoop, Hive, etc.)
Keen aptitude for large-scale data analysis with a passion for identifying key insights from data
Expert working knowledge of various machine learning algorithms such as XGBoost, SVM, etc.

We are looking for candidates from the following:

Experience in Unsecured Loans & SME Loans analytics (cards, installment loans) - risk-based pricing analytics
Experience in Differential pricing / selection analytics (retail, airlines / travel etc.)
Experience in Digital product companies or Digital eCommerce with Product mindset and experience
Experience in Fraud / Risk from Banks, NBFC / Fintech / Credit Bureau
Experience in Online media with knowledge of media, online ads & sales (agencies) - Knowledge of DMP, DFP, Adobe/Omniture tools, Cloud
Experience in Consumer Durable Loans lending companies (Experience in Credit Cards, Personal Loan - optional)
Experience in Tractor Loans lending companies (Experience in Farm)
Experience in Recovery, Collections analytics
Experience in Marketing Analytics with Digital Marketing, Market Mix modelling, Advertising Technology

Big Data Developer

at GeakMinds Technologies Pvt Ltd

3 recruiters

Posted by John Richardson

Chennai

1 - 5 yrs

₹1L - ₹6L / yr

Hadoop

Big Data

HDFS

Apache Sqoop

Apache Flume

+2 more

• Looking for Big Data Engineer with 3+ years of experience.
• Hands-on experience with MapReduce-based platforms, like Pig, Spark, Shark.
• Hands-on experience with data pipeline tools like Kafka, Storm, Spark Streaming.
• Store and query data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto.
• Hands-on experience in managing Big Data on a cluster with HDFS and MapReduce.
• Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm.
• Experience with Azure cloud, Cognitive Services, Databricks is preferred.

Technical Architect/CTO

at auzmor

5 recruiters

Posted by Loga B

Chennai

3 - 10 yrs

₹10L - ₹30L / yr

Java

React.js

AngularJS (1.x)

Selenium Web driver

Hadoop

+3 more

Description

Auzmor is a US-headquartered, funded SaaS startup focused on disrupting the HR space. We combine passion and domain expertise to build products with a focus on great end-user experiences. We are looking for a Technical Architect to envision, build, launch and scale multiple SaaS products.

What You Will Do:

• Understand the broader strategy, business goals, and engineering priorities of the company and how to incorporate them into your designs of systems, components, or features
• Design applications and architectures for multi-tenant SaaS software
• Be responsible for the selection and use of frameworks, platforms and design patterns for cloud-based multi-tenant SaaS applications
• Collaborate with engineers, QA, product managers, UX designers, partners/vendors, and other architects to build scalable systems, services, and products for our diverse ecosystem of users across apps

What You Will Need:

• 5+ years of hands-on engineering experience in SaaS and cloud services environments, with architecture design and definition experience using Java/JEE, Struts, Spring, JMS & ORM (Hibernate, JPA) or other server-side technologies and frameworks
• Strong understanding of architecture patterns such as multi-tenancy, scalability, federation, and microservices (design, decomposition, and maintenance) to build cloud-ready systems
• Experience with server-side technologies (preferably Java or Go), frontend technologies (HTML/CSS, native JS, React, Angular, etc.) and testing frameworks and automation (PHPUnit, Codeception, Behat, Selenium, WebDriver, etc.)
• Passion for quality and engineering excellence at scale

What We Would Love to See:

• Exposure to Big Data-related technologies such as Hadoop, Spark, Cassandra, MapReduce or NoSQL, and to data management, data retrieval, data quality, ETL, and data analysis
• Familiarity with containerized deployments and cloud computing platforms (AWS, Azure, GCP)

Get to hear about interesting companies hiring right now

Follow Cutshort

Why apply via Cutshort?

Connect with actual hiring teams and get their fast response. No spam.

Find more jobs
