HDFS Jobs in Chennai


Apply to 5+ HDFS Jobs in Chennai on CutShort.io. Explore the latest HDFS Job opportunities across top companies like Google, Amazon & Adobe.

Sahaj AI Software


1 video
6 recruiters
Posted by Priya R
Chennai, Bengaluru (Bangalore), Pune
8 - 14 yrs
Best in industry
Data engineering
Python
Scala
Databricks
Apache Spark
+3 more

About Us

Sahaj Software is an artisanal software engineering firm built on the values of trust, respect, curiosity, and craftsmanship, delivering purpose-built solutions that drive data-led transformation for organisations. Our emphasis is on craft: we create purpose-built solutions, leveraging Data Engineering, Platform Engineering, and Data Science with a razor-sharp focus to solve complex business and technology challenges and give customers a competitive edge.


About The Role

As a Data Engineer, you’ll feel at home if you are hands-on, grounded, opinionated and passionate about delivering comprehensive data solutions that align with modern data architecture approaches. Your work will range from building a full data platform to building data pipelines or helping with data architecture and strategy. This role is ideal for those looking to have a large impact and huge scope for growth, while still being hands-on with technology. We aim to allow growth without becoming “post-technical”.


Responsibilities

  • Collaborate with Data Scientists and Engineers to deliver production-quality AI and Machine Learning systems
  • Build frameworks and supporting tooling for data ingestion from a complex variety of sources (a minimal ingestion sketch follows this list)
  • Consult with our clients on data strategy, modernising their data infrastructure, architecture and technology
  • Model their data for increased visibility and performance
  • You will be given ownership of your work, and are encouraged to propose alternatives and make a case for doing things differently; our clients trust us and we manage ourselves.
  • You will work in short sprints to deliver working software
  • You will work with other data engineers at Sahaj and help build Data Engineering capability across the organisation
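
Purely as an illustration of the ingestion tooling mentioned in the list above, here is a minimal PySpark sketch that normalises two hypothetical sources (a CSV export and a JSON feed) into one partitioned Parquet dataset; the paths, columns, and target layout are illustrative assumptions, not part of the role.

```python
# Minimal multi-source ingestion sketch (PySpark); all paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("multi-source-ingestion").getOrCreate()

# Source 1: a CSV export with a header row.
orders_csv = (
    spark.read.option("header", True)
    .csv("/data/raw/orders_export/*.csv")
    .withColumn("source", F.lit("csv_export"))
)

# Source 2: a newline-delimited JSON feed carrying the same logical fields.
orders_json = (
    spark.read.json("/data/raw/orders_feed/*.json")
    .withColumn("source", F.lit("json_feed"))
)

# Align both sources to a shared column set before unioning.
columns = ["order_id", "customer_id", "amount", "order_date", "source"]
unified = orders_csv.select(*columns).unionByName(orders_json.select(*columns))

# Write a partitioned Parquet dataset for downstream consumers.
(
    unified.withColumn("order_date", F.to_date("order_date"))
    .write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("/data/curated/orders")
)
```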


You can read more about what we do and how we think here: https://sahaj.ai/client-stories/


Skills you’ll need

  • Demonstrated experience as a Senior Data Engineer in complex enterprise environments
  • Deep understanding of technology fundamentals and experience with languages like Python, or functional programming languages like Scala
  • Demonstrated experience in the design and development of big data applications using tech stacks like Databricks, Apache Spark, HDFS, HBase and Snowflake
  • Strong skills in building data products by integrating large sets of data from hundreds of internal and external sources
  • A nuanced understanding of code quality, maintainability and practices like Test Driven Development (a small test sketch follows this list)
  • Ability to deliver an application end to end, with an opinion on how your code should be built, packaged and deployed using CI/CD
  • Understanding of Cloud platforms, DevOps, GitOps, and Containers
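
As a small illustration of the Test Driven Development practice referenced above, a unit test for a Spark transformation might look like the sketch below; the function under test, its columns, and the tax rate are hypothetical.

```python
# Hypothetical Spark transformation with a pytest-style unit test, written test-first.
import pytest
from pyspark.sql import SparkSession, functions as F


def add_total_with_tax(df, tax_rate=0.18):
    """Append a total_with_tax column; assumes the frame has an 'amount' column."""
    return df.withColumn(
        "total_with_tax", F.round(F.col("amount") * (1 + tax_rate), 2)
    )


@pytest.fixture(scope="module")
def spark():
    session = (
        SparkSession.builder.master("local[1]").appName("tdd-sketch").getOrCreate()
    )
    yield session
    session.stop()


def test_add_total_with_tax(spark):
    df = spark.createDataFrame([(1, 100.0), (2, 250.0)], ["order_id", "amount"])
    result = {
        row["order_id"]: row["total_with_tax"]
        for row in add_total_with_tax(df).collect()
    }
    assert result == {1: 118.0, 2: 295.0}
```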


What will you experience as a culture at Sahaj?

At Sahaj, our people collectively stand for a shared purpose where everyone owns the dreams, ideas, ideologies, successes, and failures of the organisation - a synergy rooted in the ethos of honesty, respect, trust, and equitability. At Sahaj, you will experience:

  • Creativity
  • Ownership
  • Curiosity
  • Craftsmanship
  • A culture of trust, respect and transparency
  • Opportunity to collaborate with some of the finest minds in the industry
  • Work across multiple domains


What are the benefits of being at Sahaj?

  •  Unlimited leaves
  •  Life Insurance & Private Health insurance paid by Sahaj
  • Stock options
  • No hierarchy
  • Open Salaries 


Mobile Programming LLC


1 video
34 recruiters
Posted by Sukhdeep Singh
Chennai
4 - 7 yrs
₹13L - ₹15L / yr
Data Analytics
Data Visualization
PowerBI
Tableau
Qlikview
+10 more

Title: Platform Engineer
Location: Chennai
Work Mode: Hybrid (Remote and Chennai Office)
Experience: 4+ years
Budget: 16 - 18 LPA

Responsibilities:

  • Parse data using Python, create dashboards in Tableau.
  • Utilize Jenkins for Airflow pipeline creation and CI/CD maintenance.
  • Migrate Datastage jobs to Snowflake, optimize performance.
  • Work with HDFS, Hive, Kafka, and basic Spark.
  • Develop Python scripts for data parsing, quality checks, and visualization.
  • Conduct unit testing and web application testing.
  • Implement Apache Airflow and handle production migration (a minimal DAG sketch follows this list).
  • Apply data warehousing techniques for data cleansing and dimensional modeling.
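
As a rough sketch of the Airflow work referenced in the list above, the DAG below chains a hypothetical parsing step and a hypothetical Snowflake load; the DAG id, schedule, and callables are illustrative assumptions.

```python
# Minimal Airflow DAG sketch; the DAG id, schedule, and callables are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def parse_source_files(**context):
    # Placeholder for the Python parsing / data-quality logic.
    print("parsing source files")


def load_to_snowflake(**context):
    # Placeholder for the Snowflake load step (e.g. via a Snowflake connector).
    print("loading parsed data into Snowflake")


with DAG(
    dag_id="daily_parse_and_load",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    parse = PythonOperator(
        task_id="parse_source_files", python_callable=parse_source_files
    )
    load = PythonOperator(
        task_id="load_to_snowflake", python_callable=load_to_snowflake
    )

    parse >> load
```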

Requirements:

  • 4+ years of experience as a Platform Engineer.
  • Strong Python skills, knowledge of Tableau.
  • Experience with Jenkins, Snowflake, HDFS, Hive, and Kafka.
  • Proficient in Unix Shell Scripting and SQL.
  • Familiarity with ETL tools like DataStage and DMExpress.
  • Understanding of Apache Airflow.
  • Strong problem-solving and communication skills.

Note: Only candidates willing to work in Chennai and available for immediate joining will be considered. Budget for this position is 16 - 18 LPA.

netmedscom


3 recruiters
Posted by Vijay Hemnath
Chennai
2 - 5 yrs
₹6L - ₹25L / yr
Big Data
Hadoop
Apache Hive
Scala
Spark
+12 more

We are looking for an outstanding Big Data Engineer with experience setting up and maintaining Data Warehouses and Data Lakes for an organization. This role closely collaborates with the Data Science team and assists them in building and deploying machine learning and deep learning models on big data analytics platforms.

Roles and Responsibilities:

  • Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
  • Develop programs in Scala and Python as part of data cleaning and processing.
  • Assemble large, complex data sets that meet functional / non-functional business requirements and foster data-driven decision making across the organization.
  • Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
  • Implement processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available for key stakeholders and the business processes that depend on it (a small validation sketch follows this list).
  • Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Provide high operational excellence guaranteeing high availability and platform stability.
  • Closely collaborate with the Data Science team and assist them in building and deploying machine learning and deep learning models on big data analytics platforms.
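
As a minimal illustration of the data-validation responsibility referenced above, the PySpark sketch below counts a few basic rule violations and fails loudly when any are found; the dataset, columns, and rules are hypothetical.

```python
# Minimal data-quality check sketch (PySpark); dataset, columns, and rules are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("data-quality-check").getOrCreate()

orders = spark.read.parquet("/data/curated/orders")

total = orders.count()
failures = {
    "null_order_id": orders.filter(F.col("order_id").isNull()).count(),
    "negative_amount": orders.filter(F.col("amount") < 0).count(),
    "duplicate_order_id": total - orders.dropDuplicates(["order_id"]).count(),
}

# Fail loudly so an orchestrator (Airflow, cron, etc.) can alert on it.
violations = {rule: count for rule, count in failures.items() if count > 0}
if violations:
    raise ValueError(f"Data-quality check failed on {total} rows: {violations}")
print(f"Data-quality check passed on {total} rows")
```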

Skills:

  • Experience with Big Data pipeline, Big Data analytics, Data warehousing.
  • Experience with SQL/NoSQL, schema design and dimensional data modeling.
  • Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with a Big Data technology stack such as HBase, Hadoop, Hive, and MapReduce.
  • Experience in designing systems that process structured as well as unstructured data at large scale.
  • Experience in AWS/Spark/Java/Scala/Python development.
  • Strong skills in PySpark (Python & Spark): the ability to create, manage, and manipulate Spark DataFrames, and expertise in Spark query tuning and performance optimization (a short tuning sketch follows this list).
  • Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
  • Prior exposure to streaming data sources such as Kafka.
  • Knowledge of shell scripting and Python scripting.
  • High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
  • Experience with NoSQL databases such as Cassandra / MongoDB.
  • Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
  • Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
  • Experience building and deploying applications on on-premise and cloud-based infrastructure.
  • A good understanding of the machine learning landscape and concepts. 
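
As a short illustration of the DataFrame manipulation and query-tuning skills listed above, the sketch below shows two broadly applicable techniques, a broadcast join for a small dimension table and explicit repartitioning before a wide aggregation; the datasets and sizes are hypothetical.

```python
# Sketch of two common Spark tuning techniques; the datasets and sizes are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

orders = spark.read.parquet("/data/curated/orders")        # large fact table
customers = spark.read.parquet("/data/curated/customers")  # small dimension table

# 1. Broadcast the small dimension table to avoid a shuffle-heavy join.
enriched = orders.join(broadcast(customers), on="customer_id", how="left")

# 2. Repartition on the grouping key before a wide aggregation to balance tasks.
daily_revenue = (
    enriched.repartition("order_date")
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# Inspect the physical plan to confirm the broadcast join was picked up.
daily_revenue.explain()
```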

 

Qualifications and Experience:

Engineering graduates or postgraduates, preferably in Computer Science from premier institutions, with 3-5 years of proven work experience as a Big Data Engineer or in a similar role.

Certifications:

Good to have at least one of the Certifications listed here:

  • AZ-900 - Azure Fundamentals
  • DP-200, DP-201, DP-203, AZ-204 - Data Engineering
  • AZ-400 - DevOps Certification

Indium Software


16 recruiters
Posted by Ivarajneasan S K
Chennai
9 - 14 yrs
₹12L - ₹18L / yr
Apache Hadoop
Hadoop
Cloudera
HDFS
MapReduce
+2 more
  • Deploy and maintain Hadoop clusters: add and remove nodes using cluster monitoring tools like Ganglia, Nagios, or Cloudera Manager, configure NameNode high availability, and keep track of all running Hadoop jobs.
  • Good understanding of, or hands-on experience with, Kafka administration / Apache Kafka Streaming.
  • Implement, manage, and administer the overall Hadoop infrastructure.
  • Take care of the day-to-day running of Hadoop clusters.
  • Work closely with the database, network, BI, and application teams to make sure that all big data applications are highly available and performing as expected.
  • When working with the open-source Apache distribution, Hadoop admins have to set up all the configurations (core-site, hdfs-site, yarn-site, and mapred-site) manually. With popular Hadoop distributions such as Hortonworks, Cloudera, or MapR, the configuration files are set up on startup and the admin need not configure them manually.
  • Handle capacity planning: estimate the requirements for lowering or increasing the capacity of the Hadoop cluster, and decide the size of the cluster based on the data to be stored in HDFS.
  • Ensure that the Hadoop cluster is up and running at all times.
  • Monitor cluster connectivity and performance (a small monitoring sketch follows this list).
  • Manage and review Hadoop log files.
  • Backup and recovery tasks.
  • Resource and security management.
  • Troubleshoot application errors and ensure that they do not recur.
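
As a rough illustration of the monitoring duty above, the Python sketch below shells out to the standard hdfs dfsadmin -report command and flags dead DataNodes; the report's exact wording varies across Hadoop versions, so the parsing is an approximation.

```python
# Rough cluster-health sketch: parse the output of `hdfs dfsadmin -report`.
# The report's exact wording differs between Hadoop versions; adjust the
# patterns below to match your distribution.
import re
import subprocess
import sys

report = subprocess.run(
    ["hdfs", "dfsadmin", "-report"],
    capture_output=True,
    text=True,
    check=True,
).stdout

live = re.search(r"Live datanodes\s*\((\d+)\)", report)
dead = re.search(r"Dead datanodes\s*\((\d+)\)", report)

live_count = int(live.group(1)) if live else 0
dead_count = int(dead.group(1)) if dead else 0

print(f"Live DataNodes: {live_count}, Dead DataNodes: {dead_count}")

# Exit non-zero so a cron job or alerting wrapper can page on dead nodes.
if dead_count > 0:
    sys.exit(1)
```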
GeakMinds Technologies Pvt Ltd
Posted by John Richardson
Chennai
1 - 5 yrs
₹1L - ₹6L / yr
Hadoop
Big Data
HDFS
Apache Sqoop
Apache Flume
+2 more
  • Looking for a Big Data Engineer with 3+ years of experience.
  • Hands-on experience with MapReduce-based platforms like Pig, Spark, and Shark.
  • Hands-on experience with data pipeline tools like Kafka, Storm, and Spark Streaming.
  • Store and query data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto.
  • Hands-on experience in managing Big Data on a cluster with HDFS and MapReduce.
  • Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm.
  • Experience with Azure cloud, Cognitive Services, and Databricks is preferred.