Apache Spark Jobs in Hyderabad


Apply to 10+ Apache Spark Jobs in Hyderabad on CutShort.io. Explore the latest Apache Spark Job opportunities across top companies like Google, Amazon & Adobe.

Frisco Analytics Pvt Ltd
Posted by Cedrick Mariadas
Bengaluru (Bangalore), Hyderabad
5 - 8 yrs
₹15L - ₹20L / yr
Databricks
Apache Spark
Python
SQL
MySQL

We are actively seeking a self-motivated Data Engineer with expertise in Azure cloud and Databricks and a thorough understanding of Delta Lake and Lakehouse architecture. The ideal candidate should excel in developing scalable data solutions, crafting platform tools, and integrating systems, while demonstrating proficiency in cloud-native database solutions and distributed data processing.


Key Responsibilities:

  • Contribute to the development and upkeep of a scalable data platform, incorporating tools and frameworks that leverage Azure and Databricks capabilities.
  • Exhibit proficiency in various RDBMS databases such as MySQL and SQL Server, with an emphasis on their integration in applications and pipeline development.
  • Design and maintain high-caliber code, including data pipelines and applications, utilizing Python, Scala, and PHP.
  • Implement effective data processing solutions via Apache Spark, optimizing Spark applications for large-scale data handling.
  • Optimize data storage using formats like Parquet and Delta Lake to ensure efficient data accessibility and reliable performance (see the illustrative sketch after this list).
  • Demonstrate understanding of Hive Metastore, Unity Catalog Metastore, and the operational dynamics of external tables.
  • Collaborate with diverse teams to convert business requirements into precise technical specifications.
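
As an illustration of the storage-optimization work described above, here is a minimal PySpark sketch of writing curated data to a partitioned Delta table. The paths and column names are hypothetical, and the `delta` format assumes a Databricks-style environment:

```python
# Illustrative only: paths and column names are hypothetical; the `delta`
# format assumes a Databricks-style environment where it is available.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-demo").getOrCreate()

# Read raw events from Parquet (hypothetical path).
raw = spark.read.parquet("/mnt/raw/events")

# Basic cleanup before persisting to the curated layer.
curated = raw.dropDuplicates(["event_id"]).filter("event_ts IS NOT NULL")

# Write as a Delta table, partitioned by date for efficient access.
(curated.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("/mnt/curated/events"))
```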

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or a related discipline.
  • Demonstrated hands-on experience with Azure cloud services and Databricks.
  • Proficient programming skills in Python, Scala, and PHP.
  • In-depth knowledge of SQL, NoSQL databases, and data warehousing principles.
  • Familiarity with distributed data processing and external table management.
  • Insight into enterprise data solutions for PIM, CDP, MDM, and ERP applications.
  • Exceptional problem-solving acumen and meticulous attention to detail.

Additional Qualifications:

  • Acquaintance with data security and privacy standards.
  • Experience in CI/CD pipelines and version control systems, notably Git.
  • Familiarity with Agile methodologies and DevOps practices.
  • Competence in technical writing for comprehensive documentation.


Publicis Sapient
Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark

Publicis Sapient Overview:

As a Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As a Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python; experience in data ingestion, integration, wrangling, computation, and analytics pipelines; and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is also required.


Role & Responsibilities:

Your role is focused on the design, development, and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch and real-time modes (see the streaming sketch after this list)

• Build functionality for data analytics, search and aggregation
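
As an illustration of the real-time ingestion work listed above, here is a minimal PySpark Structured Streaming sketch that reads JSON events from Kafka and lands them as Parquet. The broker address, topic, and schema are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath:

```python
# Illustrative only: broker, topic, and schema are hypothetical; the
# spark-sql-kafka connector is assumed to be on the classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read a stream of JSON messages from Kafka and parse the payload.
orders = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("o"))
    .select("o.*"))

# Land the parsed stream as Parquet, with checkpointing for fault tolerance.
query = (orders.writeStream
    .format("parquet")
    .option("path", "/data/landing/orders")
    .option("checkpointLocation", "/data/checkpoints/orders")
    .start())

query.awaitTermination()
```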

Experience Guidelines:

Mandatory Experience and Competencies:

1. Overall 5+ years of IT experience, with 3+ years in data-related technologies.

2. Minimum 2.5 years of experience in Big Data technologies and working exposure to related data services on at least one cloud platform (AWS / Azure / GCP).

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow, and the other components required to build end-to-end data pipelines.

4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable.

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.

6. Well-versed, working knowledge of data-platform-related services on at least one cloud platform, plus IAM and data security.


Preferred Experience and Knowledge (Good to Have):

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience.

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search and indexing, and microservices architectures.

4. Performance tuning and optimization of data pipelines.

5. CI/CD – infra provisioning on cloud, automated build and deployment pipelines, code quality.

6. Cloud data specialty and other related Big Data technology certifications.


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


A Leading US-Based MNC
Agency job via Zeal Consultants
Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
"DATA STREAMING"

Data Engineering: Senior Engineer / Manager


As a Senior Engineer / Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and will independently drive design discussions to ensure the necessary health of the overall solution.


Must-Have Skills:


1. GCP


2. Spark Streaming: live data streaming experience is desired.


3. Any one coding language: Java / Python / Scala



Skills & Experience:


- Overall experience of a minimum of 5 years, with at least 4 years of relevant experience in Big Data technologies.


- Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow, and the other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.


- Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable.


- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.


- Well-versed, working knowledge of data-platform-related services on GCP (see the illustrative sketch after this list).


- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training, and/or experience that demonstrates the ability to perform the duties of the position.
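
As an illustration of Spark working against GCP data services, here is a minimal PySpark sketch using the spark-bigquery connector (assumed to be on the classpath); the project, dataset, and bucket names are hypothetical:

```python
# Illustrative only: project, dataset, and bucket names are hypothetical;
# the spark-bigquery connector is assumed to be on the classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("gcp-bq-demo").getOrCreate()

# Read a BigQuery table into a Spark DataFrame.
events = (spark.read
    .format("bigquery")
    .option("table", "my-project.analytics.events")
    .load())

# A simple daily aggregation before writing results back to BigQuery.
daily = events.groupBy("event_date").count()

# Writes stage through a GCS bucket before loading into BigQuery.
(daily.write
    .format("bigquery")
    .option("table", "my-project.analytics.daily_event_counts")
    .option("temporaryGcsBucket", "my-temp-bucket")
    .mode("overwrite")
    .save())
```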


Your Impact:


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation

[x]cube LABS
Posted by Krishna kandregula
Hyderabad
2 - 6 yrs
₹8L - ₹20L / yr
ETL
Informatica
Data Warehouse (DWH)
PowerBI
DAX
  • Create and manage ETL/ELT pipelines based on requirements.
  • Build PowerBI dashboards and manage the datasets they need.
  • Work with stakeholders to identify data structures needed for the future and perform any transformations, including aggregations.
  • Build data cubes for real-time visualisation needs and CXO dashboards (see the illustrative sketch after this list).
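
As an illustration of the dashboard-feeding aggregations described above, here is a minimal pandas/PyArrow sketch; the file, table, and column names are hypothetical:

```python
# Illustrative only: file and column names are hypothetical.
import pandas as pd
import pyarrow.parquet as pq

# Load raw sales records from Parquet via PyArrow.
sales = pq.read_table("sales.parquet").to_pandas()

# Aggregate to the monthly grain a CXO dashboard typically needs.
monthly = (sales
    .assign(month=pd.to_datetime(sales["order_date"]).dt.strftime("%Y-%m"))
    .groupby(["month", "region"], as_index=False)
    .agg(revenue=("amount", "sum"), orders=("order_id", "nunique")))

# Persist the cube-like summary for the BI layer to pick up as a dataset.
monthly.to_parquet("monthly_sales_cube.parquet", index=False)
```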


Required Tech Skills


  • Microsoft PowerBI & DAX
  • Python, Pandas, PyArrow, Jupyter Notebooks, Apache Spark
  • Azure Synapse, Azure DataBricks, Azure HDInsight, Azure Data Factory



Accolite Digital
Posted by Nitesh Parab
Bengaluru (Bangalore), Hyderabad, Gurugram, Delhi, Noida, Ghaziabad, Faridabad
4 - 8 yrs
₹5L - ₹15L / yr
ETL
Informatica
Data Warehouse (DWH)
SSIS
SQL Server Integration Services (SSIS)

Job Title: Data Engineer

Job Summary: As a Data Engineer, you will be responsible for designing, building, and maintaining the infrastructure and tools necessary for data collection, storage, processing, and analysis. You will work closely with data scientists and analysts to ensure that data is available, accessible, and in a format that can be easily consumed for business insights.

Responsibilities:

  • Design, build, and maintain data pipelines to collect, store, and process data from various sources (see the illustrative sketch after this list).
  • Create and manage data warehousing and data lake solutions.
  • Develop and maintain data processing and data integration tools.
  • Collaborate with data scientists and analysts to design and implement data models and algorithms for data analysis.
  • Optimize and scale existing data infrastructure to ensure it meets the needs of the business.
  • Ensure data quality and integrity across all data sources.
  • Develop and implement best practices for data governance, security, and privacy.
  • Monitor data pipeline performance and errors, and troubleshoot issues as needed.
  • Stay up-to-date with emerging data technologies and best practices.
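
As an illustration of the pipeline work described above, here is a minimal PySpark sketch of a batch ETL step that extracts from a relational source over JDBC and lands Parquet; the connection details and column names are hypothetical placeholders, and the JDBC driver is assumed to be on the classpath:

```python
# Illustrative only: connection details, credentials, and column names are
# hypothetical; the JDBC driver jar is assumed to be on the classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch-etl").getOrCreate()

# Extract: read a source table over JDBC.
customers = (spark.read
    .format("jdbc")
    .option("url", "jdbc:postgresql://dbhost:5432/sales")
    .option("dbtable", "public.customers")
    .option("user", "etl_user")
    .option("password", "***")
    .load())

# Transform: light cleanup and renaming.
cleaned = (customers
    .dropDuplicates(["customer_id"])
    .withColumnRenamed("cust_nm", "customer_name"))

# Load: write to the warehouse landing zone as Parquet.
cleaned.write.mode("overwrite").parquet("/warehouse/landing/customers")
```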

Requirements:

Bachelor's degree in Computer Science, Information Systems, or a related field.

Experience with ETL tools like Matillion, SSIS, or Informatica.

Experience with SQL and relational databases such as SQL Server, MySQL, PostgreSQL, or Oracle.

Experience in writing complex SQL queries

Strong programming skills in languages such as Python, Java, or Scala.

Experience with data modeling, data warehousing, and data integration.

Strong problem-solving skills and ability to work independently.

Excellent communication and collaboration skills.

Familiarity with big data technologies such as Hadoop, Spark, or Kafka.

Familiarity with data warehouse / data lake technologies like Snowflake or Databricks.

Familiarity with cloud computing platforms such as AWS, Azure, or GCP.

Familiarity with reporting tools.

Teamwork / growth contribution

  • Helping the team conduct interviews and identify the right candidates
  • Adhering to timelines
  • On-time status communication and upfront communication of any risks
  • Teach, train, and share knowledge with peers
  • Good communication skills
  • Proven ability to take initiative and be innovative
  • Analytical mind with a problem-solving aptitude

Good to have:

Master's degree in Computer Science, Information Systems, or a related field.

Experience with NoSQL databases such as MongoDB or Cassandra.

Familiarity with data visualization and business intelligence tools such as Tableau or Power BI.

Knowledge of machine learning and statistical modeling techniques.

If you are passionate about data and want to work with a dynamic team of data scientists and analysts, we encourage you to apply for this position.

Hammoq
Posted by Nikitha Muthuswamy
Remote, Indore, Ujjain, Hyderabad, Bengaluru (Bangalore)
5 - 8 yrs
₹5L - ₹15L / yr
pandas
NumPy
Data engineering
Data Engineer
Apache Spark
  • Performs analytics to extract insights from the organization's raw historical data.
  • Generates usable training datasets for any/all MV projects with the help of annotators, if needed.
  • Analyses user trends and identifies their biggest bottlenecks in the Hammoq workflow.
  • Tests the short- and long-term impact of productized MV models on those trends.
  • Skills: NumPy, Pandas, Spark, Apache Spark, PySpark, and ETL are mandatory.
DataMetica
Posted by Sumangali Desai
Pune, Hyderabad
7 - 12 yrs
₹7L - ₹20L / yr
Apache Spark
Big Data
Spark
Scala
Hadoop
We at Datametica Solutions Private Limited are looking for a Big Data Spark Lead with a passion for the cloud and knowledge of different on-premise and cloud data implementations in the field of Big Data and Analytics, including but not limited to Teradata, Netezza, Exadata, Oracle, Cloudera, Hortonworks, and the like.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.

Job Description
Experience: 7+ years
Location: Pune / Hyderabad
Skills:
  • Drive and participate in requirements gathering workshops, estimation discussions, design meetings, and status review meetings
  • Participate and contribute in solution design and solution architecture for implementing Big Data projects on-premise and on cloud
  • Technical hands-on experience in the design, coding, development, and management of large Hadoop implementations
  • Proficient in SQL, Hive, Pig, Spark SQL, shell scripting, Kafka, Flume, and Sqoop on large Big Data and Data Warehousing projects, with a Java-, Python-, or Scala-based Hadoop programming background
  • Proficient with various development methodologies such as waterfall, agile/scrum, and iterative
  • Good interpersonal skills and excellent communication skills for US- and UK-based clients

About Us!
A global leader in Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their data, workloads, ETL, and analytics to the cloud by leveraging automation.

We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, and Greenplum systems, along with ETL tools like Informatica, DataStage, Ab Initio, and others, to cloud-based data warehousing, with further capabilities in data engineering, advanced analytics solutions, data management, data lakes, and cloud optimization.

Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.


We have our own products!
Eagle – Data Warehouse Assessment & Migration Planning product
Raven – Automated Workload Conversion product
Pelican – Automated Data Validation product, which helps automate and accelerate data migration to the cloud.

Why join us!
Datametica is a place to innovate, bring new ideas to life, and learn new things. We believe in building a culture of innovation, growth, and belonging. Our people and their dedication over the years are the key factors in achieving our success.

Benefits we provide!
Working with highly technical, passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy

Check out more about us on our website below!
www.datametica.com
SpringML
Posted by Kayal Vizhi
Hyderabad
4 - 11 yrs
₹8L - ₹20L / yr
Big Data
Hadoop
Apache Spark
Spark
Data Structures

SpringML is looking to hire a top-notch Senior Data Engineer who is passionate about working with data and using the latest distributed frameworks to process large datasets. Your primary role will be to design and build data pipelines. You will be focused on helping client projects with data integration, data prep, and implementing machine learning on datasets. In this role, you will work with some of the latest technologies, collaborate with partners on early wins, take a consultative approach with clients, interact daily with executive leadership, and help build a great company. Chosen team members will be part of the core team and play a critical role in scaling up our emerging practice.

RESPONSIBILITIES:

 

  • Ability to work as a member of a team assigned to design and implement data integration solutions.
  • Build data pipelines using standard frameworks in Hadoop, Apache Beam, and other open-source solutions (see the illustrative sketch after this list).
  • Learn quickly – the ability to understand and rapidly comprehend new areas, functional and technical, and apply detailed and critical thinking to customer solutions.
  • Propose design solutions and recommend best practices for large-scale data analysis.
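
As an illustration of the pipeline work described above, here is a minimal Apache Beam (Python SDK) sketch that reads a CSV, aggregates per key, and writes the result; the file names and record layout are hypothetical:

```python
# Illustrative only: file names and record layout are hypothetical.
import apache_beam as beam

# Runs with the local DirectRunner by default.
with beam.Pipeline() as p:
    (p
     | "Read" >> beam.io.ReadFromText("input.csv")
     | "Parse" >> beam.Map(lambda line: line.split(","))
     | "KeyByRegion" >> beam.Map(lambda fields: (fields[0], float(fields[1])))
     | "SumPerRegion" >> beam.CombinePerKey(sum)
     | "Format" >> beam.MapTuple(lambda region, total: f"{region},{total}")
     | "Write" >> beam.io.WriteToText("region_totals"))
```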

 

SKILLS:

 

  • B.Tech degree in Computer Science, Mathematics, or other relevant fields.
  • 4+ years of experience in ETL, data warehousing, visualization, and building data pipelines.
  • Strong programming skills – experience and expertise in one of the following: Java, Python, Scala, C.
  • Proficient in big data / distributed computing frameworks such as Apache Spark and Kafka.
  • Experience with Agile implementation methodologies.
Service-based company
Agency job via Myna Solutions, posted by Preethi M
Hyderabad
5 - 9 yrs
₹12L - ₹14L / yr
ETL
Snowflake
Data Warehouse (DWH)
Data Warehousing
Apache Spark
Overall experience of 4–8 years in DW / BI technologies.
Minimum 2 years of work experience on Snowflake and Azure storage (see the illustrative sketch after this list).
Minimum 3 years of development experience with an ETL tool.
Strong SQL skills in other databases such as Oracle, SQL Server, DB2, and Teradata.
Hadoop and Spark experience is good to have.
Good conceptual knowledge of data warehousing and its various methodologies.
Working knowledge of scripting, such as UNIX shell scripting.
Good presentation and communication skills.
Should be flexible with overlapping working hours.
Should be able to work independently and be proactive.
Good understanding of the Agile development cycle.
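
As an illustration of the Snowflake work described above, here is a minimal sketch using the snowflake-connector-python package; the account, credentials, stage, and table names are hypothetical placeholders:

```python
# Illustrative only: account, credentials, stage, and table names are
# hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.east-us-2.azure",
    user="etl_user",
    password="***",
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # Load staged files from an Azure-backed external stage into a table.
    cur.execute("COPY INTO staging.orders FROM @azure_stage/orders/")
    # A simple validation query after the load.
    cur.execute("SELECT COUNT(*) FROM staging.orders")
    print("rows loaded:", cur.fetchone()[0])
finally:
    conn.close()
```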
Milestone Hr Consultancy
Posted by Jyoti Sharma
Remote, Hyderabad
3 - 8 yrs
₹6L - ₹16L / yr
Python
Django
Data engineering
Apache Hive
Apache Spark
We are currently looking for passionate Data Engineers to join our team and mission. In this role, you will help doctors from across the world improve care and save lives by helping extract insights and predict risk. Our Data Engineers ensure that data are ingested and prepared, ready for insights and intelligence to be derived from them. We're looking for smart individuals to join our incredibly talented team, which is on a mission to transform healthcare.

As a Data Engineer you will be engaged in some or all of the following activities:

  • Implement, test, and deploy distributed data ingestion, data processing, and feature engineering systems computing on large volumes of healthcare data, using a variety of open-source and proprietary technologies.
  • Design data architectures and schemas optimized for analytics and machine learning.
  • Implement telemetry to monitor the performance and operations of data pipelines.
  • Develop tools and libraries to implement and manage data processing pipelines, including ingestion, cleaning, transformation, and feature computation.
  • Work with large data sets, and integrate diverse data sources, data types, and data structures.
  • Work with Data Scientists, Machine Learning Engineers, and Visualization Engineers to understand data requirements and translate them into production-ready data pipelines.
  • Write and automate unit, functional, integration, and performance tests in a Continuous Integration environment.
  • Take initiative to find solutions to technical challenges for healthcare data.

You are a great match if you have some or all of the following skills and qualifications:

  • Strong understanding of database design and feature engineering to support machine learning and analytics.
  • At least 3 years of industry experience building, testing, and deploying large-scale, distributed data processing systems.
  • Proficiency in working with multiple data processing tools and query languages (Python, Spark, SQL, etc.).
  • Excellent understanding of distributed computing concepts and Big Data technologies (Spark, Hive, etc.).
  • Proficiency in performance tuning and optimization of data processing pipelines.
  • Attention to detail and a focus on software quality, with experience in software testing.
  • Strong cross-discipline communication skills and teamwork.
  • Demonstrated clear and thorough logical and analytical thinking, as well as problem-solving skills.
  • Bachelor's or Master's in Computer Science or a related field.

Skills: Apache Spark, Python, Hive, SQL
Role: Senior Data Engineer
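
As an illustration of the feature-engineering work this role describes, here is a minimal PySpark sketch assuming a Hive-backed environment; the database, table, and column names are hypothetical:

```python
# Illustrative only: database, table, and column names are hypothetical;
# assumes a Spark deployment configured with a Hive metastore.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
    .appName("feature-engineering")
    .enableHiveSupport()
    .getOrCreate())

# Pull raw encounter records from a Hive table.
encounters = spark.table("healthcare_db.encounters")

# Compute simple per-patient features for downstream ML.
features = (encounters
    .groupBy("patient_id")
    .agg(
        F.count("*").alias("encounter_count"),
        F.avg("length_of_stay").alias("avg_length_of_stay"),
        F.max("admit_date").alias("last_admit_date")))

# Persist the features back to Hive for the ML team.
features.write.mode("overwrite").saveAsTable("healthcare_db.patient_features")
```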