9+ Apache Spark Jobs in Chennai | Apache Spark Job openings in Chennai
Key Responsibilities
Architect and implement enterprise-grade Lakehouse solutions using Databricks
Design and deliver scalable batch and real-time data pipelines using Apache Spark (PySpark/SQL)
Build ETL/ELT pipelines, incremental data loads, and metadata-driven ingestion frameworks
Implement and optimize Databricks components: Delta Lake, Delta Live Tables, Autoloader, Structured Streaming, and Workflows
Design large-scale data warehousing solutions with 3NF and dimensional modeling
Establish data governance, security, and data quality frameworks, including Unity Catalog
Lead ML lifecycle management using MLflow and drive AI use cases (RAG, AI/BI)
Manage cloud-native deployments on Microsoft Azure and integrate with enterprise systems (e.g., ServiceNow)
Drive CI/CD, DevOps practices, and performance optimization of Spark workloads
Provide technical leadership, mentor teams, and ensure successful delivery
Collaborate with stakeholders to translate business requirements into scalable solutions
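As a quick illustration of the "metadata-driven ingestion frameworks" and "incremental data loads" named in the responsibilities above, here is a minimal sketch in plain Python. All table names and config fields are hypothetical; a production framework would translate such metadata into Spark/Auto Loader jobs rather than strings.

```python
# Minimal metadata-driven ingestion sketch: each source table is described
# by a declarative config record, and one generic loader interprets it.
# Table names, fields, and the watermark convention are illustrative only.

INGESTION_CONFIG = [
    {"source": "sales_orders", "format": "csv", "load_type": "incremental",
     "watermark_column": "updated_at"},
    {"source": "customers", "format": "json", "load_type": "full"},
]

def build_load_plan(config):
    """Turn declarative metadata into concrete load steps."""
    plan = []
    for entry in config:
        step = f"load {entry['source']} ({entry['format']}, {entry['load_type']})"
        if entry["load_type"] == "incremental":
            # Incremental loads only pick up rows newer than the last watermark.
            step += f" where {entry['watermark_column']} > last_watermark"
        plan.append(step)
    return plan

for step in build_load_plan(INGESTION_CONFIG):
    print(step)
```

The point of the pattern is that onboarding a new source becomes a config change, not new pipeline code.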
Required Skills & Experience
10+ years in Data Engineering / Analytics / AI with strong delivery ownership
Deep expertise in Databricks ecosystem (Notebooks, Delta Lake, Workflows, AI/BI, Apps, Genie)
Strong hands-on experience with:
a. Apache Spark (performance tuning & scalability)
b. Python and SQL
Proven experience in:
a. Solution architecture and large-scale data platforms
b. Data warehousing and advanced data modeling
c. Batch and real-time processing systems
Experience with:
a. Azure Databricks and Azure data services
b. MLflow and MLOps practices
c. ServiceNow or enterprise integrations
Exposure to AI technologies (RAG, LLM-based solutions)
Strong stakeholder management and leadership skills
Certifications (Preferred)
Databricks certifications aligned to data engineering and AI tracks, such as:
a. Databricks Certified Data Engineer Associate (validates foundational ETL, Spark, and Lakehouse capabilities)
b. Databricks Certified Data Engineer Professional (advanced expertise in pipeline design, optimization, and governance)
Certifications in Databricks Machine Learning or Generative AI tracks (e.g., ML Associate / Professional) for AI-driven use cases
Relevant cloud certifications in Microsoft Azure or Amazon Web Services for platform deployment and architecture
Job Title: Senior Specialist – Software Engineering (P4)
Experience: 8–12 Years
Location: Pune / Chennai / Kolkata / Bhubaneswar
Compensation: Up to 36 LPA
Job Description:
We are looking for an experienced Senior Specialist to lead the design and development of scalable enterprise applications. The ideal candidate should have deep expertise in backend technologies, system design, and distributed systems.
Key Responsibilities:
- Design and develop scalable systems using Java & Spring Boot
- Architect and implement microservices-based solutions
- Build high-performance REST APIs
- Work with Kafka for event-driven architecture
- Handle large-scale data processing using Apache Spark
- Optimize database interactions using JPA/Hibernate
- Provide technical leadership and mentor junior engineers
- Collaborate across teams (frontend, DevOps, product)
- Drive system performance, security, and scalability
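The event-driven architecture bullet above can be sketched with a toy in-memory publish/subscribe example (shown in Python for brevity; topic and event names are made up, and a real system would use a Kafka client such as Spring Kafka rather than an in-process bus):

```python
# Minimal in-memory publish/subscribe sketch of the event-driven pattern
# the Kafka bullet refers to. Producers publish to a topic without knowing
# who consumes; handlers subscribe independently. Names are illustrative.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Deliver the event to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
received = []
bus.subscribe("orders.created", received.append)
bus.publish("orders.created", {"order_id": 42})
print(received)  # the subscriber saw the event
```

Kafka adds durability, partitioning, and consumer groups on top of this basic decoupling, but the producer/consumer contract is the same.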
Required Skills:
- Strong expertise in Java, Spring Boot
- Deep understanding of Microservices Architecture
- Experience with REST APIs & system design
- Hands-on with Kafka & distributed systems
- Experience with Apache Spark (big data)
- Working knowledge of Angular
- Exposure to AI tools / Prompt Engineering / Windsurf
Education:
- BE / B.Tech
Notice Period:
- Immediate to May joiners preferred
Technical Architect (Databricks)
- 10+ Years Data Engineering Experience with expertise in Databricks
- 3+ years of consulting experience
- Completed Data Engineering Professional certification & required classes
- Minimum 2-3 projects delivered with hands-on experience in Databricks
- Completed the Apache Spark Programming with Databricks, Data Engineering with Databricks, and Optimizing Apache Spark™ on Databricks courses
- Experience in Spark and/or Hadoop, Flink, Presto, or other popular big data engines
- Familiarity with Databricks multi-hop pipeline architecture
Sr. Data Engineer (Databricks)
- 5+ Years Data Engineering Experience with expertise in Databricks
- Completed Data Engineering Associate certification & required classes
- Minimum 1 project delivered with hands-on experience in development on Databricks
- Completed the Apache Spark Programming with Databricks, Data Engineering with Databricks, and Optimizing Apache Spark™ on Databricks courses
- SQL delivery experience, and familiarity with BigQuery, Synapse, or Redshift
- Proficient in Python, with knowledge of additional Databricks programming languages such as Scala
Job Summary:
Seeking an experienced Senior Data Engineer to lead data ingestion, transformation, and optimization initiatives using the modern Apache and Azure data stack. The role involves working on scalable pipelines, large-scale distributed systems, and data lake management.
Core Responsibilities:
· Build and manage high-volume data pipelines using Spark/Databricks.
· Implement ELT frameworks using Azure Data Factory/Synapse Pipelines.
· Optimize large-scale datasets in Delta/Iceberg formats.
· Implement robust data quality, monitoring, and governance layers.
· Collaborate with Data Scientists, Analysts, and Business stakeholders.
Technical Stack:
· Big Data: Apache Spark, Kafka, Hive, Airflow, Hudi/Iceberg
· Cloud: Azure (Synapse, ADF, ADLS Gen2), Databricks, AWS (Glue/S3)
· Languages: Python, Scala, SQL
· Storage Formats: Delta Lake, Iceberg, Parquet, ORC
· CI/CD: Azure DevOps, Terraform (infra as code), Git
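The "robust data quality, monitoring, and governance layers" responsibility above can be illustrated with a hedged pure-Python sketch. Column names and rules are hypothetical; in practice such checks would run on Spark DataFrames or be expressed as Delta Live Tables expectations.

```python
# Toy data-quality gate: validate rows against declarative rules before load,
# quarantining failures instead of letting them poison downstream tables.
# Rule names and sample records are illustrative only.

RULES = {
    "order_id": lambda v: v is not None,                            # completeness
    "amount":   lambda v: isinstance(v, (int, float)) and v >= 0,   # validity
}

def quality_check(rows, rules):
    """Split rows into (passed, failed) according to the rules."""
    passed, failed = [], []
    for row in rows:
        ok = all(rule(row.get(col)) for col, rule in rules.items())
        (passed if ok else failed).append(row)
    return passed, failed

rows = [
    {"order_id": 1, "amount": 99.5},
    {"order_id": None, "amount": 10.0},   # fails completeness
    {"order_id": 3, "amount": -5},        # fails validity
]
good, bad = quality_check(rows, RULES)
print(len(good), len(bad))  # 1 good row, 2 quarantined
```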
Senior Data Engineer (Apache Stack + Databricks/Synapse)
Share CV to: Thirega@ vysystems dot com (WhatsApp: 91Five0033Five2Three)
Data Engineer- Senior
Cubera is a data company revolutionizing big data analytics and AdTech through data-share-value principles, wherein users entrust their data to us. We refine the art of understanding, processing, extracting, and evaluating the data entrusted to us. We are a gateway for brands to increase their lead efficiency as the world moves towards Web3.
What are you going to do?
Design and develop high-performance, scalable solutions that meet the needs of our customers.
Work closely with Product Management, Architects, and cross-functional teams.
Build and deploy large-scale systems in Java/Python.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Create data tools for analytics and data scientist team members that assist them in building and optimizing their algorithms.
Follow best practices that can be adopted in the big data stack.
Use your engineering experience and technical skills to drive features and mentor engineers.
What are we looking for (Competencies):
Bachelor’s degree in computer science, computer engineering, or related technical discipline.
Overall 5 to 8 years of programming experience in Java and Python, including object-oriented design.
Data handling frameworks: Should have working knowledge of one or more data handling frameworks like Hive, Spark, Storm, Flink, Beam, Airflow, NiFi, etc.
Data Infrastructure: Should have experience in building, deploying, and maintaining applications on popular cloud infrastructure like AWS, GCP, etc.
Data Store: Must have expertise in one of the general-purpose NoSQL data stores like Elasticsearch, MongoDB, Redis, Redshift, etc.
Strong sense of ownership, focus on quality, responsiveness, efficiency, and innovation.
Ability to work with distributed teams in a collaborative and productive manner.
Benefits:
Competitive Salary Packages and benefits.
Collaborative, lively and an upbeat work environment with young professionals.
Job Category: Development
Job Type: Full Time
Job Location: Bangalore
Location: Chennai
Education: BE/BTech
Experience: 3+ years of experience as a Data Scientist/Data Engineer
Domain knowledge: Data cleaning, modelling, analytics, statistics, machine learning, AI
Requirements:
- To be part of Digital Manufacturing and Industrie 4.0 projects across client group of companies
- Design and develop AI/ML models to be deployed across factories
- Knowledge of Hadoop, Apache Spark, MapReduce, Scala, Python programming, and SQL and NoSQL databases is required
- Should be strong in statistics, data analysis, data modelling, machine learning techniques and Neural Networks
- Prior experience in developing AI and ML models is required
- Experience with data from the Manufacturing Industry would be a plus
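Of the statistical modelling techniques the requirements list, the simplest is a one-variable least-squares fit; a stdlib-only sketch follows (the data points and the energy-usage framing are made up for illustration, and real work here would use NumPy/scikit-learn):

```python
# Toy one-variable ordinary least squares: fit y = slope * x + intercept
# by minimizing squared error. Sample data is illustrative only.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
            / sum((x - mean_x) ** 2 for x in xs)
    # The fitted line always passes through the point of means.
    return slope, mean_y - slope * mean_x

# Energy-usage-style example: the points lie exactly on y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```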
Roles and Responsibilities:
- Develop AI and ML models for the Manufacturing Industry with a focus on Energy, Asset Performance Optimization and Logistics
- Multitasking and good communication skills are necessary
- Entrepreneurial attitude
Additional Information:
- Travel: Must be willing to travel for short durations within India and abroad
- Job Location: Chennai
- Reporting to: Team Leader, Energy Management System

American Multinational Retail Corp
Should have a passion to learn and adapt to new technologies, an understanding of how to solve and troubleshoot issues and risks, the ability to make informed decisions, and the ability to lead projects.
Your Qualifications
- 2-5 years' experience with functional programming
- Experience with functional programming using Scala with the Spark framework
- Strong understanding of Object-oriented programming, data structures and algorithms
- Good experience in any of the cloud platforms (Azure, AWS, GCP) etc.,
- Experience with distributed (multi-tiered) systems, relational databases, and NoSQL storage solutions
- Desire to learn new technologies and languages
- Participation in software design, development, and code reviews
- High level of proficiency with Computer Science/Software Engineering knowledge and contribution to the technical skills growth of other team members
Your Responsibility
- Design, build and configure applications to meet business process and application requirements
- Proactively identify and communicate potential issues and concerns and recommend/implement alternative solutions as appropriate.
- Troubleshooting and optimization of existing solutions
- Provide advice on technical design to ensure solutions are forward-looking and flexible for potential future requirements and business needs
- Must have experience leading teams and driving customer interactions
- Must have multiple successful deployments and delivered user stories
- Extensive hands-on experience in Apache Spark along with HiveQL
- Sound knowledge of Amazon Web Services or any other cloud environment
- Experienced in data flow orchestration using Apache Airflow
- Experience with JSON, XML, CSV, and Parquet file formats with Snappy compression
- Experience with file movements between HDFS and AWS S3
- Experience in shell scripting and scripting to automate report generation and migration of reports to AWS S3
- Worked on building a data pipeline using Pandas and the Flask framework
- Good familiarity with Anaconda and Jupyter Notebook
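The file-format bullets above can be illustrated with a minimal stdlib-only sketch of one format hop, CSV to JSON (file contents and field names are invented; a production report-migration script would typically use Pandas, and Parquet/Snappy via pyarrow, before pushing to S3):

```python
import csv
import io
import json

# Toy CSV -> JSON conversion step, the kind of format hop a report-migration
# script performs before uploading files to S3. Data is illustrative only.
CSV_DATA = "id,region,revenue\n1,south,120\n2,north,80\n"

def csv_to_json(csv_text):
    """Parse CSV text and serialize the rows as a JSON array of objects."""
    # DictReader yields one dict per row, keyed by the header line;
    # note that all values come back as strings.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)

print(csv_to_json(CSV_DATA))
```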


