
50+ Spark Jobs in India

Apply to 50+ Spark Jobs on CutShort.io. Find your next job, effortlessly. Browse Spark Jobs and apply today!

Tecblic Private Limited
Ahmedabad
7 - 8 yrs
₹8L - ₹18L / yr
Windows Azure
Data engineering
Python
SQL
Data modeling
+4 more

Job Description: Data Engineer

Location: Ahmedabad

Experience: 7+ years

Employment Type: Full-Time



We are looking for a highly motivated and experienced Data Engineer to join our team. As a Data Engineer, you will play a critical role in designing, building, and optimizing data pipelines that ensure the availability, reliability, and performance of our data infrastructure. You will collaborate closely with data scientists, analysts, and cross-functional teams to provide timely and efficient data solutions.


Responsibilities


● Design and optimize data pipelines for various data sources


● Design and implement efficient data storage and retrieval mechanisms


● Develop data modelling solutions and data validation mechanisms


● Troubleshoot data-related issues and recommend process improvements


● Collaborate with data scientists and stakeholders to provide data-driven insights and solutions


● Coach and mentor junior data engineers in the team


Skills Required: 


● Minimum 5 years of experience in data engineering or related field


● Proficient in designing and optimizing data pipelines and data modeling


● Strong programming expertise in Python


● Hands-on experience with big data technologies such as Hadoop, Spark, and Hive


● Extensive experience with cloud data services such as AWS, Azure, and GCP


● Advanced knowledge of database technologies like SQL, NoSQL, and data warehousing


● Knowledge of distributed computing and storage systems


● Familiarity with DevOps practices, Power Automate, and Microsoft Fabric will be an added advantage


● Strong analytical and problem-solving skills with outstanding communication and collaboration abilities


Qualifications


● Bachelor's degree in Computer Science, Data Science, or a computer-related field


Thinkgrid Labs
Posted by Eman Khan
Remote only
2 - 10 yrs
₹10L - ₹18L / yr
Microsoft Fabric
Fabric Mirroring
Python
Scala
Spark
+4 more

Job Description


Who are you?

  • SQL & CDC Pro: Strong SQL Server/T-SQL; hands-on CDC or replication patterns for initial snapshot + incremental syncs, including delete handling.
  • Fabric Mirroring Practitioner: You’ve set up and tuned Fabric Mirroring to land data into OneLake/Lakehouse; comfortable with OneLake shortcuts and workspace/domain organisation.
  • Schema-Drift Aware: You detect, evolve, and communicate schema changes safely (contracts, tests, alerts), minimising downstream breakage.
  • High-Volume Ingestion Mindset: You design for throughput, resiliency, and backpressure—retries, idempotency, partitioning, and efficient file sizing.
  • Python/Scala/Spark Capable: You can build notebooks/ingestion frameworks for advanced scenarios and data quality checks.
  • Operationally Excellent: You add observability (logging/metrics/alerts), document runbooks, and partner well with platform, security, and analytics teams.
  • Data Security Conscious: You respect PII/PHI, apply least privilege, and align with RLS/CLS patterns and governance guardrails.


What you will be doing?

  • Stand Up Mirroring: Configure Fabric Mirroring from SQL Server (and other relational sources) into OneLake; tune schedules, snapshots, retention, and throughput.
  • Land to Bronze Cleanly: Define Lakehouse folder structures, naming/tagging conventions, and partitioning for fast, organised Bronze ingestion.
  • Handle Change at Scale: Implement CDC—including soft/hard deletes, late-arriving data, and backfills—using reliable watermarking and reconciliation checks (see the sketch after this list).
  • Design Resilient Pipelines: Build ingestion with Fabric Data Factory and/or notebooks; add retries, dead-lettering, and circuit-breaker patterns for fault tolerance.
  • Manage Schema Drift: Automate drift detection and schema evolution; publish change notes and guardrails so downstream consumers aren’t surprised.
  • Performance & Cost Tuning: Optimise batch sizes, file counts, partitions, parallelism, and capacity usage to balance speed, reliability, and spend.
  • Observability & Quality: Instrument lineage, logs, metrics, and DQ tests (nulls, ranges, uniqueness); set up alerts and simple SLOs for ingestion health.
  • Collaboration & Documentation: Partner with the Fabric Platform Architect on domains, security, and workspaces; document pipelines, SLAs, and runbooks.
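
For the CDC and watermarking work described in the "Handle Change at Scale" bullet above, a minimal PySpark sketch of one incremental sync step might look like the following. The table names, columns, and delete convention are illustrative assumptions only; in Fabric the Bronze table would live in a Lakehouse and the mechanics would be adapted to the mirrored source.

```python
# Minimal sketch of watermark-based incremental CDC ingestion into a Bronze table.
# Assumptions (not from the posting): a Delta table bronze.customers, a change feed
# cdc.customers_changes with columns id, op ('I'/'U'/'D'), change_ts, plus payload
# columns; delta-spark is available on the cluster.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# 1. Read only changes newer than the last watermark already landed in Bronze.
last_wm = spark.sql("SELECT MAX(change_ts) AS wm FROM bronze.customers").first()["wm"]
changes = spark.table("cdc.customers_changes")
if last_wm is not None:
    changes = changes.where(F.col("change_ts") > F.lit(last_wm))

# 2. Keep only the latest change per key, which also absorbs late-arriving duplicates.
latest = (changes
          .withColumn("rn", F.row_number().over(
              Window.partitionBy("id").orderBy(F.col("change_ts").desc())))
          .where("rn = 1")
          .drop("rn"))

# 3. Merge into Bronze: hard deletes on 'D', upsert for inserts and updates.
bronze = DeltaTable.forName(spark, "bronze.customers")
(bronze.alias("t")
 .merge(latest.alias("s"), "t.id = s.id")
 .whenMatchedDelete(condition="s.op = 'D'")
 .whenMatchedUpdateAll(condition="s.op <> 'D'")
 .whenNotMatchedInsertAll(condition="s.op <> 'D'")
 .execute())
```

A reconciliation check would typically follow the merge, for example comparing source and Bronze row counts per key range before advancing the watermark.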


Must-have skills

  • SQL Server, T-SQL; CDC/replication fundamentals
  • Microsoft Fabric Mirroring; OneLake/Lakehouse; OneLake shortcuts
  • Schema drift detection/management and data contracts
  • Familiarity with large, complex relational databases
  • Python/Scala/Spark for ingestion and validation
  • Git-based workflow; basic CI/CD (Fabric deployment pipelines or Azure DevOps)

Benefits

  • 5 day work week
  • 100% remote setup with flexible work culture and international exposure
  • Opportunity to work on mission-critical healthcare projects impacting providers and patients globally
Tata Consultancy Services
Chennai, Hyderabad, Kolkata, Delhi, Pune, Bengaluru (Bangalore)
4 - 10 yrs
₹6L - ₹30L / yr
Scala
PySpark
Spark
Amazon Web Services (AWS)

Job Title: PySpark/Scala Developer

 

Functional Skills: Experience in Credit Risk/Regulatory risk domain

Technical Skills: Spark, PySpark, Python, Hive, Scala, MapReduce, Unix shell scripting

Good to Have Skills: Exposure to Machine Learning Techniques

Job Description:

5+ years of experience in developing, fine-tuning, and implementing programs/applications using Python/PySpark/Scala on the Big Data/Hadoop platform.

Roles and Responsibilities:

a)     Work with a leading bank's Risk Management team on specific projects/requirements pertaining to risk models in consumer and wholesale banking

b)     Enhance machine learning models using PySpark or Scala

c)     Work with data scientists to build ML models based on business requirements and follow the ML lifecycle to deploy them all the way to the production environment

d)     Participate in feature engineering, model training, scoring, and retraining

e)     Architect data pipelines and automate data ingestion and model jobs

 

Skills and competencies:

Required:

  • Strong analytical skills in conducting sophisticated statistical analysis using bureau/vendor data, customer performance data, and macro-economic data to solve business problems.
  • Working experience in PySpark and Scala to develop, validate, and implement models and code in Credit Risk/Banking.
  • Experience with distributed systems such as Hadoop/MapReduce, Spark, streaming data processing, and cloud architecture.
  • Familiarity with machine learning frameworks and libraries (such as scikit-learn, SparkML, TensorFlow, PyTorch).
  • Experience in systems integration, web services, and batch processing.
  • Experience in migrating code to PySpark/Scala is a big plus.
  • The ability to act as a liaison, conveying the information needs of the business to IT and data constraints to the business, with equal fluency in business strategy and IT strategy, business processes, and workflow.
  • Flexibility in approach and thought process.
  • Willingness to learn and comprehend periodic changes in regulatory requirements as per the FED.

 

 

Tata Consultancy Services
Bengaluru (Bangalore), Hyderabad, Pune, Delhi, Kolkata, Chennai
5 - 8 yrs
₹7L - ₹30L / yr
Scala
Python
PySpark
Apache Hive
Spark
+3 more

Skills and competencies:

Required:

  • Strong analytical skills in conducting sophisticated statistical analysis using bureau/vendor data, customer performance data, and macro-economic data to solve business problems.
  • Working experience in PySpark and Scala to develop, validate, and implement models and code in Credit Risk/Banking.
  • Experience with distributed systems such as Hadoop/MapReduce, Spark, streaming data processing, and cloud architecture.
  • Familiarity with machine learning frameworks and libraries (such as scikit-learn, SparkML, TensorFlow, PyTorch).
  • Experience in systems integration, web services, and batch processing.
  • Experience in migrating code to PySpark/Scala is a big plus.
  • The ability to act as a liaison, conveying the information needs of the business to IT and data constraints to the business, with equal fluency in business strategy and IT strategy, business processes, and workflow.
  • Flexibility in approach and thought process.
  • Willingness to learn and comprehend periodic changes in regulatory requirements as per the FED.

NeoGenCode Technologies Pvt Ltd
Posted by Shivank Bhardwaj
Bengaluru (Bangalore)
4 - 8 yrs
₹5L - ₹20L / yr
Object Oriented Programming (OOPs)
Java
Spark
Microservices
CI/CD
+6 more

Job Description

We are seeking a highly skilled and experienced Backend Engineer to join our dynamic and fast-paced development team in Bangalore. The ideal candidate will have expertise in Java development, particularly in Java 8 or above, and extensive hands-on experience with Apache Spark, Spark Streaming, and Spring Boot for developing scalable and high-performance microservices. The candidate must also have strong problem-solving skills, a deep understanding of distributed computing, and experience with cloud technologies (Azure).


Key Responsibilities

  • Design, develop, and maintain highly scalable microservices and optimized RESTful APIs using Spring Boot in Java 8 or above.
  • Write efficient and maintainable Spark and Spark Streaming code for processing large-scale data in real-time.
  • Implement Java 8 advanced features such as Functional Interfaces, Lambda Expressions, Streams, Parallel Streams, Completable Futures, and Concurrency API improvements.
  • Work with relational (SQL) and non-relational (Cosmos DB) databases for data modeling and optimization.
  • Utilize Maven for building and deploying artifacts to the snapshot repository.
  • Collaborate with cross-functional teams, including Product, Business, Automation, and other stakeholders, to define, design, and deliver new features.
  • Follow Agile SCRUM methodologies for software development and actively participate in sprint planning and retrospective meetings.
  • Maintain version control using Git and ensure best practices for code collaboration and peer code reviews.
  • Implement CI/CD pipelines using tools such as Jenkins and GitHub Actions to automate build and deployment processes.
  • Work with Azure Cloud Technologies to build and deploy cloud-based applications.
  • Apply software design patterns and best practices in backend development to enhance system architecture and scalability.
  • Troubleshoot and debug applications, ensuring high performance, security, and scalability.
  • Keep up to date with the latest industry trends, tools, and technologies to continuously improve development processes.


Minimum Qualifications

  • BS/MS in Computer Science or equivalent.
  • 4+ years of industry experience in developing highly scalable microservices and optimized RESTful APIs using Spring Boot in Java 8 or above.
  • 3+ years of experience in version control tools like Git.
  • 3+ years of experience working in an Agile SCRUM environment.
  • Strong understanding of software design patterns and distributed computing concepts.
  • Solid experience in relational and non-relational databases (SQL and Cosmos DB).
  • Experience with Maven for building and managing dependencies.
  • Knowledge of CI/CD workflows and experience with Jenkins and GitHub Actions.
  • Prior enterprise experience in working with Azure Cloud Technologies.
  • Proven ability to work collaboratively with cross-functional teams to deliver high-quality product features.
  • Strong problem-solving skills, debugging techniques, and ability to troubleshoot complex issues efficiently.


Preferred Qualifications

  • Experience with Kafka or other messaging queues for real-time data processing.
  • Exposure to Docker, Kubernetes, and container orchestration tools.
  • Hands-on experience with NoSQL databases like MongoDB, Cassandra, or DynamoDB.
  • Experience with performance optimization techniques for backend applications.
  • Knowledge of test-driven development (TDD) and unit testing frameworks like JUnit.


CoffeeBeans
Posted by Nikita Sinha
Bengaluru (Bangalore)
5 - 8 yrs
Up to ₹26L / yr (varies)
Spark
Scala
SQL
NOSQL Databases
Windows Azure
+2 more

Roles and responsibilities-


- Tech lead in one of the feature teams; the candidate needs to work along with the team lead in handling the team without much guidance

- Good communication and leadership skills

- Nurture and build next level talent within the team

- Work in collaboration with other vendors and client development team(s)

- Flexible to learn new tech areas

- Lead complete lifecycle of feature - from feature inception to solution, story grooming, delivery, and support features in production

- Ensure and build the controls and processes for continuous delivery of applications, considering all stages of the process and its automations

- Interact with teammates from across the business and comfortable explaining technical concepts to nontechnical audiences

- Create robust, scalable, flexible, and relevant solutions that help transform product and businesses


Must haves: 

- Spark

- Scala

- Postgres (or any SQL DB)

- Elasticsearch (or any NoSQL DB)

- Azure (if not, any other cloud experience)

- Big data processing


Good to have:

- Golang

- Databricks

- Kubernetes

Aceis Services
Posted by Anushi Mishra
Remote only
2 - 10 yrs
₹8.6L - ₹30.2L / yr
CI/CD
Apache Spark
PySpark
MLOps
Machine Learning (ML)
+6 more

We are hiring freelancers to work on advanced Data & AI projects using Databricks. If you are passionate about cloud platforms, machine learning, data engineering, or architecture, and want to work with cutting-edge tools on real-world challenges, this is the opportunity for you!

Key Details

  • Work Type: Freelance / Contract
  • Location: Remote
  • Time Zones: IST / EST only
  • Domain: Data & AI, Cloud, Big Data, Machine Learning
  • Collaboration: Work with industry leaders on innovative projects

🔹 Open Roles

1. Databricks – Senior Consultant

  • Skills: Data Warehousing, Python, Java, Scala, ETL, SQL, AWS, GCP, Azure
  • Experience: 6+ years

2. Databricks – ML Engineer

  • Skills: CI/CD, MLOps, Machine Learning, Spark, Hadoop
  • Experience: 4+ years

3. Databricks – Solution Architect

  • Skills: Azure, GCP, AWS, CI/CD, MLOps
  • Experience: 7+ years

4. Databricks – Solution Consultant

  • Skills: SQL, Spark, BigQuery, Python, Scala
  • Experience: 2+ years

What We Offer

  • Opportunity to work with top-tier professionals and clients
  • Exposure to cutting-edge technologies and real-world data challenges
  • Flexible remote work environment aligned with IST / EST time zones
  • Competitive compensation and growth opportunities

📌 Skills We Value

Cloud Computing | Data Warehousing | Python | Java | Scala | ETL | SQL | AWS | GCP | Azure | CI/CD | MLOps | Machine Learning | Spark

CoffeeBeans
Posted by Nikita Sinha
Bengaluru (Bangalore), Pune
5 - 7 yrs
Up to ₹22L / yr (varies)
Python
SQL
ETL
Data modeling
Spark
+6 more

Role Overview

We're looking for experienced Data Engineers who can independently design, build, and manage scalable data platforms. You'll work directly with clients and internal teams to develop robust data pipelines that support analytics, AI/ML, and operational systems.

You’ll also play a mentorship role and help establish strong engineering practices across our data projects.

Key Responsibilities

  • Design and develop large-scale, distributed data pipelines (batch and streaming)
  • Implement scalable data models, warehouses/lakehouses, and data lakes
  • Translate business requirements into technical data solutions
  • Optimize data pipelines for performance and reliability
  • Ensure code is clean, modular, tested, and documented
  • Contribute to architecture, tooling decisions, and platform setup
  • Review code/design and mentor junior engineers

Must-Have Skills

  • Strong programming skills in Python and advanced SQL
  • Solid grasp of ETL/ELT, data modeling (OLTP & OLAP), and stream processing
  • Hands-on experience with frameworks like Apache Spark, Flink, etc.
  • Experience with orchestration tools like Airflow
  • Familiarity with CI/CD pipelines and Git
  • Ability to debug and scale data pipelines in production

Preferred Skills

  • Experience with cloud platforms (AWS preferred, GCP or Azure also fine)
  • Exposure to Databricks, dbt, or similar tools
  • Understanding of data governance, quality frameworks, and observability
  • Certifications (e.g., AWS Data Analytics, Solutions Architect, Databricks) are a bonus

What We’re Looking For

  • Problem-solver with strong analytical skills and attention to detail
  • Fast learner who can adapt across tools, tech stacks, and domains
  • Comfortable working in fast-paced, client-facing environments
  • Willingness to travel within India when required
Tecblic Private Limited
Ahmedabad
8 - 12 yrs
₹6L - ₹28L / yr
Data Structures
Data Visualization
Databricks
Azure Data Factory
Spark
+6 more

Data Architecture and Engineering Lead


Responsibilities:

  • Lead Data Architecture: Own the design, evolution, and delivery of enterprise data architecture across cloud and hybrid environments. Develop relational and analytical data models (conceptual, logical, and physical) to support business needs and ensure data integrity.
  • Consolidate Core Systems: Unify data sources across airport systems into a single analytical platform optimised for business value.
  • Build Scalable Infrastructure: Architect cloud-native solutions that support both batch and streaming data workflows using tools like Databricks, Kafka, etc.
  • Implement Governance Frameworks: Define and enforce enterprise-wide data standards for access control, privacy, quality, security, and lineage.
  • Enable Metadata & Cataloguing: Deploy metadata management and cataloguing tools to enhance data discoverability and self-service analytics.
  • Operationalise AI/ML Pipelines: Lead data architecture that supports AI/ML initiatives, including forecasting, pricing models, and personalisation.
  • Partner Across Functions: Translate business needs into data architecture solutions by collaborating with leaders in Operations, Finance, HR, Legal, Technology.
  • Optimize Cloud Cost & Performance: Roll out compute and storage systems that balance cost efficiency, performance, and observability across platforms.


Qualifications:

  • 12+ years of experience in data architecture, with 3+ years in a senior or leadership role across cloud or hybrid environments
  • Proven ability to design and scale large data platforms supporting analytics, real-time reporting, and AI/ML use cases
  • Hands-on expertise with ingestion, transformation, and orchestration pipelines
  • Extensive experience with Microsoft Azure data services, including Azure Data Lake Storage, Azure Databricks, Azure Data Factory and related technologies.
  • Strong knowledge of ERP data models, especially SAP and MS Dynamics
  • Experience with data governance, compliance (GDPR/CCPA), metadata cataloguing, and security practices
  • Familiarity with distributed systems and streaming frameworks like Spark or Flink
  • Strong stakeholder management and communication skills, with the ability to influence both technical and business teams


Tools & Technologies


  • Warehousing: Azure Databricks Delta, BigQuery
  • Big Data: Apache Spark
  • Cloud Platforms: Azure (ADLS, AKS, EventHub, ServiceBus)
  • Streaming: Kafka, Pub/Sub
  • RDBMS: PostgreSQL, MS SQL
  • NoSQL: Redis
  • Monitoring: Azure Monitoring, App Insight, Prometheus, Grafana

 

 

 


Read more
It is a global technology consultancy

It is a global technology consultancy

Agency job
via Scaling Theory by DivyaSri Rajendran
Bengaluru (Bangalore)
4.5 - 10 yrs
₹15L - ₹30L / yr
Spark
skill iconScala
Hadoop
Amazon Web Services (AWS)

Role overview:

  • Must have about 5-11 years of overall experience and at least 3 years of relevant experience with Big Data.
  • Must have experience in building highly scalable business applications, which involve implementing large complex business flows and dealing with huge amounts of data.
  • Must have experience in Hadoop, Hive, and Spark with Scala, with good experience in performance tuning and debugging issues.
  • Good to have experience with stream processing using Spark/Java with Kafka.
  • Must have experience in design and development of Big Data projects.
  • Good knowledge of functional programming and OOP concepts, SOLID principles, and design patterns for developing scalable applications.
  • Familiarity with build tools like Maven.
  • Must have experience with an RDBMS and at least one SQL database, preferably PostgreSQL.
  • Must have experience writing unit and integration tests using ScalaTest.
  • Must have experience using a version control system - Git.
  • Must have experience with CI/CD pipelines - Jenkins is a plus.
  • Basic hands-on experience with at least one cloud provider (AWS/Azure) is a plus.
  • Databricks Spark certification is a plus.


What would you do here:

As a Software Development Engineer 2, you will be responsible for expanding and optimising our data and data pipeline architecture, as well as optimising data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline designer and data wrangler who enjoys optimising data systems and building them from the ground up. The Data Engineer will lead our software developers on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products. The right candidate will be excited by the prospect of optimising or even re-designing our company’s data architecture to support our next generation of products and data initiatives.

 

Responsibilities:

 

• Create and maintain optimal data pipeline architecture.

• Assemble large, complex data sets that meet functional and non-functional business requirements.

• Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, and coordinating the re-design of infrastructure for greater scalability.

• Work with stakeholders, including the Executive, Product, Data, and Design teams, to assist with data-related technical issues and support their data infrastructure needs.

• Keep our data separated and secure.

• Work with data and analytics experts to strive for greater functionality in our data systems.

• Support PROD systems.


empowers digital transformation for innovative and high grow

Agency job
via Hirebound by Jebin Joy
Pune
4 - 12 yrs
₹12L - ₹30L / yr
Hadoop
Spark
Apache Kafka
ETL
Java
+2 more

To be successful in this role, you should possess

• Collaborate closely with Product Management and Engineering leadership to devise and build the right solution.

• Participate in design discussions and brainstorming sessions to select, integrate, and maintain Big Data tools and frameworks required to solve Big Data problems at scale.

• Design and implement systems to cleanse, process, and analyze large data sets using distributed processing tools like Akka and Spark.

• Understand and critically review existing data pipelines, and come up with ideas, in collaboration with Technical Leaders and Architects, to improve upon current bottlenecks.

• Take initiative, show the drive to pick up new things proactively, and work as a senior individual contributor on the multiple products and features we have.

• 3+ years of experience in developing highly scalable Big Data pipelines.

• In-depth understanding of the Big Data ecosystem, including processing frameworks like Spark, Akka, Storm, and Hadoop, and the file types they deal with.

• Experience with ETL and data pipeline tools like Apache NiFi, Airflow, etc.

• Excellent coding skills in Java or Scala, including the understanding to apply appropriate design patterns when required.

• Experience with Git and build tools like Gradle/Maven/SBT.

• Strong understanding of object-oriented design, data structures, algorithms, profiling, and optimization.

• Elegant, readable, maintainable, and extensible code style.


You are someone who would easily be able to

• Work closely with the US and India engineering teams to help build the Java/Scala-based data pipelines.

• Lead the India engineering team in technical excellence and ownership of critical modules; own the development of new modules and features.

• Troubleshoot live production server issues.

• Handle client coordination, work as part of a team, contribute independently, and drive the team to exceptional contributions with minimal supervision.

• Follow Agile methodology and use JIRA for work planning, issue management, and tracking.


Additional Project/Soft Skills:

• Should be able to work independently with India & US based team members.

• Strong verbal and written communication with ability to articulate problems and solutions over phone and emails.

• Strong sense of urgency, with a passion for accuracy and timeliness.

• Ability to work calmly in high pressure situations and manage multiple projects/tasks.

• Ability to work independently and possess superior skills in issue resolution.

• Should have the passion to learn and implement, analyze and troubleshoot issues

Pluginlive
Posted by Harsha Saggi
Chennai, Mumbai
4 - 6 yrs
₹10L - ₹20L / yr
Python
SQL
NOSQL Databases
Data architecture
Data modeling
+7 more

Role Overview:

We are seeking a talented and experienced Data Architect with strong data visualization capabilities to join our dynamic team in Mumbai. As a Data Architect, you will be responsible for designing, building, and managing our data infrastructure, ensuring its reliability, scalability, and performance. You will also play a crucial role in transforming complex data into insightful visualizations that drive business decisions. This role requires a deep understanding of data modeling, database technologies (particularly Oracle Cloud), data warehousing principles, and proficiency in data manipulation and visualization tools, including Python and SQL.


Responsibilities:

  • Design and implement robust and scalable data architectures, including data warehouses, data lakes, and operational data stores, primarily leveraging Oracle Cloud services.
  • Develop and maintain data models (conceptual, logical, and physical) that align with business requirements and ensure data integrity and consistency.
  • Define data governance policies and procedures to ensure data quality, security, and compliance.
  • Collaborate with data engineers to build and optimize ETL/ELT pipelines for efficient data ingestion, transformation, and loading.
  • Develop and execute data migration strategies to Oracle Cloud.
  • Utilize strong SQL skills to query, manipulate, and analyze large datasets from various sources.
  • Leverage Python and relevant libraries (e.g., Pandas, NumPy) for data cleaning, transformation, and analysis.
  • Design and develop interactive and insightful data visualizations using tools such as Tableau, Power BI, Matplotlib, Seaborn, or Plotly to communicate data-driven insights to both technical and non-technical stakeholders.
  • Work closely with business analysts and stakeholders to understand their data needs and translate them into effective data models and visualizations.
  • Ensure the performance and reliability of data visualization dashboards and reports.
  • Stay up-to-date with the latest trends and technologies in data architecture, cloud computing (especially Oracle Cloud), and data visualization.
  • Troubleshoot data-related issues and provide timely resolutions.
  • Document data architectures, data flows, and data visualization solutions.
  • Participate in the evaluation and selection of new data technologies and tools.


Qualifications:

  • Bachelor's or Master's degree in Computer Science, Data Science, Information Systems, or a related field.
  • Proven experience (typically 5+ years) as a Data Architect, Data Modeler, or similar role. 

  • Deep understanding of data warehousing concepts, dimensional modeling (e.g., star schema, snowflake schema), and ETL/ELT processes.
  • Extensive experience working with relational databases, particularly Oracle, and proficiency in SQL.
  • Hands-on experience with Oracle Cloud data services (e.g., Autonomous Data Warehouse, Object Storage, Data Integration).
  • Strong programming skills in Python and experience with data manipulation and analysis libraries (e.g., Pandas, NumPy).
  • Demonstrated ability to create compelling and effective data visualizations using industry-standard tools (e.g., Tableau, Power BI, Matplotlib, Seaborn, Plotly).
  • Excellent analytical and problem-solving skills with the ability to interpret complex data and translate it into actionable insights. 
  • Strong communication and presentation skills, with the ability to effectively communicate technical concepts to non-technical audiences. 
  • Experience with data governance and data quality principles.
  • Familiarity with agile development methodologies.
  • Ability to work independently and collaboratively within a team environment.

Application Link- https://forms.gle/km7n2WipJhC2Lj2r5

Sonatype
Posted by Reshika Mendiratta
Hyderabad
6 - 10 yrs
₹15L - ₹33L / yr
ETL
Spark
Apache Kafka
Python
Java
+11 more

The Opportunity

We’re looking for a Senior Data Engineer to join our growing Data Platform team. This role is a hybrid of data engineering and business intelligence, ideal for someone who enjoys solving complex data challenges while also building intuitive and actionable reporting solutions.


You’ll play a key role in designing and scaling the infrastructure and pipelines that power analytics, dashboards, machine learning, and decision-making across Sonatype. You’ll also be responsible for delivering clear, compelling, and insightful business intelligence through tools like Looker Studio and advanced SQL queries.


What You’ll Do

  • Design, build, and maintain scalable data pipelines and ETL/ELT processes.
  • Architect and optimize data models and storage solutions for analytics and operational use.
  • Create and manage business intelligence reports and dashboards using tools like Looker Studio, Power BI, or similar.
  • Collaborate with data scientists, analysts, and stakeholders to ensure datasets are reliable, meaningful, and actionable.
  • Own and evolve parts of our data platform (e.g., Airflow, dbt, Spark, Redshift, or Snowflake).
  • Write complex, high-performance SQL queries to support reporting and analytics needs.
  • Implement observability, alerting, and data quality monitoring for critical pipelines.
  • Drive best practices in data engineering and business intelligence, including documentation, testing, and CI/CD.
  • Contribute to the evolution of our next-generation data lakehouse and BI architecture.


What We’re Looking For


Minimum Qualifications

  • 5+ years of experience as a Data Engineer or in a hybrid data/reporting role.
  • Strong programming skills in Python, Java, or Scala.
  • Proficiency with data tools such as Databricks, data modeling techniques (e.g., star schema, dimensional modeling), and data warehousing solutions like Snowflake or Redshift.
  • Hands-on experience with modern data platforms and orchestration tools (e.g., Spark, Kafka, Airflow).
  • Proficient in SQL with experience in writing and optimizing complex queries for BI and analytics.
  • Experience with BI tools such as Looker Studio, Power BI, or Tableau.
  • Experience in building and maintaining robust ETL/ELT pipelines in production.
  • Understanding of data quality, observability, and governance best practices.


Bonus Points

  • Experience with dbt, Terraform, or Kubernetes.
  • Familiarity with real-time data processing or streaming architectures.
  • Understanding of data privacy, compliance, and security best practices in analytics and reporting.


Why You’ll Love Working Here

  • Data with purpose: Work on problems that directly impact how the world builds secure software.
  • Full-spectrum impact: Use both engineering and analytical skills to shape product, strategy, and operations.
  • Modern tooling: Leverage the best of open-source and cloud-native technologies.
  • Collaborative culture: Join a passionate team that values learning, autonomy, and real-world impact.
Sonatype
Posted by Reshika Mendiratta
Hyderabad
2 - 5 yrs
Up to ₹20L / yr (varies)
Python
ETL
Spark
Apache Kafka
Databricks
+12 more

About the Role

We’re hiring a Data Engineer to join our Data Platform team. You’ll help build and scale the systems that power analytics, reporting, and data-driven features across the company. This role works with engineers, analysts, and product teams to make sure our data is accurate, available, and usable.


What You’ll Do

  • Build and maintain reliable data pipelines and ETL/ELT workflows.
  • Develop and optimize data models for analytics and internal tools.
  • Work with team members to deliver clean, trusted datasets.
  • Support core data platform tools like Airflow, dbt, Spark, Redshift, or Snowflake.
  • Monitor data pipelines for quality, performance, and reliability.
  • Write clear documentation and contribute to test coverage and CI/CD processes.
  • Help shape our data lakehouse architecture and platform roadmap.


What You Need

  • 2–4 years of experience in data engineering or a backend data-related role.
  • Strong skills in Python or another backend programming language.
  • Experience working with SQL and distributed data systems (e.g., Spark, Kafka).
  • Familiarity with NoSQL stores like HBase or similar.
  • Comfortable writing efficient queries and building data workflows.
  • Understanding of data modeling for analytics and reporting.
  • Exposure to tools like Airflow or other workflow schedulers.


Bonus Points

  • Experience with DBT, Databricks, or real-time data pipelines.
  • Familiarity with cloud infrastructure tools like Terraform or Kubernetes.
  • Interest in data governance, ML pipelines, or compliance standards.


Why Join Us?

  • Work on data that supports meaningful software security outcomes.
  • Use modern tools in a cloud-first, open-source-friendly environment.
  • Join a team that values clarity, learning, and autonomy.


If you're excited about building impactful software and helping others do the same, this is an opportunity to grow as a technical leader and make a meaningful impact.

NeoGenCode Technologies Pvt Ltd
Bengaluru (Bangalore)
8 - 12 yrs
₹15L - ₹22L / yr
Data engineering
Google Cloud Platform (GCP)
Data Transformation Tool (DBT)
Google Dataform
BigQuery
+6 more

Job Title: Data Engineer – GCP + Spark + DBT

Location: Bengaluru (On-site at Client Location | 3 Days WFO)

Experience: 8 to 12 Years

Level: Associate Architect

Type: Full-time


Job Overview :

We are looking for a seasoned Data Engineer to join the Data Platform Engineering team supporting a Unified Data Platform (UDP). This role requires hands-on expertise in DBT, GCP, BigQuery, and PySpark, with a solid foundation in CI/CD, data pipeline optimization, and agile delivery.


Mandatory Skills : GCP, DBT, Google Dataform, BigQuery, PySpark/Spark SQL, Advanced SQL, CI/CD, Git, Agile Methodologies.


Key Responsibilities :

  • Design, build, and optimize scalable data pipelines using BigQuery, DBT, and PySpark.
  • Leverage GCP-native services like Cloud Storage, Pub/Sub, Dataproc, Cloud Functions, and Composer for ETL/ELT workflows.
  • Implement and maintain CI/CD for data engineering projects with Git-based version control.
  • Collaborate with cross-functional teams including Infra, Security, and DataOps for reliable, secure, and high-quality data delivery.
  • Lead code reviews, mentor junior engineers, and enforce best practices in data engineering.
  • Participate in Agile sprints, backlog grooming, and Jira-based project tracking.

Must-Have Skills :

  • Strong experience with DBT, Google Dataform, and BigQuery
  • Hands-on expertise with PySpark/Spark SQL
  • Proficient in GCP for data engineering workflows
  • Solid knowledge of SQL optimization, Git, and CI/CD pipelines
  • Agile team experience and strong problem-solving abilities

Nice-to-Have Skills :

  • Familiarity with Databricks, Delta Lake, or Kafka
  • Exposure to data observability and quality frameworks (e.g., Great Expectations, Soda)
  • Knowledge of MDM patterns, Terraform, or IaC is a plus
VyTCDC
Posted by Gobinath Sundaram
Bengaluru (Bangalore)
5 - 8 yrs
₹4L - ₹25L / yr
Data engineering
Python
Spark

🛠️ Key Responsibilities

  • Design, build, and maintain scalable data pipelines using Python and Apache Spark (PySpark or Scala APIs)
  • Develop and optimize ETL processes for batch and real-time data ingestion
  • Collaborate with data scientists, analysts, and DevOps teams to support data-driven solutions
  • Ensure data quality, integrity, and governance across all stages of the data lifecycle
  • Implement data validation, monitoring, and alerting mechanisms for production pipelines (see the sketch after this list)
  • Work with cloud platforms (AWS, GCP, or Azure) and tools like Airflow, Kafka, and Delta Lake
  • Participate in code reviews, performance tuning, and documentation
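
As a concrete illustration of the pipeline and validation responsibilities above, a minimal PySpark batch job with a simple quality gate could look like the sketch below. Paths, column names, and the 1% bad-row threshold are illustrative assumptions, not requirements from the posting.

```python
# Minimal sketch of a batch ETL step with a simple validation gate before loading.
# All paths, columns, and thresholds are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw JSON events.
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Transform: normalise types and derive a partition column.
orders = (raw
          .withColumn("order_ts", F.to_timestamp("order_ts"))
          .withColumn("order_date", F.to_date("order_ts"))
          .withColumn("amount", F.col("amount").cast("double")))

# Validate: fail fast if the batch is empty or too many key fields are missing.
total = orders.count()
bad_keys = orders.filter(F.col("order_id").isNull() | F.col("amount").isNull()).count()
if total == 0 or bad_keys / total > 0.01:          # tolerate at most 1% bad rows
    raise ValueError(f"Validation failed: {bad_keys}/{total} rows with null keys")

# Load: write partitioned Parquet for downstream consumers.
(orders.write
 .mode("overwrite")
 .partitionBy("order_date")
 .parquet("s3://example-bucket/curated/orders/"))
```

In a production pipeline the validation failure would typically raise an alert and stop downstream tasks rather than silently writing partial data.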


🎓 Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field
  • 3–6 years of experience in data engineering with a focus on Python and Spark
  • Experience with distributed computing and handling large-scale datasets (10TB+)
  • Familiarity with data security, PII handling, and compliance standards is a plus


KJBN labs
Posted by sakthi ganesh
Bengaluru (Bangalore)
4 - 7 yrs
₹10L - ₹30L / yr
Hadoop
Apache Kafka
Spark
Python
Java
+8 more

Senior Data Engineer Job Description

Overview

The Senior Data Engineer will design, develop, and maintain scalable data pipelines and infrastructure to support data-driven decision-making and advanced analytics. This role requires deep expertise in data engineering, strong problem-solving skills, and the ability to collaborate with cross-functional teams to deliver robust data solutions.

Key Responsibilities


  • Data Pipeline Development: Design, build, and optimize scalable, secure, and reliable data pipelines to ingest, process, and transform large volumes of structured and unstructured data.
  • Data Architecture: Architect and maintain data storage solutions, including data lakes, data warehouses, and databases, ensuring performance, scalability, and cost-efficiency.
  • Data Integration: Integrate data from diverse sources, including APIs, third-party systems, and streaming platforms, ensuring data quality and consistency.
  • Performance Optimization: Monitor and optimize data systems for performance, scalability, and cost, implementing best practices for partitioning, indexing, and caching (see the sketch after this list).
  • Collaboration: Work closely with data scientists, analysts, and software engineers to understand data needs and deliver solutions that enable advanced analytics, machine learning, and reporting.
  • Data Governance: Implement data governance policies, ensuring compliance with data security, privacy regulations (e.g., GDPR, CCPA), and internal standards.
  • Automation: Develop automated processes for data ingestion, transformation, and validation to improve efficiency and reduce manual intervention.
  • Mentorship: Guide and mentor junior data engineers, fostering a culture of technical excellence and continuous learning.
  • Troubleshooting: Diagnose and resolve complex data-related issues, ensuring high availability and reliability of data systems.
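
To make the partitioning and caching point above concrete, here is a small, hedged PySpark sketch; the table name, filter, and partition column are assumptions for the example only.

```python
# Illustrative sketch of two optimisations named above: caching a dataset that is
# reused by several aggregates, and controlling partition layout on write.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = spark.table("analytics.events")  # hypothetical source table

# Cache once, reuse many times: avoids recomputing the scan/filter per aggregate.
recent = events.where(F.col("event_date") >= "2024-01-01").cache()
daily_counts = recent.groupBy("event_date").count()
by_country = recent.groupBy("country").agg(F.countDistinct("user_id").alias("users"))

# Repartition by the partition column before a partitioned write, so each
# Hive-style partition ends up with a small number of reasonably sized files.
(recent.repartition("event_date")
 .write.mode("overwrite")
 .partitionBy("event_date")
 .parquet("/lake/curated/events/"))

daily_counts.show()
by_country.show()
recent.unpersist()
```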

Required Qualifications

  • Education: Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field.
  • Experience: 5+ years of experience in data engineering or a related role, with a proven track record of building scalable data pipelines and infrastructure.
  • Technical Skills:
    • Proficiency in programming languages such as Python, Java, or Scala.
    • Expertise in SQL and experience with NoSQL databases (e.g., MongoDB, Cassandra).
    • Strong experience with cloud platforms (e.g., AWS, Azure, GCP) and their data services (e.g., Redshift, BigQuery, Snowflake).
    • Hands-on experience with ETL/ELT tools (e.g., Apache Airflow, Talend, Informatica) and data integration frameworks.
    • Familiarity with big data technologies (e.g., Hadoop, Spark, Kafka) and distributed systems.
    • Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes) is a plus.
  • Soft Skills:
    • Excellent problem-solving and analytical skills.
    • Strong communication and collaboration abilities.
    • Ability to work in a fast-paced, dynamic environment and manage multiple priorities.
  • Certifications (optional but preferred): Cloud certifications (e.g., AWS Certified Data Analytics, Google Professional Data Engineer) or relevant data engineering certifications.

Preferred Qualifications

  • Experience with real-time data processing and streaming architectures.
  • Familiarity with machine learning pipelines and MLOps practices.
  • Knowledge of data visualization tools (e.g., Tableau, Power BI) and their integration with data pipelines.
  • Experience in industries with high data complexity, such as finance, healthcare, or e-commerce.

Work Environment

  • Location: Hybrid/Remote/On-site (depending on company policy).
  • Team: Collaborative, cross-functional team environment with data scientists, analysts, and business stakeholders.
  • Hours: Full-time, with occasional on-call responsibilities for critical data systems.

Intellikart Ventures LLP
Posted by ramandeep intellikart
Mumbai
3 - 5 yrs
₹18L - ₹19L / yr
SQL
Spark
Data modeling
Windows Azure
Data Analytics
+1 more

Location: Mumbai

Job Type: Full-Time (Hybrid – 3 days in office, 2 days WFH)


Job Overview:

We are looking for a skilled Azure Data Engineer with strong experience in data modeling, pipeline development, and SQL/Spark expertise. The ideal candidate will work closely with the Data Analytics & BI teams to implement robust data solutions on Azure Synapse and ensure seamless data integration with third-party applications.


Key Responsibilities:

  • Design, develop, and maintain Azure data pipelines using Azure Synapse (SQL dedicated pools or Apache Spark pools).
  • Implement data models in collaboration with the Data Analytics and BI teams.
  • Optimize and manage large-scale SQL and Spark-based data processing solutions.
  • Ensure data availability and reliability for third-party application consumption.
  • Collaborate with cross-functional teams to translate business requirements into scalable data solutions.


Required Skills & Experience:

3–5 years of hands-on experience in:

  • Azure data services
  • Data Modeling
  • SQL development and tuning
  • Apache Spark
  • Strong knowledge of Azure Synapse Analytics.
  • Experience in designing data pipelines and ETL/ELT processes.
  • Ability to troubleshoot and optimize complex data workflows.


Preferred Qualifications:

  • Experience with data governance, security, and data quality practices.
  • Familiarity with DevOps practices in a data engineering context.
  • Effective communication skills and the ability to work in a collaborative team environment.
Pulsedata Labs Pvt Ltd

Agency job
Remote only
5 - 7 yrs
₹20L - ₹30L / yr
Databricks
Spark
PySpark
SQL server
ETL
+2 more

Company name: PulseData labs Pvt Ltd (captive Unit for URUS, USA)


About URUS

We are the URUS family (US), a global leader in products and services for Agritech.


SENIOR DATA ENGINEER

This role is responsible for the design, development, and maintenance of data integration and reporting solutions. The ideal candidate will possess expertise in Databricks and strong skills in SQL Server, SSIS and SSRS, and experience with other modern data engineering tools such as Azure Data Factory. This position requires a proactive and results-oriented individual with a passion for data and a strong understanding of data warehousing principles.


Responsibilities

Data Integration

  • Design, develop, and maintain robust and efficient ETL pipelines and processes on Databricks.
  • Troubleshoot and resolve Databricks pipeline errors and performance issues.
  • Maintain legacy SSIS packages for ETL processes.
  • Troubleshoot and resolve SSIS package errors and performance issues.
  • Optimize data flow performance and minimize data latency.
  • Implement data quality checks and validations within ETL processes.

Databricks Development

  • Develop and maintain Databricks pipelines and datasets using Python, Spark and SQL.
  • Migrate legacy SSIS packages to Databricks pipelines.
  • Optimize Databricks jobs for performance and cost-effectiveness.
  • Integrate Databricks with other data sources and systems.
  • Participate in the design and implementation of data lake architectures.

Data Warehousing

  • Participate in the design and implementation of data warehousing solutions.
  • Support data quality initiatives and implement data cleansing procedures.

Reporting and Analytics

  • Collaborate with business users to understand data requirements for department driven reporting needs.
  • Maintain existing library of complex SSRS reports, dashboards, and visualizations.
  • Troubleshoot and resolve SSRS report issues, including performance bottlenecks and data inconsistencies.

Collaboration and Communication

  • Comfortable in entrepreneurial, self-starting, and fast-paced environment, working both independently and with our highly skilled teams.
  • Collaborate effectively with business users, data analysts, and other IT teams.
  • Communicate technical information clearly and concisely, both verbally and in writing.
  • Document all development work and procedures thoroughly.

Continuous Growth

  • Keep abreast of the latest advancements in data integration, reporting, and data engineering technologies.
  • Continuously improve skills and knowledge through training and self-learning.

This job description reflects management's assignment of essential functions; it does not prescribe or restrict the tasks that may be assigned.


Requirements

  • Bachelor's degree in computer science, Information Systems, or a related field.
  • 7+ years of experience in data integration and reporting.
  • Extensive experience with Databricks, including Python, Spark, and Delta Lake.
  • Strong proficiency in SQL Server, including T-SQL, stored procedures, and functions.
  • Experience with SSIS (SQL Server Integration Services) development and maintenance.
  • Experience with SSRS (SQL Server Reporting Services) report design and development.
  • Experience with data warehousing concepts and best practices.
  • Experience with Microsoft Azure cloud platform and Microsoft Fabric desirable.
  • Strong analytical and problem-solving skills.
  • Excellent communication and interpersonal skills.
  • Ability to work independently and as part of a team.
  • Experience with Agile methodologies.



Valuebound
Posted by Suchandni Verma
Chennai
3 - 7 yrs
₹12L - ₹25L / yr
MLOps
Amazon Web Services (AWS)
Kubernetes
SQL Azure
Data Structures
+1 more

What you’ll do

  • Tame data → pull, clean, and shape structured & unstructured data.
  • Orchestrate pipelines → Airflow / Step Functions / ADF… your call.
  • Ship models → build, tune, and push to prod on SageMaker, Azure ML, or Vertex AI.
  • Scale → Spark / Databricks for the heavy lifting.
  • Automate everything → Docker, Kubernetes, CI/CD, MLflow, Seldon, Kubeflow (see the sketch after this list).
  • Pair up → work with engineers, architects, and business folks to solve real problems, fast.
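
As a small illustration of the MLflow piece of the automation bullet above, a hedged tracking sketch might look like this; the dataset, model, and experiment name are placeholders, not part of the role description.

```python
# Minimal MLflow tracking sketch: log params, a metric, and a model artifact so a
# run can later be promoted through a CI/CD or registry step. Everything named
# here (experiment, dataset, model choice) is a hypothetical example.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("demo-forecast")
with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 6}
    model = RandomForestRegressor(**params).fit(X_train, y_train)

    mlflow.log_params(params)
    mlflow.log_metric("rmse", mean_squared_error(y_test, model.predict(X_test)) ** 0.5)
    mlflow.sklearn.log_model(model, artifact_path="model")  # artifact a CD step can pick up
```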



What you bring

  • 3+ yrs hands-on MLOps (4-5 yrs total software experience).
  • Proven chops on one hyperscaler (AWS, Azure, or GCP).
  • Confidence with Databricks / Spark, Python, SQL, TensorFlow / PyTorch / Scikit-learn.
  • You debug Kubernetes in your sleep and treat Dockerfiles like breathing.
  • You prototype with open-source first, choose the right tool, then make it scale.
  • Sharp mind, low ego, bias for action.



Nice-to-haves

  • SageMaker, Azure ML, or Vertex AI in production.
  • Love for clean code, clear docs, and crisp PRs.



A leader in telecom, fintech, AI-led marketing automation.

Agency job
via Infinium Associate by Toshi Srivastava
Bengaluru (Bangalore)
9 - 15 yrs
₹25L - ₹35L / yr
MERN Stack
Python
MongoDB
Spark
Hadoop
+7 more

We are looking for a talented MERN Developer with expertise in MongoDB/MySQL, Kubernetes, Python, ETL, Hadoop, and Spark. The ideal candidate will design, develop, and optimize scalable applications while ensuring efficient source code management and implementing Non-Functional Requirements (NFRs).


Key Responsibilities:

  • Develop and maintain robust applications using MERN Stack (MongoDB, Express.js, React.js, Node.js).
  • Design efficient database architectures (MongoDB/MySQL) for scalable data handling.
  • Implement and manage Kubernetes-based deployment strategies for containerized applications.
  • Ensure compliance with Non-Functional Requirements (NFRs), including source code management, development tools, and security best practices.
  • Develop and integrate Python-based functionalities for data processing and automation.
  • Work with ETL pipelines for smooth data transformations.
  • Leverage Hadoop and Spark for processing and optimizing large-scale data operations.
  • Collaborate with solution architects, DevOps teams, and data engineers to enhance system performance.
  • Conduct code reviews, troubleshooting, and performance optimization to ensure seamless application functionality.


Required Skills & Qualifications:

  • Proficiency in MERN Stack (MongoDB, Express.js, React.js, Node.js).
  • Strong understanding of database technologies (MongoDB/MySQL).
  • Experience working with Kubernetes for container orchestration.
  • Hands-on knowledge of Non-Functional Requirements (NFRs) in application development.
  • Expertise in Python, ETL pipelines, and big data technologies (Hadoop, Spark).
  • Strong problem-solving and debugging skills.
  • Knowledge of microservices architecture and cloud computing frameworks.

Preferred Qualifications:

  • Certifications in cloud computing, Kubernetes, or database management.
  • Experience in DevOps, CI/CD automation, and infrastructure management.
  • Understanding of security best practices in application development.


Hunarstreet Technologies pvt ltd

Agency job
via Hunarstreet Technologies pvt ltd by Sakshi Patankar
Remote only
10 - 20 yrs
₹15L - ₹30L / yr
Data engineering
databricks
Python
Scala
Spark
+14 more

What You’ll Be Doing:

● Design and build parts of our data pipeline architecture for extraction, transformation, and loading of data from a wide variety of data sources using the latest Big Data technologies.

● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.

● Work with machine learning, data, and analytics experts to drive innovation, accuracy, and greater functionality in our data system.

Qualifications:

● Bachelor's degree in Engineering, Computer Science, or relevant field.

● 10+ years of relevant and recent experience in a Data Engineer role.

● 5+ years of recent experience with Apache Spark and a solid understanding of the fundamentals.

● Deep understanding of Big Data concepts and distributed systems.

● Strong coding skills with Scala, Python, Java and/or other languages and the ability to quickly switch between them with ease.

● Advanced working SQL knowledge and experience working with a variety of relational databases such as Postgres and/or MySQL.

● Cloud Experience with DataBricks

● Experience working with data stored in many formats including Delta Tables, Parquet, CSV and JSON.

● Comfortable working in a linux shell environment and writing scripts as needed.

● Comfortable working in an Agile environment

● Machine Learning knowledge is a plus.

● Must be capable of working independently and delivering stable, efficient and reliable software.

● Excellent written and verbal communication skills in English.

● Experience supporting and working with cross-functional teams in a dynamic environment


EMPLOYMENT TYPE: Full-Time, Permanent

LOCATION: Remote (Pan India)

SHIFT TIMINGS: 2:00 pm - 11:00 pm IST

Data Havn

Agency job
via Infinium Associate by Toshi Srivastava
Noida
5 - 8 yrs
₹25L - ₹40L / yr
Data engineering
Python
SQL
Data Warehouse (DWH)
ETL
+6 more

About the Role:

We are seeking a talented Lead Data Engineer to join our team and play a pivotal role in transforming raw data into valuable insights. As a Data Engineer, you will design, develop, and maintain robust data pipelines and infrastructure to support our organization's analytics and decision-making processes.

Responsibilities:

  • Data Pipeline Development: Build and maintain scalable data pipelines to extract, transform, and load (ETL) data from various sources (e.g., databases, APIs, files) into data warehouses or data lakes.
  • Data Infrastructure: Design, implement, and manage data infrastructure components, including data warehouses, data lakes, and data marts.
  • Data Quality: Ensure data quality by implementing data validation, cleansing, and standardization processes.
  • Team Management: Able to lead and manage a team.
  • Performance Optimization: Optimize data pipelines and infrastructure for performance and efficiency.
  • Collaboration: Collaborate with data analysts, scientists, and business stakeholders to understand their data needs and translate them into technical requirements.
  • Tool and Technology Selection: Evaluate and select appropriate data engineering tools and technologies (e.g., SQL, Python, Spark, Hadoop, cloud platforms).
  • Documentation: Create and maintain clear and comprehensive documentation for data pipelines, infrastructure, and processes.

 

 Skills:

  • Strong proficiency in SQL and at least one programming language (e.g., Python, Java).
  • Experience with data warehousing and data lake technologies (e.g., Snowflake, AWS Redshift, Databricks).
  • Knowledge of cloud platforms (e.g., AWS, GCP, Azure) and cloud-based data services.
  • Understanding of data modeling and data architecture concepts.
  • Experience with ETL/ELT tools and frameworks.
  • Excellent problem-solving and analytical skills.
  • Ability to work independently and as part of a team.

Preferred Qualifications:

  • Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink) (see the sketch after this list).
  • Knowledge of machine learning and artificial intelligence concepts.
  • Experience with data visualization tools (e.g., Tableau, Power BI).
  • Certification in cloud platforms or data engineering.
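
For the real-time processing bullet above, a hedged Spark Structured Streaming sketch reading from Kafka could look like the following; the broker, topic, schema, and console sink are illustrative assumptions, and the job needs the spark-sql-kafka connector available on the cluster.

```python
# Sketch of a streaming job: Kafka source, watermarked windowed aggregate, console sink.
# Broker address, topic, and schema are placeholders for illustration only.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("orders_stream").getOrCreate()

schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

orders = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "orders")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("o"))
          .select("o.*"))

# Watermark bounds the state kept for late-arriving events.
per_minute = (orders
              .withWatermark("event_time", "10 minutes")
              .groupBy(F.window("event_time", "1 minute"))
              .agg(F.sum("amount").alias("revenue")))

query = (per_minute.writeStream
         .outputMode("update")
         .format("console")
         .option("checkpointLocation", "/tmp/checkpoints/orders")
         .start())
query.awaitTermination()
```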


Hiring for MNC

Agency job
via Spes Manning Solution by srushti patil
Remote only
5 - 10 yrs
₹18L - ₹20L / yr
Scala
Akka
Spark
Python

Job Description:


Interviews will be scheduled in two days. 


We are seeking a highly skilled Scala Developer to join our team on an immediate basis. The ideal candidate will work remotely and collaborate with a US-based client, so excellent communication skills are essential.


Key Responsibilities:


Develop scalable and high-performance applications using Scala.


Collaborate with cross-functional teams to understand requirements and deliver quality solutions.


Write clean, maintainable, and testable code.


Optimize application performance and troubleshoot issues.


Participate in code reviews and ensure adherence to best practices.


Required Skills:


Strong experience in Scala development.


Solid understanding of functional programming principles.


Experience with frameworks like Akka, Play, or Spark is a plus.


Good knowledge of REST APIs, microservices architecture, and concurrency.


Familiarity with CI/CD, Git, and Agile methodologies.


Roles & Responsibilities


  • Develop and maintain scalable backend services using Scala.
  • Design and integrate RESTful APIs and microservices.
  • Collaborate with cross-functional teams to deliver technical solutions.
  • Write clean, efficient, and testable code.
  • Participate in code reviews and ensure code quality.
  • Troubleshoot issues and optimize performance.
  • Stay updated on Scala and backend development best practices.



Immediate joiners preferred.

Xebia IT Architects

Posted by Vijay S
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Chennai, Bhopal, Jaipur
10 - 15 yrs
₹30L - ₹40L / yr
Spark
Google Cloud Platform (GCP)
Python
Apache Airflow
PySpark
+1 more

We are looking for a Senior Data Engineer with strong expertise in GCP, Databricks, and Airflow to design and implement a GCP Cloud Native Data Processing Framework. The ideal candidate will work on building scalable data pipelines and help migrate existing workloads to a modern framework.


  • Shift: 2 PM to 11 PM
  • Work Mode: Hybrid (3 days a week) across Xebia locations
  • Notice Period: Immediate joiners or those with a notice period of up to 30 days


Key Responsibilities:

  • Design and implement a GCP Native Data Processing Framework leveraging Spark and GCP Cloud Services.
  • Develop and maintain data pipelines using Databricks and Airflow for transforming Raw → Silver → Gold data layers (see the sketch after this list).
  • Ensure data integrity, consistency, and availability across all systems.
  • Collaborate with data engineers, analysts, and stakeholders to optimize performance.
  • Document standards and best practices for data engineering workflows.
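
A minimal sketch of the Raw → Silver → Gold flow referenced above, assuming PySpark with Delta Lake storage; the layer paths and column names are illustrative assumptions only.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-demo").getOrCreate()

RAW = "/lake/raw/orders"            # landing zone (as-ingested files) - placeholder path
SILVER = "/lake/silver/orders"      # cleaned, conformed records
GOLD = "/lake/gold/orders_daily"    # business-level aggregates

# Raw -> Silver: parse, deduplicate, and enforce basic quality rules.
raw_df = spark.read.json(RAW)
silver_df = (
    raw_df.dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull())
)
silver_df.write.format("delta").mode("overwrite").save(SILVER)

# Silver -> Gold: aggregate to the grain the business consumes.
gold_df = (
    spark.read.format("delta").load(SILVER)
    .groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(F.count("*").alias("orders"), F.sum("amount").alias("revenue"))
)
gold_df.write.format("delta").mode("overwrite").save(GOLD)
```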

Required Experience:


  • 7-8 years of experience in data engineering, architecture, and pipeline development.
  • Strong knowledge of GCP, Databricks, PySpark, and BigQuery.
  • Experience with Orchestration tools like Airflow, Dagster, or GCP equivalents.
  • Understanding of Data Lake table formats (Delta, Iceberg, etc.).
  • Proficiency in Python for scripting and automation.
  • Strong problem-solving skills and collaborative mindset.


⚠️ Please apply only if you have not applied recently or are not currently in the interview process for any open roles at Xebia.


Looking forward to your response!


Best regards,

Vijay S

Assistant Manager - TAG

https://www.linkedin.com/in/vijay-selvarajan/

OnActive
Mansi Gupta
Posted by Mansi Gupta
Gurugram, Pune, Bengaluru (Bangalore), Chennai, Bhopal, Hyderabad, Jaipur
5 - 8 yrs
₹6L - ₹12L / yr
Python
Spark
SQL
AWS CloudFormation
Machine Learning (ML)
+3 more

Level of skills and experience:


5 years of hands-on experience using Python, Spark, and SQL.

Experienced in AWS Cloud usage and management.

Experience with Databricks (Lakehouse, ML, Unity Catalog, MLflow).

Experience using various ML models and frameworks such as XGBoost, Lightgbm, Torch.

Experience with orchestrators such as Airflow and Kubeflow (a minimal Airflow sketch follows this list).

Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes).

Fundamental understanding of Parquet, Delta Lake and other data file formats.

Proficiency in an IaC tool such as Terraform, CDK, or CloudFormation.

Strong written and verbal English communication skills, and proficiency in communicating with non-technical stakeholders.
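
To make the orchestration requirement concrete, here is a minimal Airflow 2.x DAG sketch chaining an extract task into a Spark-based transform; the DAG id, schedule, and callables are illustrative assumptions, not part of the role.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder: pull the day's raw files to a staging location.
    print("extracting raw data")

def transform():
    # Placeholder: submit or run the Spark transformation step.
    print("running spark transform")

with DAG(
    dag_id="daily_pipeline_demo",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task
```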

Remote only
7 - 12 yrs
₹25L - ₹40L / yr
Spark
Java
Apache Kafka
Big Data
Apache Hive
+5 more

Job Title: Big Data Engineer (Java Spark Developer – Java Spark experience is a must)

Location: Chennai, Hyderabad, Pune, Bangalore (Bengaluru) / NCR Delhi

Client: Premium Tier 1 Company

Payroll: Direct Client

Employment Type: Full time / Perm

Experience: 7+ years

 

Job Description:

We are looking for skilled Big Data Engineers with 7+ years of experience using Java Spark on Big Data / legacy platforms who can join immediately. The desired candidate should have experience designing, developing, and optimizing real-time and batch data pipelines in enterprise-scale Big Data environments. You will work on building scalable, high-performance data processing solutions, integrating real-time data streams, and building reliable data platforms. Strong troubleshooting, performance tuning, and collaboration skills are key for this role.

 

Key Responsibilities:

·      Develop data pipelines using Java Spark and Kafka (see the sketch after this list).

·      Optimize and maintain real-time data pipelines and messaging systems.

·      Collaborate with cross-functional teams to deliver scalable data solutions.

·      Troubleshoot and resolve issues in Java Spark and Kafka applications.
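
The role calls for Java Spark; purely as an illustration of the shape of such a pipeline, here is a hedged PySpark Structured Streaming sketch that reads from Kafka and writes a cleaned stream out. The broker, topic, and paths are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("kafka-stream-demo").getOrCreate()

# Read a Kafka topic as a streaming DataFrame (requires the spark-sql-kafka package).
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "orders")                      # placeholder topic
    .load()
)

# Kafka delivers key/value as binary; cast and parse before transforming.
parsed = (
    events.selectExpr("CAST(value AS STRING) AS json_str", "timestamp")
    .withColumn("order_id", F.get_json_object("json_str", "$.order_id"))
    .withColumn("amount", F.get_json_object("json_str", "$.amount").cast("double"))
    .filter(F.col("order_id").isNotNull())
)

# Write the cleaned stream to Parquet with checkpointing for fault tolerance.
query = (
    parsed.writeStream.format("parquet")
    .option("path", "/data/streams/orders/")            # placeholder sink path
    .option("checkpointLocation", "/data/checkpoints/orders/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```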

 

Qualifications:

·      Experience in Java Spark is a must

·      Knowledge and hands-on experience using distributed computing, real-time data streaming, and big data technologies

·      Strong problem-solving and performance optimization skills

·      Looking for immediate joiners

 

If interested, please share your resume along with the following details

1)    Notice Period

2)    Current CTC

3)    Expected CTC

4)    Have experience in Java Spark - Y / N (this is a must)

5)    Any offers in hand

 

Thanks & Regards,

LION & ELEPHANTS CONSULTANCY PVT LTD TEAM

SINGAPORE | INDIA

 

Koantek
Bhoomika Varshney
Posted by Bhoomika Varshney
Remote only
4 - 8 yrs
₹10L - ₹30L / yr
Python
Databricks
SQL
Spark
PySpark
+3 more

The Sr. AWS/Azure/GCP Databricks Data Engineer at Koantek will use comprehensive modern data engineering techniques and methods with Advanced Analytics to support business decisions for our clients. Your goal is to support the use of data-driven insights to help our clients achieve business outcomes and objectives. You can collect, aggregate, and analyze structured/unstructured data from multiple internal and external sources and present patterns, insights, and trends to decision-makers. You will help design and build data pipelines, data streams, reporting tools, information dashboards, data service APIs, data generators, and other end-user information portals and insight tools. You will be a critical part of the data supply chain, ensuring that stakeholders can access and manipulate data for routine and ad hoc analysis to drive business outcomes using Advanced Analytics. You are expected to function as a productive member of a team, working and communicating proactively with engineering peers, technical leads, project managers, product owners, and resource managers.

Requirements:

  • Strong experience as an AWS/Azure/GCP Data Engineer; AWS/Azure/GCP Databricks experience is a must.
  • Expert proficiency in Spark, Scala, and Python.
  • Data migration experience from on-prem to cloud is a must.
  • Hands-on experience with Kinesis to process and analyze stream data, Event/IoT Hubs, and Cosmos DB.
  • In-depth understanding of Azure/AWS/GCP cloud, and of data lake and analytics solutions on Azure.
  • Expert-level hands-on development; design and develop applications on Databricks.
  • Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services.
  • In-depth understanding of Spark architecture, including Spark Streaming, Spark Core, Spark SQL, DataFrames, RDD caching, and Spark MLlib.
  • Hands-on experience with the technology stack available in the industry for data management, data ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.
  • Hands-on knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake.
  • Good working knowledge of code versioning tools such as Git, Bitbucket, or SVN.
  • Hands-on experience using Spark SQL with various data sources such as JSON, Parquet, and key-value pairs (see the sketch at the end of this section).
  • Experience preparing data for Data Science and Machine Learning, with exposure to model selection, model lifecycle, hyperparameter tuning, model serving, deep learning, etc.
  • Demonstrated experience preparing data and automating and building data pipelines for AI use cases (text, voice, image, IoT data, etc.).
  • Good to have: programming experience with .NET or Spark/Scala.
  • Experience in creating tables, partitioning, bucketing, loading, and aggregating data using Spark Scala and Spark SQL/PySpark.
  • Knowledge of AWS/Azure/GCP DevOps processes such as CI/CD, as well as Agile tools and processes including Git, Jenkins, Jira, and Confluence.
  • Working experience with Visual Studio, PowerShell scripting, and ARM templates.
  • Able to build ingestion to ADLS and enable the BI layer for analytics.
  • Strong understanding of data modeling and defining conceptual, logical, and physical data models.
  • Big Data/analytics/information analysis/database management in the cloud.
  • IoT/event-driven/microservices in the cloud; experience with private and public cloud architectures, their pros/cons, and migration considerations.
  • Ability to stay up to date with industry standards and technological advancements that will enhance data quality and reliability to advance strategic initiatives.
  • Working knowledge of RESTful APIs, the OAuth2 authorization framework, and security best practices for API gateways.
  • Guide customers in transforming big data projects, including development and deployment of big data and AI applications.
  • Guide customers on data engineering best practices, provide proofs of concept, architect solutions, and collaborate when needed.
  • 2+ years of hands-on experience designing and implementing multi-tenant solutions using AWS/Azure/GCP Databricks for data governance, data pipelines for near-real-time data warehouses, and machine learning solutions.
  • Overall 5+ years of experience in a software development, data engineering, or data analytics field using Python, PySpark, Scala, Spark, Java, or equivalent technologies.
  • Hands-on expertise in Apache Spark (Scala or Python).
  • 3+ years of experience in query tuning, performance tuning, troubleshooting, and debugging Spark and other big data solutions.
  • Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience.
  • Ability to manage competing priorities in a fast-paced environment.
  • Ability to resolve issues.
  • Basic experience with or knowledge of Agile methodologies.
  • AWS Certified: Solutions Architect Professional
  • Databricks Certified Associate Developer for Apache Spark
  • Microsoft Certified: Azure Data Engineer Associate
  • GCP Certified: Professional Google Cloud Certified
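As a small illustration of the Spark SQL and partitioning items in the requirements above, here is a hedged PySpark sketch that registers JSON and Parquet sources as views, joins them with Spark SQL, and writes the result partitioned by date; all paths and column names are assumptions.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-demo").getOrCreate()

# Register heterogeneous sources as temporary views (placeholder paths).
spark.read.json("/data/raw/customers_json/").createOrReplaceTempView("customers")
spark.read.parquet("/data/raw/orders_parquet/").createOrReplaceTempView("orders")

# Join and aggregate with plain Spark SQL.
daily_spend = spark.sql("""
    SELECT c.country,
           to_date(o.order_ts) AS order_date,
           SUM(o.amount)       AS total_amount
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.country, to_date(o.order_ts)
""")

# Persist the result partitioned by date for efficient downstream reads.
(
    daily_spend.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("/data/curated/daily_spend/")
)
```
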
Xebia IT Architects

Posted by Vijay S
Bengaluru (Bangalore), Pune, Hyderabad, Chennai, Gurugram, Bhopal, Jaipur
5 - 15 yrs
₹20L - ₹35L / yr
Spark
ETL
Data Transformation Tool (DBT)
Python
Apache Airflow
+2 more

We are seeking a highly skilled and experienced Offshore Data Engineer. The role involves designing, implementing, and testing data pipelines and products.


Qualifications & Experience:


Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.


5+ years of experience in data engineering, with expertise in data architecture and pipeline development.


☁️ Proven experience with GCP, BigQuery, Databricks, Airflow, Spark, DBT, and GCP services.


Hands-on experience with ETL processes, SQL, PostgreSQL, MySQL, MongoDB, Cassandra.


Strong proficiency in Python and data modelling.


Experience in testing and validation of data pipelines.


Preferred: Experience with eCommerce systems, data visualization tools (Tableau, Looker), and cloud certifications.


If you meet the above criteria and are interested, please share your updated CV along with the following details:


Total Experience:


Current CTC:


Expected CTC:


Current Location:


Preferred Location:


Notice Period / Last Working Day (if serving notice):


⚠️ Kindly share your details only if you have not applied recently or are not currently in the interview process for any open roles at Xebia.


Looking forward to your response!

NeoGenCode Technologies Pvt Ltd
Akshay Patil
Posted by Akshay Patil
Gurugram
5 - 12 yrs
₹5L - ₹20L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+17 more

Job Title : Senior AWS Data Engineer

Experience : 5+ Years

Location : Gurugram

Employment Type : Full-Time

Job Summary :

Seeking a Senior AWS Data Engineer with expertise in AWS to design, build, and optimize scalable data pipelines and data architectures. The ideal candidate will have experience in ETL/ELT, data warehousing, and big data technologies.

Key Responsibilities :

  • Build and optimize data pipelines using AWS (Glue, EMR, Redshift, S3, etc.).
  • Maintain data lakes & warehouses for analytics.
  • Ensure data integrity through quality checks (see the sketch after this list).
  • Collaborate with data scientists & engineers to deliver solutions.
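
A hedged sketch of the data-quality responsibility above: a few simple PySpark checks run against a batch before it is promoted; the S3 locations and columns are assumptions for illustration.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("quality-checks-demo").getOrCreate()

# Placeholder S3 location; on Glue/EMR the s3:// scheme is available out of the box.
df = spark.read.parquet("s3://example-bucket/raw/customers/")

checks = {
    "row_count_above_zero": df.count() > 0,
    "no_null_ids": df.filter(F.col("customer_id").isNull()).count() == 0,
    "ids_are_unique": df.count() == df.select("customer_id").distinct().count(),
}

failed = [name for name, passed in checks.items() if not passed]
if failed:
    # Failing fast keeps bad batches out of the curated zone.
    raise ValueError(f"Data quality checks failed: {failed}")

df.write.mode("overwrite").parquet("s3://example-bucket/curated/customers/")
```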

Qualifications :

  • 7+ Years in Data Engineering.
  • Expertise in AWS services, SQL, Python, Spark, Kafka.
  • Experience with CI/CD, DevOps practices.
  • Strong problem-solving skills.

Preferred Skills :

  • Experience with Snowflake, Databricks.
  • Knowledge of BI tools (Tableau, Power BI).
  • Healthcare/Insurance domain experience is a plus.
NeoGenCode Technologies Pvt Ltd
Bengaluru (Bangalore)
4 - 10 yrs
₹5L - ₹20L / yr
Java
J2EE
Spring Boot
Hibernate (Java)
Apache Spark
+13 more

Position : Software Engineer (Java Backend Engineer)

Experience : 4+ Years

📍 Location : Bangalore, India (Hybrid)

Mandatory Skills : Java 8+ (Advanced Features), Spring Boot, Apache Spark (Spark Streaming), SQL & Cosmos DB, Git, Maven, CI/CD (Jenkins, GitHub), Azure Cloud, Agile Scrum.


About the Role :

We are seeking a highly skilled Backend Engineer with expertise in Java, Spark, and microservices architecture to join our dynamic team. The ideal candidate will have a strong background in object-oriented programming, experience with Spark Streaming, and a deep understanding of distributed systems and cloud technologies.


Key Responsibilities :

  • Design, develop, and maintain highly scalable microservices and optimized RESTful APIs using Spring Boot and Java 8+.
  • Implement and optimize Spark Streaming applications for real-time data processing.
  • Utilize advanced Java 8 features, including:
  • Functional interfaces & Lambda expressions
  • Streams and Parallel Streams
  • Completable Futures & Concurrency API improvements
  • Enhanced Collections APIs
  • Work with relational (SQL) and NoSQL (Cosmos DB) databases, ensuring efficient data modeling and retrieval.
  • Develop and manage CI/CD pipelines using Jenkins, GitHub, and related automation tools.
  • Collaborate with cross-functional teams, including Product, Business, and Automation, to deliver end-to-end product features.
  • Ensure adherence to Agile Scrum practices and participate in code reviews to maintain high-quality standards.
  • Deploy and manage applications in Azure Cloud environments.


Minimum Qualifications:

  • BS/MS in Computer Science or a related field.
  • 4+ Years of experience developing backend applications with Spring Boot and Java 8+.
  • 3+ Years of hands-on experience with Git for version control.
  • Strong understanding of software design patterns and distributed computing principles.
  • Experience with Maven for building and deploying artifacts.
  • Proven ability to work in Agile Scrum environments with a collaborative team mindset.
  • Prior experience with Azure Cloud Technologies.
NeoGenCode Technologies Pvt Ltd
Akshay Patil
Posted by Akshay Patil
Gurugram
7 - 15 yrs
₹5L - ₹20L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+20 more

Job Title : Tech Lead - Data Engineering (AWS, 7+ Years)

Location : Gurugram

Employment Type : Full-Time


Job Summary :

Seeking a Tech Lead - Data Engineering with expertise in AWS to design, build, and optimize scalable data pipelines and data architectures. The ideal candidate will have experience in ETL/ELT, data warehousing, and big data technologies.


Key Responsibilities :

  • Build and optimize data pipelines using AWS (Glue, EMR, Redshift, S3, etc.).
  • Maintain data lakes & warehouses for analytics.
  • Ensure data integrity through quality checks.
  • Collaborate with data scientists & engineers to deliver solutions.

Qualifications :

  • 7+ Years in Data Engineering.
  • Expertise in AWS services, SQL, Python, Spark, Kafka.
  • Experience with CI/CD, DevOps practices.
  • Strong problem-solving skills.

Preferred Skills :

  • Experience with Snowflake, Databricks.
  • Knowledge of BI tools (Tableau, Power BI).
  • Healthcare/Insurance domain experience is a plus.


Mphasis
Agency job
via Rigel Networks Pvt Ltd by Minakshi Soni
Bengaluru (Bangalore), Hyderabad
6 - 11 yrs
₹10L - ₹15L / yr
Software Testing (QA)
Test Automation (QA)
API Testing
UFT
Java
+11 more

Dear Candidate,

We are urgently hiring QA Automation Engineers and Test Leads at Hyderabad and Bangalore.

Exp: 6-10 yrs

Locations: Hyderabad, Bangalore


JD:

We are hiring Automation Testers with 6-10 years of automation testing experience using QA automation tools such as Java, UFT, Selenium, API testing, ETL, and others.

 

Must Haves:

·        Experience in Financial Domain is a must

·        Extensive hands-on experience designing, implementing, and maintaining automation frameworks using Java, UFT, ETL, and Selenium tools and automation concepts.

·        Experience with AWS concepts and framework design/testing.

·        Experience in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.

·        Experience with Databricks, Python, Spark, Hive, Airflow, etc.

·        Experience in validating and analyzing Kubernetes log files.

·        API testing experience

·        Backend testing skills with ability to write SQL queries in Databricks and in Oracle databases

·        Experience in working with globally distributed Agile project teams

·        Ability to work in a fast-paced, globally structured and team-based environment, as well as independently

·        Experience in test management tools like Jira

·        Good written and verbal communication skills

Good To have:

  • Business and finance knowledge desirable

 

Best Regards,

Minakshi Soni

Executive - Talent Acquisition (L2)

Worldwide Locations: USA | HK | IN 

Molecular Connections

Posted by Molecular Connections
Bengaluru (Bangalore)
2 - 5 yrs
₹13L - ₹16L / yr
Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
+4 more

Responsibilities:

·       Analyze complex data sets to answer specific questions using MMIT's market access data and Norstella claims data, as well as third-party claims data (IQVIA LAAD, Symphony SHA). Applicants must have prior experience working with these specific data sets.

·       Deliver consultative services to clients related to MMIT RWD sets

·       Produce complex analytical reports using data visualization tools such as Power BI or Tableau

·       Define customized technical specifications to surface MMIT RWD in MMIT tools. 

·       Execute work in a timely fashion with high accuracy, while managing various competing priorities; Perform thorough troubleshooting and execute QA; Communicate with internal teams to obtain required data

·       Ensure adherence to documentation requirements, process workflows, timelines, and escalation protocols

·       And other duties as assigned.

 

Requirements:

·       Bachelor’s Degree or relevant experience required

·       2-5 yrs. of professional experience in RWD analytics using SQL

·       Fundamental understanding of Pharma and Market access space

·       Strong analysis skills and proficiency with tools such as Tableau or PowerBI

·       Excellent written and verbal communication skills.

·       Analytical, critical thinking and creative problem-solving skills.

·       Relationship building skills.

·       Solid organizational skills including attention to detail and multitasking skills.

·       Excellent time management and prioritization skills.

 

Affine
Rishika Chadha
Posted by Rishika Chadha
Remote only
5 - 8 yrs
Best in industry
Scala
ETL
Apache Kafka
Object Oriented Programming (OOPs)
CI/CD
+4 more

Role Objective:


The Big Data Engineer will be responsible for expanding and optimizing our data and database architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing and building data systems. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.


Roles & Responsibilities:

  • Sound knowledge in Spark architecture and distributed computing and Spark streaming.
  • Proficient in Spark, including RDD and DataFrame core functions, troubleshooting, and performance tuning (see the sketch after this list).
  • SFDC (data modelling) experience would be given preference.
  • Good understanding of object-oriented concepts and hands-on experience with Scala, with excellent programming logic and technique.
  • Good grasp of functional programming and OOP concepts in Scala.
  • Good experience in SQL – should be able to write complex queries.
  • Managing the team of Associates and Senior Associates and ensuring the utilization is maintained across the project.
  • Able to mentor new members for onboarding to the project.
  • Understand the client requirement and able to design, develop from scratch and deliver.
  • AWS cloud experience would be preferable.
  • Design, build and operationalize large scale enterprise data solutions and applications using one or more of AWS data and analytics services - DynamoDB, RedShift, Kinesis, Lambda, S3, etc. (preferred)
  • Hands on experience utilizing AWS Management Tools (CloudWatch, CloudTrail) to proactively monitor large and complex deployments (preferred)
  • Experience in analyzing, re-architecting, and re-platforming on-premises data warehouses to data platforms on AWS (preferred)
  • Leading the client calls to flag off any delays, blockers, escalations and collate all the requirements.
  • Managing project timing, client expectations and meeting deadlines.
  • Should have played project and team management roles.
  • Facilitate meetings within the team on regular basis.
  • Understand business requirement and analyze different approaches and plan deliverables and milestones for the project.
  • Optimization, maintenance, and support of pipelines.
  • Strong analytical and logical skills.
  • Ability to comfortably tackle new challenges and learn.
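
A small, hedged illustration of the DataFrame troubleshooting and tuning point above: caching a reused DataFrame and repartitioning before a wide aggregation. The dataset, columns, and partition count are assumptions for the sketch.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("tuning-demo").getOrCreate()

# Placeholder source; imagine a large dataset reused by several downstream jobs.
events = spark.read.parquet("/data/silver/events/")

# Cache a DataFrame that feeds multiple aggregations, so it is computed once
# instead of being re-read and re-parsed for every action.
active = events.filter(F.col("status") == "active").cache()

daily = active.groupBy(F.to_date("event_ts").alias("event_date")).count()
by_type = active.groupBy("event_type").count()
daily.show()
by_type.show()

# Repartition before a wide aggregation to spread shuffle work across executors;
# the target partition count depends on data volume and cluster size.
balanced = active.repartition(200, "customer_id")
per_customer = balanced.groupBy("customer_id").agg(F.sum("amount").alias("total"))
per_customer.write.mode("overwrite").parquet("/data/gold/per_customer/")

active.unpersist()
```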
Affine
Jeeba P
Posted by Jeeba P
Remote only
3 - 8 yrs
Best in industry
Scala
Spark
Apache Kafka
SQL
Amazon Web Services (AWS)

Role Objective:


The Big Data Engineer will be responsible for expanding and optimizing our data and database architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing and building data systems. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.


Roles & Responsibilities:

  • Sound knowledge in Spark architecture and distributed computing and Spark streaming.
  • Proficient in Spark, including RDD and DataFrame core functions, troubleshooting, and performance tuning.
  • Good understanding of object-oriented concepts and hands-on experience with Scala, with excellent programming logic and technique.
  • Good grasp of functional programming and OOP concepts in Scala.
  • Good experience in SQL – should be able to write complex queries.
  • Managing the team of Associates and Senior Associates and ensuring the utilization is maintained across the project.
  • Able to mentor new members for onboarding to the project.
  • Understand the client requirement and able to design, develop from scratch and deliver.
  • AWS cloud experience would be preferable.
  • Design, build and operationalize large scale enterprise data solutions and applications using one or more of AWS data and analytics services - DynamoDB, RedShift, Kinesis, Lambda, S3, etc. (preferred)
  • Hands on experience utilizing AWS Management Tools (CloudWatch, CloudTrail) to proactively monitor large and complex deployments (preferred)
  • Experience in analyzing, re-architecting, and re-platforming on-premises data warehouses to data platforms on AWS (preferred)
  • Leading the client calls to flag off any delays, blockers, escalations and collate all the requirements.
  • Managing project timing, client expectations and meeting deadlines.
  • Should have played project and team management roles.
  • Facilitate meetings within the team on regular basis.
  • Understand business requirement and analyze different approaches and plan deliverables and milestones for the project.
  • Optimization, maintenance, and support of pipelines.
  • Strong analytical and logical skills.
  • Ability to comfortably tackle new challenges and learn.

External Skills And Expertise

Must have Skills:

  • Scala
  • Spark
  • SQL (Intermediate to advanced level)
  • Spark Streaming
  • AWS preferable/Any cloud
  • Kafka /Kinesis/Any streaming services
  • Object-Oriented Programming
  • Hive, ETL/ELT design experience
  • CI/CD experience (ETL pipeline deployment)

Good to Have Skills:

  • AWS Certification
  • Git/similar version control tool
  • Knowledge in CI/CD, Microservices


Solix Technologies

at Solix Technologies

3 recruiters
Sumathi Arramraju
Posted by Sumathi Arramraju
Hyderabad
3 - 7 yrs
₹6L - ₹12L / yr
Hadoop
Java
HDFS
Spring
Spark
+1 more
Primary skills required: Java, J2EE, JSP, Servlets, JDBC, Tomcat, Hadoop (HDFS, MapReduce, Hive, HBase, Spark, Impala)
Secondary skills: Streaming, Archiving, AWS/Azure/Cloud

Role:
·         Should have strong programming and support experience in Java and J2EE technologies
·         Should have good experience in Core Java, JSP, Servlets, and JDBC
·         Good exposure to Hadoop development (HDFS, MapReduce, Hive, HBase, Spark)
·         Should have 2+ years of Java experience and 1+ years of experience in Hadoop
·         Should possess good communication skills
·         Web Services or Elastic MapReduce
·         Familiarity with data-loading tools such as Sqoop
·         Good to know: Spark, Storm, Apache HBase
HrBizHub

HrBizHub

Agency job
via HR BIZ HUB by Pooja shankla
Bengaluru (Bangalore), Gurugram
3 - 7 yrs
₹3L - ₹18L / yr
Big Data
Hibernate (Java)
Spring Boot
Microservices
Spark

• Software development and automated testing
• Proficient in Big Data technologies
• Designs, codes, tests, corrects, and documents large and/or complex programs and program modifications from supplied specifications, using agreed standards and tools, to achieve a well-engineered result
• Proficient and hands-on in Data Warehousing

• Experience with Agile development, Continuous Integration, and Continuous Delivery
• Ability to effectively interpret technical and business objectives and provide solutions
• Strong communication skills, with the ability to articulate technical solutions effectively across a diverse group of stakeholders

Need to be a fast learner willing to adapt to evolving needs of the developer community.




Thanks & Regards

snehalata verma

IT Recruiter --HrBizHub


MathCo
Nabhan Mustafa
Posted by Nabhan Mustafa
Bengaluru (Bangalore)
2 - 8 yrs
Best in industry
Data Warehouse (DWH)
Microsoft Windows Azure
Data engineering
Python
Amazon Web Services (AWS)
+2 more
  • Responsible for designing, storing, processing, and maintaining of large-scale data and related infrastructure.
  • Can drive multiple projects both from operational and technical standpoint.
  • Ideate and build PoV or PoC for new product that can help drive more business.
  • Responsible for defining, designing, and implementing data engineering best practices, strategies, and solutions.
  • Is an Architect who can guide the customers, team, and overall organization on tools, technologies, and best practices around data engineering.
  • Lead architecture discussions, align with business needs, security, and best practices.
  • Has strong conceptual understanding of Data Warehousing and ETL, Data Governance and Security, Cloud Computing, and Batch & Real Time data processing
  • Has strong execution knowledge of Data Modeling, Databases in general (SQL and NoSQL), software development lifecycle and practices, unit testing, functional programming, etc.
  • Understanding of Medallion architecture pattern
  • Has worked on at least one cloud platform.
  • Has worked as data architect and executed multiple end-end data engineering project.
  • Has extensive knowledge of different data architecture designs and data modelling concepts.
  • Manages conversation with the client stakeholders to understand the requirement and translate it into technical outcomes.


Required Tech Stack

 

  • Strong proficiency in SQL
  • Experience working on any of the three major cloud platforms i.e., AWS/Azure/GCP
  • Working knowledge of an ETL and/or orchestration tools like IICS, Talend, Matillion, Airflow, Azure Data Factory, AWS Glue, GCP Composer, etc.
  • Working knowledge of one or more OLTP databases (Postgres, MySQL, SQL Server, etc.)
  • Working knowledge of one or more Data Warehouse like Snowflake, Redshift, Azure Synapse, Hive, Big Query, etc.
  • Proficient in at least one programming language used in data engineering, such as Python (or Scala/Rust/Java)
  • Has strong execution knowledge of Data Modeling (star schema, snowflake schema, fact vs. dimension tables); see the sketch after this list.
  • Proficient in Spark and related applications like Databricks, GCP DataProc, AWS Glue, EMR, etc.
  • Has worked on Kafka and real-time streaming.
  • Has strong execution knowledge of data architecture design patterns (lambda vs kappa architecture, data harmonization, customer data platforms, etc.)
  • Has worked on code and SQL query optimization.
  • Strong knowledge of version control systems like Git to manage source code repositories and designing CI/CD pipelines for continuous delivery.
  • Has worked on data and networking security (RBAC, secret management, key vaults, vnets, subnets, certificates)
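
As a hedged illustration of the dimensional-modeling point above, here is a small PySpark sketch that joins a fact table to two dimensions in a star schema and rolls revenue up by month and region; the table and column names are invented for the example.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("star-schema-demo").getOrCreate()

# Placeholder registrations; in practice these would be warehouse or lakehouse tables.
spark.read.parquet("/dwh/fact_sales/").createOrReplaceTempView("fact_sales")
spark.read.parquet("/dwh/dim_date/").createOrReplaceTempView("dim_date")
spark.read.parquet("/dwh/dim_store/").createOrReplaceTempView("dim_store")

# Classic star-schema query: the fact table joins to dimensions on surrogate keys,
# and measures are aggregated at the grain the dimensions describe.
monthly_revenue = spark.sql("""
    SELECT d.year_month,
           s.region,
           SUM(f.net_amount) AS revenue
    FROM fact_sales f
    JOIN dim_date  d ON f.date_key  = d.date_key
    JOIN dim_store s ON f.store_key = s.store_key
    GROUP BY d.year_month, s.region
""")

monthly_revenue.show()
```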
Hyderabad
3 - 6 yrs
₹10L - ₹16L / yr
SQL
Spark
Analytical Skills
Hadoop
Communication Skills
+4 more

The Sr. Analytics Engineer would provide technical expertise in needs identification, data modeling, data movement, and transformation mapping (source to target), automation and testing strategies, translating business needs into technical solutions with adherence to established data guidelines and approaches from a business unit or project perspective.


Understands and leverages best-fit technologies (e.g., traditional star schema structures, cloud, Hadoop, NoSQL, etc.) and approaches to address business and environmental challenges.


Provides data understanding and coordinates data-related activities with other data management groups such as master data management, data governance, and metadata management.


Actively participates with other consultants in problem-solving and approach development.


Responsibilities :


Provide a consultative approach with business users, asking questions to understand the business need and deriving the data flow, conceptual, logical, and physical data models based on those needs.


Perform data analysis to validate data models and to confirm the ability to meet business needs.


Assist with and support setting the data architecture direction, ensuring data architecture deliverables are developed, ensuring compliance to standards and guidelines, implementing the data architecture, and supporting technical developers at a project or business unit level.


Coordinate and consult with the Data Architect, project manager, client business staff, client technical staff and project developers in data architecture best practices and anything else that is data related at the project or business unit levels.


Work closely with Business Analysts and Solution Architects to design the data model satisfying the business needs and adhering to Enterprise Architecture.


Coordinate with Data Architects, Program Managers and participate in recurring meetings.


Help and mentor team members to understand the data model and subject areas.


Ensure that the team adheres to best practices and guidelines.


Requirements :


- At least 3 years of strong working knowledge of Spark, Java/Scala/PySpark, Kafka, Git, Unix/Linux, and ETL pipeline design.


- Experience with Spark optimization, tuning, and resource allocation (see the sketch after these requirements).


- Excellent understanding of in-memory distributed computing frameworks like Spark, including parameter tuning and writing optimized workflow sequences.


- Experience with relational databases (e.g., PostgreSQL, MySQL) and NoSQL/analytical databases (e.g., Redshift, BigQuery, Cassandra, etc.).


- Familiarity with Docker, Kubernetes, Azure Data Lake/Blob storage, AWS S3, Google Cloud storage, etc.


- Have a deep understanding of the various stacks and components of the Big Data ecosystem.


- Hands-on experience with Python is a huge plus
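
To ground the Spark tuning and resource-allocation requirement, here is a hedged sketch of setting common performance-related configurations when building a SparkSession; the values shown are illustrative and would normally be sized to the cluster and data volume.

```python
from pyspark.sql import SparkSession

# Illustrative settings only; real values depend on executor size, data skew,
# and workload shape, and are usually supplied via spark-submit or cluster config.
spark = (
    SparkSession.builder
    .appName("tuning-config-demo")
    .config("spark.sql.shuffle.partitions", "400")        # shuffle parallelism for wide ops
    .config("spark.sql.adaptive.enabled", "true")          # let AQE coalesce/split partitions at runtime
    .config("spark.executor.memory", "8g")                 # per-executor heap
    .config("spark.executor.cores", "4")                   # concurrent tasks per executor
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

df = spark.read.parquet("/data/silver/transactions/")      # placeholder input
df.groupBy("merchant_id").count().write.mode("overwrite").parquet("/data/gold/by_merchant/")
```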

TVARIT GmbH

at TVARIT GmbH

2 candid answers
Shivani Kawade
Posted by Shivani Kawade
Remote, Pune
2 - 6 yrs
₹8L - ₹25L / yr
SQL Azure
Databricks
Python
SQL
ETL
+9 more

TVARIT GmbH develops and delivers solutions in the field of artificial intelligence (AI) for the manufacturing, automotive, and process industries. With its software products, TVARIT makes it possible for its customers to make intelligent and well-founded decisions, e.g., in predictive maintenance, OEE improvement, and predictive quality. We have renowned reference customers, competent technology, a good research team from renowned universities, and the award of a renowned AI prize (e.g., EU Horizon 2020), which makes TVARIT one of the most innovative AI companies in Germany and Europe.


We are looking for a self-motivated person with a positive "can-do" attitude and excellent oral and written communication skills in English.


We are seeking a skilled and motivated senior Data Engineer from the manufacturing Industry with over four years of experience to join our team. The Senior Data Engineer will oversee the department’s data infrastructure, including developing a data model, integrating large amounts of data from different systems, building & enhancing a data lake-house & subsequent analytics environment, and writing scripts to facilitate data analysis. The ideal candidate will have a strong foundation in ETL pipelines and Python, with additional experience in Azure and Terraform being a plus. This role requires a proactive individual who can contribute to our data infrastructure and support our analytics and data science initiatives.


Skills Required:


  • Experience in the manufacturing industry (metal industry is a plus)
  • 4+ years of experience as a Data Engineer
  • Experience in data cleaning & structuring and data manipulation
  • Architect and optimize complex data pipelines, leading the design and implementation of scalable data infrastructure, and ensuring data quality and reliability at scale
  • ETL Pipelines: Proven experience in designing, building, and maintaining ETL pipelines (see the sketch after this list).
  • Python: Strong proficiency in Python programming for data manipulation, transformation, and automation.
  • Experience in SQL and data structures
  • Knowledge in big data technologies such as Spark, Flink, Hadoop, Apache, and NoSQL databases.
  • Knowledge of cloud technologies (at least one) such as AWS, Azure, and Google Cloud Platform.
  • Proficient in data management and data governance
  • Strong analytical experience & skills that can extract actionable insights from raw data to help improve the business.
  • Strong analytical and problem-solving skills.
  • Excellent communication and teamwork abilities.
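
A minimal sketch of the data cleaning and structuring skills listed above, using pandas on a hypothetical sensor export; the file, column names, and rules are assumptions for illustration.

```python
import pandas as pd

# Hypothetical raw export from a plant system; real data would come from the lakehouse.
raw = pd.read_csv("sensor_readings.csv")

cleaned = (
    raw.rename(columns=str.lower)
       .drop_duplicates(subset=["machine_id", "reading_ts"])
       .assign(
           reading_ts=lambda d: pd.to_datetime(d["reading_ts"], errors="coerce"),
           temperature=lambda d: pd.to_numeric(d["temperature"], errors="coerce"),
       )
       .dropna(subset=["reading_ts", "temperature"])
)

# Restructure to one row per machine per hour, a shape analytics and ML can consume.
hourly = (
    cleaned.set_index("reading_ts")
           .groupby("machine_id")
           .resample("1h")["temperature"]
           .mean()
           .reset_index()
)
print(hourly.head())
```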


Nice To Have:

  • Azure: Experience with Azure data services (e.g., Azure Data Factory, Azure Databricks, Azure SQL Database).
  • Terraform: Knowledge of Terraform for infrastructure as code (IaC) to manage cloud resources.
  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field from top-tier Indian Institutes of Information Technology (IIITs).

Benefits and Perks:

  • A culture that fosters innovation, creativity, continuous learning, and resilience
  • Progressive leave policy promoting work-life balance
  • Mentorship opportunities with highly qualified internal resources and industry-driven programs
  • Multicultural peer groups and supportive workplace policies
  • Annual workcation program allowing you to work from various scenic locations
  • Experience the unique environment of a dynamic start-up


Why should you join TVARIT ?


Working at TVARIT, a deep-tech German IT startup, offers a unique blend of innovation, collaboration, and growth opportunities. We seek individuals eager to adapt and thrive in a rapidly evolving environment.


If this opportunity excites you and aligns with your career aspirations, we encourage you to apply today!

Nielsen
Dheeraj Sidana
Posted by Dheeraj Sidana
Gurugram, Bengaluru (Bangalore), Mumbai
1 - 20 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark


Nielsen, a global company specialising in audience measurement and analytics, is currently seeking a proficient leader in data engineering to join their team in Bangalore, Gurgaon, or Mumbai.


This is a manager-of-managers role that involves managing multiple scrum teams and overseeing an advanced data platform that analyses audience consumption patterns across various channels like OTT, TV, Radio, and Social Media worldwide. You will be responsible for building and supervising a top-performing data engineering team that delivers data for targeted campaigns. Moreover, you will work with AWS services (S3, Lambda, Kinesis) and other data engineering technologies such as Spark, Scala/Python, Kafka, etc. There may also be opportunities to establish deep integrations with OTT platforms like Netflix, Prime Video, and others.


Scremer
Sathish Dhawan
Posted by Sathish Dhawan
Pune, Mumbai
6 - 11 yrs
₹15L - ₹15L / yr
Amazon Web Services (AWS)
Python
Java
Spark


Primary Skills

DynamoDB, Java, Kafka, Spark, Amazon Redshift, AWS Lake Formation, AWS Glue, Python


Skills:

Good work experience showing growth as a Data Engineer.

Hands-on programming experience

Implementation experience with Kafka, Kinesis, Spark, AWS Glue, and AWS Lake Formation.

Excellent knowledge of: Python, Scala/Java, Spark, AWS (Lambda, Step Functions, DynamoDB, EMR), Terraform, UI (Angular), Git, Maven

Experience of performance optimization in Batch and Real time processing applications

Expertise in Data Governance and Data Security Implementation

Good hands-on design and programming skills for building reusable tools and products. Experience developing in AWS or similar cloud platforms. Preferred: ECS, EKS, S3, EMR, DynamoDB, Aurora, Redshift, QuickSight, or similar.

Familiarity with systems with a very high volume of transactions, microservice design, or data processing pipelines (Spark).

Knowledge of and hands-on experience with serverless technologies such as Lambda, MSK, MWAA, and Kinesis Analytics is a plus.

Expertise in practices like Agile, Peer reviews, Continuous Integration


Roles and responsibilities:

Determining project requirements and developing work schedules for the team.

Delegating tasks and achieving daily, weekly, and monthly goals.

Responsible for designing, building, testing, and deploying the software releases.


Salary: 25LPA-40LPA

Sadup Softech

at Sadup Softech

1 recruiter
madhuri g
Posted by madhuri g
Bengaluru (Bangalore)
3 - 6 yrs
₹12L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Must have skills


3 to 6 years

Data Science

SQL, Excel, BigQuery - mandatory, 3+ years

Python/ML, Hadoop, Spark - 2+ years


Requirements


• 3+ years prior experience as a data analyst

• Detail-oriented, with structured thinking and an analytical mindset.

• Proven analytic skills, including data analysis and data validation.

• Technical writing experience in relevant areas, including queries, reports, and presentations.

• Strong SQL and Excel skills with the ability to learn other analytic tools

• Good communication skills (being precise and clear)

• Good to have prior knowledge of python and ML algorithms

Wissen Technology

at Wissen Technology

4 recruiters
Tony Tom
Posted by Tony Tom
Pune
6 - 12 yrs
₹2L - ₹30L / yr
Python
AWS
Spark

Location: Pune

Required Skills: Scala, Python, Data Engineering, AWS, Cassandra/AstraDB, Athena, EMR, Spark/Snowflake


Sadup Softech

at Sadup Softech

1 recruiter
madhuri g
Posted by madhuri g
Remote only
4 - 6 yrs
₹4L - ₹15L / yr
Google Cloud Platform (GCP)
BigQuery
PySpark
Data engineering
Big Data
+2 more

Job Description:

We are seeking a talented Machine Learning Engineer with expertise in software engineering to join our team. As a Machine Learning Engineer, your primary responsibility will be to develop machine learning (ML) solutions that focus on technology process improvements. Specifically, you will be working on projects involving ML & Generative AI solutions for technology and data management efficiencies, such as optimal cloud computing, knowledge bots, software code assistants, automatic data management, etc.

 

Responsibilities:

- Collaborate with cross-functional teams to identify opportunities for technology process improvements that can be solved using machine learning and generative AI.

- Define and build innovative ML and Generative AI systems, such as AI assistants for varied SDLC tasks, and improve data and infrastructure management.

- Design and develop ML Engineering Solutions, generative AI Applications & Fine-Tuning Large Language Models (LLMs) for above ensuring scalability, efficiency, and maintainability of such solutions.

- Implement prompt engineering techniques to fine-tune and enhance LLMs for better performance and application-specific needs.

- Stay abreast of the latest advancements in the field of Generative AI and actively contribute to the research and development of new ML & Generative AI Solutions.

 

Requirements:

- A Master's or Ph.D. degree in Computer Science, Statistics, Data Science, or a related field.

- Proven experience working as a Software Engineer, with a focus on ML Engineering and exposure to Generative AI Applications such as chatGPT.

- Strong proficiency in programming languages and platforms such as Java, Scala, Python, Google Cloud, BigQuery, Hadoop, and Spark.

- Solid knowledge of software engineering best practices, including version control systems (e.g., Git), code reviews, and testing methodologies.

- Familiarity with large language models (LLMs), prompt engineering techniques, vector DBs, embeddings, and various fine-tuning techniques (see the sketch after these requirements).

- Strong communication skills to effectively collaborate and present findings to both technical and non-technical stakeholders.

- Proven ability to adapt and learn new technologies and frameworks quickly.

- A proactive mindset with a passion for continuous learning and research in the field of Generative AI.
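
As a loose illustration of the embeddings and vector-retrieval familiarity mentioned above, here is a tiny NumPy sketch of cosine-similarity search over pre-computed embedding vectors; the vectors and documents are made up, and a real system would use an embedding model and a vector database.

```python
import numpy as np

# Pretend these embeddings came from an embedding model; here they are random stand-ins.
rng = np.random.default_rng(42)
documents = ["deploy guide", "cost report", "api reference", "incident runbook"]
doc_vectors = rng.normal(size=(len(documents), 8))
query_vector = rng.normal(size=8)

def cosine_similarity(query: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    # Similarity between one query vector and a matrix of document vectors.
    return (matrix @ query) / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query))

scores = cosine_similarity(query_vector, doc_vectors)
top_indices = np.argsort(scores)[::-1][:2]
for idx in top_indices:
    print(f"{documents[idx]}: {scores[idx]:.3f}")
```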

 

If you are a skilled and innovative Data Scientist with a passion for Generative AI, and have a desire to contribute to technology process improvements, we would love to hear from you. Join our team and help shape the future of our AI Driven Technology Solutions.

Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

As a Senior Associate in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As a Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration, data wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.


Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 5+ years of IT experience with 3+ years in Data related technologies

2.Minimum 2.5 years of experience in Big Data technologies and working exposure in at least one cloud platform on related data services (AWS / Azure / GCP)

3.Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow, and other components required to build end-to-end data pipelines.

4.Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable.

5.Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.

6.Well-versed and working knowledge with data platform related services on at least 1 cloud platform, IAM and data security


Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Cloud data specialty and other related Big data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


one-to-one, one-to-many, and many-to-many

Agency job
via The Hub by Sridevi Viswanathan
Chennai
5 - 9 yrs
₹1L - ₹15L / yr
PowerBI
Python
Spark
Data Analytics
Databricks

Position Overview: We are seeking a talented Data Engineer with expertise in Power BI to join our team. The ideal candidate will be responsible for designing and implementing data pipelines, as well as developing insightful visualizations and reports using Power BI. Additionally, the candidate should have strong skills in Python, data analytics, PySpark, and Databricks. This role requires a blend of technical expertise, analytical thinking, and effective communication skills.

Key Responsibilities:

  1. Design, develop, and maintain data pipelines and architectures using PySpark and Databricks.
  2. Implement ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
  3. Collaborate with data analysts and business stakeholders to understand data requirements and translate them into actionable insights.
  4. Develop interactive dashboards, reports, and visualizations using Power BI to communicate key metrics and trends.
  5. Optimize and tune data pipelines for performance, scalability, and reliability.
  6. Monitor and troubleshoot data infrastructure to ensure data quality, integrity, and availability.
  7. Implement security measures and best practices to protect sensitive data.
  8. Stay updated with emerging technologies and best practices in data engineering and data visualization.
  9. Document processes, workflows, and configurations to maintain a comprehensive knowledge base.

Requirements:

  1. Bachelor’s degree in Computer Science, Engineering, or related field. (Master’s degree preferred)
  2. Proven experience as a Data Engineer with expertise in Power BI, Python, PySpark, and Databricks.
  3. Strong proficiency in Power BI, including data modeling, DAX calculations, and creating interactive reports and dashboards.
  4. Solid understanding of data analytics concepts and techniques.
  5. Experience working with Big Data technologies such as Hadoop, Spark, or Kafka.
  6. Proficiency in programming languages such as Python and SQL.
  7. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
  8. Excellent analytical and problem-solving skills with attention to detail.
  9. Strong communication and collaboration skills to work effectively with cross-functional teams.
  10. Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.

Preferred Qualifications:

  • Advanced degree in Computer Science, Engineering, or related field.
  • Certifications in Power BI or related technologies.
  • Experience with data visualization tools other than Power BI (e.g., Tableau, QlikView).
  • Knowledge of machine learning concepts and frameworks.


Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

As a Senior Associate in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As a Senior Associate L1 in Data Engineering, you will do technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration, data wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is preferable.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 3.5+ years of IT experience with 1.5+ years in Data related technologies

2.Minimum 1.5 years of experience in Big Data technologies

3.Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow, and other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

4.Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable.

5.Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.


Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Working knowledge with data platform related services on at least 1 cloud platform, IAM and data security

7.Cloud data specialty and other related Big data technology certifications


Job Title: Senior Associate L1 – Data Engineering

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

xyz

Agency job
via HR BIZ HUB by Pooja shankla
Bengaluru (Bangalore)
4 - 6 yrs
₹12L - ₹15L / yr
Java
Big Data
Apache Hive
Hadoop
Spark

Job Title: Big Data Developer

Job Description

Bachelor's degree in Engineering or Computer Science or equivalent OR Master's in Computer Applications or equivalent.

Solid software development experience, including leading teams of engineers and scrum teams.

4+ years of hands-on experience of working with Map-Reduce, Hive, Spark (core, SQL and PySpark).

Solid Datawarehousing concepts.

Knowledge of Financial reporting ecosystem will be a plus.

4+ years of experience within Data Engineering / Data Warehousing using Big Data technologies will be an add-on.

Expert on Distributed ecosystem.

Hands-on experience with programming using Core Java or Python/Scala

Expert on Hadoop and Spark Architecture and its working principle

Hands-on experience writing and understanding complex SQL (Hive/PySpark DataFrames) and optimizing joins while processing huge amounts of data (see the sketch below).

Experience in UNIX shell scripting.
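
A hedged sketch of the join-optimization point above: when one side of a join is small (for example, a lookup or dimension table), broadcasting it avoids shuffling the large side across the cluster. The dataset names and columns are assumptions.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("join-optimization-demo").getOrCreate()

# Placeholder inputs: a large transactions table and a small country lookup.
transactions = spark.read.parquet("/data/silver/transactions/")
countries = spark.read.parquet("/data/ref/countries/")   # small reference data

# Broadcasting the small table ships it to every executor, so the large table
# is joined locally instead of being shuffled across the cluster.
enriched = transactions.join(
    F.broadcast(countries),
    on="country_code",
    how="left",
)

enriched.groupBy("country_name").agg(F.sum("amount").alias("total")).show()
```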

Roles & Responsibilities

Ability to design and develop optimized Data pipelines for batch and real time data processing

Should have experience in analysis, design, development, testing, and implementation of system applications

Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.

Excellent technical and analytical aptitude

Good communication skills.

Excellent Project management skills.

Results driven Approach.

Mandatory Skills: Big Data, PySpark, Hive
