50+ Spark Jobs in India
Apply to 50+ Spark Jobs on CutShort.io. Find your next job, effortlessly. Browse Spark Jobs and apply today!

Job Description
We are seeking a highly skilled and experienced Backend Engineer to join our dynamic and fast-paced development team in Bangalore. The ideal candidate will have expertise in Java development, particularly in Java 8 or above, and extensive hands-on experience with Apache Spark, Spark Streaming, and Spring Boot for developing scalable and high-performance microservices. The candidate must also have strong problem-solving skills, a deep understanding of distributed computing, and experience with cloud technologies (Azure).
Key Responsibilities
- Design, develop, and maintain highly scalable microservices and optimized RESTful APIs using Spring Boot in Java 8 or above.
- Write efficient and maintainable Spark and Spark Streaming code for processing large-scale data in real time (a brief sketch follows this list).
- Implement advanced Java 8 features such as functional interfaces, lambda expressions, Streams, parallel streams, CompletableFuture, and Concurrency API improvements.
- Work with relational (SQL) and non-relational (Cosmos DB) databases for data modeling and optimization.
- Utilize Maven for building and deploying artifacts to the snapshot repository.
- Collaborate with cross-functional teams, including Product, Business, Automation, and other stakeholders, to define, design, and deliver new features.
- Follow Agile SCRUM methodologies for software development and actively participate in sprint planning and retrospective meetings.
- Maintain version control using Git and ensure best practices for code collaboration and peer code reviews.
- Implement CI/CD pipelines using tools such as Jenkins and GitHub Actions to automate build and deployment processes.
- Work with Azure Cloud Technologies to build and deploy cloud-based applications.
- Apply software design patterns and best practices in backend development to enhance system architecture and scalability.
- Troubleshoot and debug applications, ensuring high performance, security, and scalability.
- Keep up to date with the latest industry trends, tools, and technologies to continuously improve development processes.
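For illustration only, here is a minimal sketch of the kind of Spark Streaming work described above: a windowed aggregation over an unbounded stream. The role itself is Java-centric; this sketch uses PySpark, and the rate source and console sink are placeholders for real Kafka/Event Hubs sources and production sinks.

```python
# Hypothetical sketch: a Structured Streaming job that counts events per minute.
# Source, columns, and sink are illustrative placeholders, not details from this posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("event-count-stream").getOrCreate()

events = (
    spark.readStream
    .format("rate")                 # built-in test source; a real job would read Kafka/Event Hubs
    .option("rowsPerSecond", 10)
    .load()
)

counts = (
    events
    .withWatermark("timestamp", "1 minute")
    .groupBy(F.window("timestamp", "1 minute"))
    .count()
)

query = (
    counts.writeStream
    .outputMode("update")
    .format("console")              # a real job would write to a table, topic, or serving store
    .start()
)
query.awaitTermination()
```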
Minimum Qualifications
- BS/MS in Computer Science or equivalent.
- 4+ years of industry experience in developing highly scalable microservices and optimized RESTful APIs using Spring Boot in Java 8 or above.
- 3+ years of experience in version control tools like Git.
- 3+ years of experience working in an Agile SCRUM environment.
- Strong understanding of software design patterns and distributed computing concepts.
- Solid experience in relational and non-relational databases (SQL and Cosmos DB).
- Experience with Maven for building and managing dependencies.
- Knowledge of CI/CD workflows and experience with Jenkins and GitHub Actions.
- Prior enterprise experience in working with Azure Cloud Technologies.
- Proven ability to work collaboratively with cross-functional teams to deliver high-quality product features.
- Strong problem-solving skills, debugging techniques, and ability to troubleshoot complex issues efficiently.
Preferred Qualifications
- Experience with Kafka or other messaging queues for real-time data processing.
- Exposure to Docker, Kubernetes, and container orchestration tools.
- Hands-on experience with NoSQL databases like MongoDB, Cassandra, or DynamoDB.
- Experience with performance optimization techniques for backend applications.
- Knowledge of test-driven development (TDD) and unit testing frameworks like JUnit.

We are looking for a Senior Data Engineer with strong expertise in GCP, Databricks, and Airflow to design and implement a GCP Cloud Native Data Processing Framework. The ideal candidate will work on building scalable data pipelines and help migrate existing workloads to a modern framework.
- Shift: 2 PM - 11 PM
- Work Mode: Hybrid (3 days a week) across Xebia locations
- Notice Period: Immediate joiners or those with a notice period of up to 30 days
Key Responsibilities:
- Design and implement a GCP Native Data Processing Framework leveraging Spark and GCP Cloud Services.
- Develop and maintain data pipelines using Databricks and Airflow for transforming data across the Raw → Silver → Gold layers (a brief sketch follows this list).
- Ensure data integrity, consistency, and availability across all systems.
- Collaborate with data engineers, analysts, and stakeholders to optimize performance.
- Document standards and best practices for data engineering workflows.
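For illustration, a minimal PySpark/Delta sketch of one Raw → Silver → Gold hop on Databricks. Table paths, columns, and the aggregation are assumptions for illustration only, not details from this posting.

```python
# Hypothetical sketch of a Raw -> Silver -> Gold hop using Delta tables on Databricks.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Raw layer: data ingested as-is from the landing zone.
raw = spark.read.format("delta").load("/mnt/lake/raw/orders")

# Silver layer: cleaned, de-duplicated, typed.
silver = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .filter(F.col("order_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").save("/mnt/lake/silver/orders")

# Gold layer: business-level aggregate for reporting.
gold = silver.groupBy("customer_id").agg(F.sum("amount").alias("total_amount"))
gold.write.format("delta").mode("overwrite").save("/mnt/lake/gold/customer_spend")
```

In practice each hop would typically run as a scheduled Airflow or Databricks Workflows task rather than a single script.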
Required Experience:
- 7-8 years of experience in data engineering, architecture, and pipeline development.
- Strong knowledge of GCP, Databricks, PySpark, and BigQuery.
- Experience with Orchestration tools like Airflow, Dagster, or GCP equivalents.
- Understanding of Data Lake table formats (Delta, Iceberg, etc.).
- Proficiency in Python for scripting and automation.
- Strong problem-solving skills and collaborative mindset.
⚠️ Please apply only if you have not applied recently or are not currently in the interview process for any open roles at Xebia.
Looking forward to your response!
Best regards,
Vijay S
Assistant Manager - TAG
About Data Axle:
Data Axle Inc. has been an industry leader in data, marketing solutions, sales, and research for over 50 years in the USA. Data Axle now has an established strategic global centre of excellence in Pune. This centre delivers mission critical data services to its global customers powered by its proprietary cloud-based technology platform and by leveraging proprietary business and consumer databases.
Data Axle India is recognized as a Great Place to Work! This prestigious designation is a testament to our collective efforts in fostering an exceptional workplace culture and creating an environment where every team member can thrive.
General Summary:
As a Digital Data Management Architect, you will design, implement, and optimize advanced data management systems that support processing billions of digital transactions, ensuring high availability and accuracy. You will leverage your expertise in developing identity graphs, real-time data processing, and API integration to drive insights and enhance user experiences across digital platforms. Your role is crucial in building scalable and secure data architectures that support real-time analytics, identity resolution, and seamless data flows across multiple systems and applications.
Roles and Responsibilities:
- Data Architecture & System Design:
- Design and implement scalable data architectures capable of processing billions of digital transactions in real-time, ensuring low latency and high availability.
- Architect data models, workflows, and storage solutions to enable seamless real-time data processing, including stream processing and event-driven architectures.
- Identity Graph Development:
- Lead the development and maintenance of a comprehensive identity graph to unify disparate data sources, enabling accurate identity resolution across channels.
- Develop algorithms and data matching techniques to enhance identity linking, while maintaining data accuracy and privacy.
- Real-Time Data Processing & Analytics:
- Implement real-time data ingestion, processing, and analytics pipelines to support immediate data availability and actionable insights.
- Work closely with engineering teams to integrate and optimize real-time data processing frameworks such as Apache Kafka, Apache Flink, or Spark Streaming.
- API Development & Integration:
- Design and develop real-time APIs that facilitate data access and integration across internal and external platforms, focusing on security, scalability, and performance.
- Collaborate with product and engineering teams to define API specifications, data contracts, and SLAs to meet business and user requirements.
- Data Governance & Security:
- Establish data governance practices to maintain data quality, privacy, and compliance with regulatory standards across all digital transactions and identity graph data.
- Ensure security protocols and access controls are embedded in all data workflows and API integrations to protect sensitive information.
- Collaboration & Stakeholder Engagement:
- Partner with data engineering, analytics, and product teams to align data architecture with business requirements and strategic goals.
- Provide technical guidance and mentorship to junior architects and data engineers, promoting best practices and continuous learning.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or related field.
- 10+ years of experience in data architecture, digital data management, or a related field, with a proven track record in managing billion+ transactions.
- Deep experience with identity resolution techniques and building identity graphs.
- Strong proficiency in real-time data processing technologies (e.g., Kafka, Flink, Spark) and API development (RESTful and/or GraphQL).
- In-depth knowledge of database systems (SQL, NoSQL), data warehousing solutions, and cloud-based platforms (AWS, Azure, or GCP).
- Familiarity with data privacy regulations (e.g., GDPR, CCPA) and data governance best practices.
This position description is intended to describe the duties most frequently performed by an individual in this position. It is not intended to be a complete list of assigned duties but to describe a position level.


Level of skills and experience:
5+ years of hands-on experience using Python, Spark, and SQL.
Experienced in AWS Cloud usage and management.
Experience with Databricks (Lakehouse, ML, Unity Catalog, MLflow).
Experience using various ML models and frameworks such as XGBoost, LightGBM, and Torch.
Experience with orchestrators such as Airflow and Kubeflow.
Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes).
Fundamental understanding of Parquet, Delta Lake, and other data file formats.
Proficiency in an IaC tool such as Terraform, CDK, or CloudFormation.
Strong written and verbal English communication skills, and proficiency in communicating with non-technical stakeholders.
Job Title: Big Data Engineer (Java Spark Developer – JAVA SPARK EXP IS MUST)
Location: Chennai, Hyderabad, Pune, Bangalore (Bengaluru) / NCR Delhi
Client: Premium Tier 1 Company
Payroll: Direct Client
Employment Type: Full time / Perm
Experience: 7+ years
Job Description:
We are looking for skilled Big Data Engineers with 7+ years of experience in Java Spark and Big Data / legacy platforms who can join immediately. The desired candidate should have experience designing, developing, and optimizing real-time and batch data pipelines in enterprise-scale Big Data environments. You will build scalable, high-performance data processing solutions, integrate real-time data streams, and help build a reliable data platform. Strong troubleshooting, performance tuning, and collaboration skills are key for this role.
Key Responsibilities:
· Develop data pipelines using Java Spark and Kafka (a brief sketch follows this list).
· Optimize and maintain real-time data pipelines and messaging systems.
· Collaborate with cross-functional teams to deliver scalable data solutions.
· Troubleshoot and resolve issues in Java Spark and Kafka applications.
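For illustration only, a minimal sketch of consuming a Kafka topic with Spark Structured Streaming. The role expects Java Spark; the sketch below uses PySpark for brevity, and the broker address, topic, and paths are placeholders, not details from this posting.

```python
# Hypothetical sketch: Kafka -> Spark Structured Streaming -> Parquet landing zone.
# Requires the spark-sql-kafka connector on the classpath.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

stream = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers binary key/value columns; cast before processing.
parsed = stream.select(
    F.col("key").cast("string"),
    F.col("value").cast("string").alias("payload"),
    "timestamp",
)

query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "/data/landing/orders")
    .option("checkpointLocation", "/data/checkpoints/orders")
    .start()
)
query.awaitTermination()
```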
Qualifications:
· Experience in Java Spark is a must
· Knowledge and hands-on experience using distributed computing, real-time data streaming, and big data technologies
· Strong problem-solving and performance optimization skills
· Looking for immediate joiners
If interested, please share your resume along with the following details
1) Notice Period
2) Current CTC
3) Expected CTC
4) Do you have experience in Java Spark - Y / N (this is a must)
5) Any offers in hand
Thanks & Regards,
LION & ELEPHANTS CONSULTANCY PVT LTD TEAM
SINGAPORE | INDIA

The Sr AWS/Azure/GCP Databricks Data Engineer at Koantek will use comprehensive modern data engineering techniques and methods with Advanced Analytics to support business decisions for our clients. Your goal is to support the use of data-driven insights to help our clients achieve business outcomes and objectives. You can collect, aggregate, and analyze structured/unstructured data from multiple internal and external sources and communicate patterns, insights, and trends to decision-makers. You will help design and build data pipelines, data streams, reporting tools, information dashboards, data service APIs, data generators, and other end-user information portals and insight tools. You will be a critical part of the data supply chain, ensuring that stakeholders can access and manipulate data for routine and ad hoc analysis to drive business outcomes using Advanced Analytics. You are expected to function as a productive member of a team, working and communicating proactively with engineering peers, technical leads, project managers, product owners, and resource managers.
Requirements:
- Strong experience as an AWS/Azure/GCP Data Engineer; must have AWS/Azure/GCP Databricks experience.
- Expert proficiency in Spark with Scala and Python.
- Must have data migration experience from on-prem to cloud.
- Hands-on experience with Kinesis to process and analyze streaming data, Event/IoT Hubs, and Cosmos DB.
- In-depth understanding of Azure/AWS/GCP cloud, data lake, and analytics solutions.
- Expert-level, hands-on experience designing and developing applications on Databricks.
- Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services.
- In-depth understanding of Spark architecture, including Spark Streaming, Spark Core, Spark SQL, DataFrames, RDD caching, and Spark MLlib.
- Hands-on experience with the industry technology stack for data management, data ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.
- Hands-on knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake.
- Good working knowledge of code versioning tools such as Git, Bitbucket, or SVN.
- Hands-on experience using Spark SQL with various data sources such as JSON, Parquet, and key-value pairs.
- Experience preparing data for Data Science and Machine Learning, with exposure to model selection, model lifecycle, hyperparameter tuning, model serving, deep learning, etc.
- Demonstrated experience preparing data and automating and building data pipelines for AI use cases (text, voice, image, IoT data, etc.).
- Good to have: programming experience with .NET or Spark/Scala.
- Experience creating tables, partitioning, bucketing, loading, and aggregating data using Spark Scala, Spark SQL, and PySpark (a brief sketch follows this list).
- Knowledge of AWS/Azure/GCP DevOps processes such as CI/CD, and Agile tools and processes including Git, Jenkins, Jira, and Confluence.
- Working experience with Visual Studio, PowerShell scripting, and ARM templates; able to build ingestion to ADLS and enable a BI layer for analytics.
- Strong understanding of data modeling and defining conceptual, logical, and physical data models.
- Big Data/analytics/information analysis/database management in the cloud.
- IoT/event-driven/microservices in the cloud; experience with private and public cloud architectures, their pros and cons, and migration considerations.
- Ability to remain up to date with industry standards and technological advancements that will enhance data quality and reliability to advance strategic initiatives.
- Working knowledge of RESTful APIs, the OAuth2 authorization framework, and security best practices for API gateways.
- Guide customers in transforming big data projects, including development and deployment of big data and AI applications.
- Guide customers on data engineering best practices; provide proofs of concept, architect solutions, and collaborate when needed.
- 2+ years of hands-on experience designing and implementing multi-tenant solutions using AWS/Azure/GCP Databricks for data governance, data pipelines for near real-time data warehouses, and machine learning solutions.
- Overall 5+ years of experience in software development, data engineering, or data analytics using Python, PySpark, Scala, Spark, Java, or equivalent technologies, with hands-on expertise in Apache Spark (Scala or Python).
- 3+ years of experience in query tuning, performance tuning, troubleshooting, and debugging Spark and other big data solutions.
- Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience.
- Ability to manage competing priorities in a fast-paced environment.
- Ability to resolve issues.
- Basic experience with or knowledge of agile methodologies.
- AWS Certified: Solutions Architect Professional
- Databricks Certified Associate Developer for Apache Spark
- Microsoft Certified: Azure Data Engineer Associate
- Google Cloud Certified: Professional
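For illustration, a minimal PySpark sketch of the partitioning and bucketing work mentioned above. Database, table, path, and column names are illustrative assumptions, not details from this posting.

```python
# Hypothetical sketch: write a partitioned, bucketed table and aggregate it with Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
spark.sql("CREATE DATABASE IF NOT EXISTS sales")

# Assume daily Parquet files with txn_date, customer_id, and amount columns.
daily = spark.read.parquet("/landing/transactions/2024-01-01")

(daily.write
      .partitionBy("txn_date")          # enables partition pruning on date predicates
      .bucketBy(16, "customer_id")      # co-locates rows per customer to reduce shuffles
      .sortBy("customer_id")
      .mode("overwrite")
      .format("parquet")
      .saveAsTable("sales.transactions"))

# Aggregate with Spark SQL; the date filter touches only one partition.
spark.sql("""
    SELECT customer_id, SUM(amount) AS total_amount
    FROM sales.transactions
    WHERE txn_date = DATE'2024-01-01'
    GROUP BY customer_id
""").show()
```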

We are seeking a highly skilled and experienced Offshore Data Engineer. The role involves designing, implementing, and testing data pipelines and products.
Qualifications & Experience:
Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.
5+ years of experience in data engineering, with expertise in data architecture and pipeline development.
Proven experience with GCP, BigQuery, Databricks, Airflow, Spark, dbt, and GCP services.
Hands-on experience with ETL processes, SQL, PostgreSQL, MySQL, MongoDB, and Cassandra.
Strong proficiency in Python and data modelling.
Experience in testing and validation of data pipelines.
Preferred: Experience with eCommerce systems, data visualization tools (Tableau, Looker), and cloud certifications.
If you meet the above criteria and are interested, please share your updated CV along with the following details:
Total Experience:
Current CTC:
Expected CTC:
Current Location:
Preferred Location:
Notice Period / Last Working Day (if serving notice):
⚠️ Kindly share your details only if you have not applied recently or are not currently in the interview process for any open roles at Xebia.
Looking forward to your response!

We are looking for a skilled Data Engineer to design, build, and maintain robust data pipelines and infrastructure. You will play a pivotal role in optimizing data flow, ensuring scalability, and enabling seamless access to structured/unstructured data across the organization. This role requires technical expertise in Python, SQL, ETL/ELT frameworks, and cloud data warehouses, along with strong collaboration skills to partner with cross-functional teams.
Company: BigThinkCode Technologies
URL:
Location: Chennai (Work from office / Hybrid)
Experience: 4 - 6 years
Key Responsibilities:
- Design, develop, and maintain scalable ETL/ELT pipelines to process structured and unstructured data.
- Optimize and manage SQL queries for performance and efficiency in large-scale datasets.
- Experience working with data warehouse solutions (e.g., Redshift, BigQuery, Snowflake) for analytics and reporting.
- Collaborate with data scientists, analysts, and business stakeholders to translate requirements into technical solutions.
- Experience in Implementing solutions for streaming data (e.g., Apache Kafka, AWS Kinesis) is preferred but not mandatory.
- Ensure data quality, governance, and security across pipelines and storage systems.
- Document architectures, processes, and workflows for clarity and reproducibility.
Required Technical Skills:
- Proficiency in Python for scripting, automation, and pipeline development.
- Expertise in SQL (complex queries, optimization, and database design).
- Hands-on experience with ETL/ELT tools (e.g., Apache Airflow, dbt, AWS Glue); a brief Airflow sketch follows this list.
- Experience working with structured data (RDBMS) and unstructured data (JSON, Parquet, Avro).
- Familiarity with cloud-based data warehouses (Redshift, BigQuery, Snowflake).
- Knowledge of version control systems (e.g., Git) and CI/CD practices.
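For illustration, a minimal Airflow DAG sketch of the orchestration style referenced above. The DAG id, schedule, and the extract/load callables are placeholders, not details from this posting; the `schedule` argument assumes Airflow 2.4+.

```python
# Hypothetical sketch of a small Airflow DAG orchestrating an ELT hop.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # In a real pipeline this would pull from a source API or database.
    print("extracting orders for", context["ds"])


def load_to_warehouse(**context):
    # In a real pipeline this would COPY/MERGE into Redshift, BigQuery, or Snowflake.
    print("loading orders for", context["ds"])


with DAG(
    dag_id="orders_elt",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

    extract >> load
```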
Preferred Qualifications:
- Experience with streaming data technologies (e.g., Kafka, Kinesis, Spark Streaming).
- Exposure to cloud platforms (AWS, GCP, Azure) and their data services.
- Understanding of data modelling (dimensional, star schema) and optimization techniques.
Soft Skills:
- Team player with a collaborative mindset and ability to mentor junior engineers.
- Strong stakeholder management skills to align technical solutions with business goals.
- Excellent communication skills to explain technical concepts to non-technical audiences.
- Proactive problem-solving and adaptability in fast-paced environments.
If interested, apply / reply by sharing your updated profile to connect and discuss.
Regards
Job Title : Senior AWS Data Engineer
Experience : 5+ Years
Location : Gurugram
Employment Type : Full-Time
Job Summary :
Seeking a Senior AWS Data Engineer with expertise in AWS to design, build, and optimize scalable data pipelines and data architectures. The ideal candidate will have experience in ETL/ELT, data warehousing, and big data technologies.
Key Responsibilities :
- Build and optimize data pipelines using AWS (Glue, EMR, Redshift, S3, etc.); a brief Glue sketch follows this list.
- Maintain data lakes & warehouses for analytics.
- Ensure data integrity through quality checks.
- Collaborate with data scientists & engineers to deliver solutions.
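For illustration only, a minimal AWS Glue (PySpark) sketch of moving data from a raw S3 zone to curated Parquet. The catalog database, table, bucket names, and columns are illustrative assumptions, not details from this posting.

```python
# Hypothetical sketch of a Glue job: read from the Glue Data Catalog, clean, write curated Parquet.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue = GlueContext(SparkContext.getOrCreate())
spark = glue.spark_session

# Read a table previously registered in the Glue Data Catalog (e.g., by a crawler).
orders = glue.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
).toDF()

# Light transformation, then write curated, partitioned Parquet back to S3.
curated = orders.dropDuplicates(["order_id"]).filter("amount > 0")
(curated.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://my-curated-bucket/orders/"))
```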
Qualifications :
- 7+ Years in Data Engineering.
- Expertise in AWS services, SQL, Python, Spark, Kafka.
- Experience with CI/CD, DevOps practices.
- Strong problem-solving skills.
Preferred Skills :
- Experience with Snowflake, Databricks.
- Knowledge of BI tools (Tableau, Power BI).
- Healthcare/Insurance domain experience is a plus.

Lead Data Scientist
Job Description
As a Lead Data Scientist, you will be responsible for identifying, scoping, and delivering data science projects with a strong emphasis on causal inference.
The ability to take large, scientifically complex projects and break them down into manageable hypotheses and experiments to inform functional specifications, and then deliver features in a successful and timely manner, is expected. Maturity, high judgment, negotiation skills, and the ability to influence are essential to success in this role.
We will rely on your experience in successfully delivering projects that significantly, positively, and measurably affect the business. You should also have experience in large-scale data science projects.
What You'll Do
• Work closely with the Tech team to convert proof-of-concept (POC) models into fully scalable products.
• Actively identify existing and new features which could benefit from predictive modelling and productionization of predictive models
• Actively identify and resolve strategic issues that may impair the team’s ability to meet strategic, scientific, and technical goals
• Contribute to research and development of AI/ML techniques and technology that fuels the business innovation and growth of -
• Work closely with engineers to deploy models in production both in real time and in batch process and systematically track model performance
• Encourage team building, best practices sharing especially with more junior team members
Requirements & Skills
• Strong problem solving skills with an emphasis on product development
• Master’s or PhD degree in Statistics, Mathematics, Computer Science, or another quantitative field
• More than 8 years of experience in practicing machine learning and data science in business or a related field, with a focus on statistical analysis
• Strong background in statistics and causal inference (a brief sketch follows this list).
• Proficient in statistical programming languages such as Python or R, and data visualization tools (e.g., Tableau, Power BI).
• Strong experience with machine learning algorithms and statistical modeling techniques
• Strong computing/programming skills; Proficient in Python, Spark, SQL, Linux shell script.
• Proven ability to work with large datasets and familiarity with big data technologies (e.g., Hadoop, Spark, SQL) is a plus.
• Experience with end-to-end feature development (owning feature definition, roadmap development, and experimentation).
• Effective leadership and communication skills, with the ability to inspire and guide a team.
• Excellent problem solving and critical thinking capabilities.
• Strong experience in Cloud technology.
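For illustration only, a minimal sketch of one common causal-inference workflow: estimating an average treatment effect by regression adjustment. The data frame, treatment flag, outcome, and covariate are illustrative assumptions; real causal work would also address confounding, overlap, and robustness checks.

```python
# Hypothetical sketch: average treatment effect via covariate-adjusted OLS with statsmodels.
import pandas as pd
import statsmodels.formula.api as smf

# Assume one row per user with a binary `treated` flag, an `outcome` metric,
# and a pre-treatment covariate.
df = pd.DataFrame({
    "treated": [0, 1, 0, 1, 1, 0, 1, 0],
    "outcome": [2.1, 3.4, 1.9, 3.8, 3.1, 2.3, 3.6, 2.0],
    "baseline_spend": [10, 12, 9, 14, 11, 10, 13, 9],
})

# The coefficient on `treated` is the adjusted treatment-effect estimate.
model = smf.ols("outcome ~ treated + baseline_spend", data=df).fit()
print(model.params["treated"])
print(model.conf_int().loc["treated"])
```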


Associate Manager - Engineering
Location: Bangalore, Karnataka
Type: Full-Time
Reports to: VP-Engineering
Job Purpose:
As an Engineering Manager, you will be responsible for developing, leading, and managing a team of software engineers to deliver high-quality software products and solutions. This role involves a combination of technical expertise, project management, and people management to drive the software development process and achieve organizational goals.
Key Responsibilities:
Leadership and Team Development:
• Lead, mentor, and coach a team of software engineers, fostering a collaborative and innovative work environment.
Goal Setting and Feedback:
• Set clear goals and expectations for team members, and provide regular feedback and performance evaluations.
Project Management:
• Oversee the planning, execution, and successful delivery of software development projects.
• Define project scope, objectives, and timelines. Allocate resources effectively and ensure project success.
Technical Expertise and Guidance:
• Provide technical expertise and guidance to the development team, helping them make informed decisions and solve complex technical challenges.
Industry Best Practices:
• Stay updated on industry best practices and emerging technologies.
Process Improvement:
• Implement and improve software development processes, methodologies, and standards, such as Agile or Scrum, to enhance productivity and software quality.
Quality Assurance:
• Ensure code reviews, testing, and quality assurance processes are in place and followed.
Collaboration and Communication:
• Collaborate with product managers, business stakeholders, and other teams to define project requirements, prioritize tasks, and provide regular updates on project status.
• Act as a bridge between technical and non-technical stakeholders.
Resource Management:
• Manage resources effectively, including budget allocation, staffing, and workload distribution.
• Identify and address resource constraints or skill gaps.
Risk Management:
• Identify project risks and develop mitigation plans to address potential challenges or roadblocks.
• Proactively address issues that may impact project timelines or quality.
Quality Standards:
• Ensure that software products meet high-quality standards by implementing testing and quality control processes.
• Address and resolve software defects and issues
Qualifications & Experience:
• Bachelor's or Master's degree in Computer Science, Software Engineering, or related field.
• 12+ years of proven experience in leading software product development, software architecture, and design for complex systems.
• Strong hands-on experience in SDLC, Agile, coding, and design.
• Strong understanding of programming languages, frameworks, and tools relevant to software development.
• Proven experience in software development with a strong technical background.
• Prior experience in a leadership or management role, with a track record of successfully leading software development teams.
Skills & Attributes:
Technical Skills:
• Proficiency in software development methodologies (AGILE, SDLC), tools, and best practices.
• Prior working experience in technologies like Python API development (frameworks including FastAPI and REST APIs); a minimal sketch follows this skills list.
• Prior working experience in UI technologies like ReactJS, Redux, HTML5/CSS, and JavaScript.
• Good working experience in RDBMS like PostgreSQL; hands-on experience in SQL is a must.
• Nice to have experience in technologies like Spark, Hive
• Experience in building enterprise scale SaaS software products using Microservices architecture and cloud platform like AWS and Azure
• Experience in designing and implementing scalable, distributed systems is preferred.
• Proficiency in ensuring code quality, unit testing, and adherence to coding standards.
• Familiarity with AI/ML concepts and their application is advantageous.
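For illustration only, a minimal FastAPI sketch of the Python API style referenced above. Endpoint names and the data model are placeholders, not product specifics.

```python
# Hypothetical sketch of a minimal FastAPI service.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="orders-service")


class Order(BaseModel):
    order_id: str
    amount: float


@app.get("/health")
def health() -> dict:
    return {"status": "ok"}


@app.post("/orders")
def create_order(order: Order) -> dict:
    # A real service would persist to PostgreSQL and emit events; here we simply echo.
    return {"received": order.order_id, "amount": order.amount}
```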
Soft Skills:
• Communication: Ability to articulate complex technical concepts to non-technical stakeholders, as well as to developers and other technical staff.
• Teamwork: Collaborate effectively with various teams (development, QA, product, etc.).
• Problem Solving: Address unforeseen issues and come up with innovative solutions.
• Decision Making: Make informed decisions that consider technical feasibility, business needs, and potential risks.
• Leadership: Provide guidance, mentorship, and direction to engineering teams.
• Time Management: Prioritize tasks effectively to meet deadlines and product milestones.
• Continuous Learning: Stay updated with the latest in technology trends, methodologies, and best practices.
Business-Oriented Skills:
• Product Mindset: Understand the business objectives, user needs, and how technology can align with and fulfill those needs.
• Stakeholder Management: Collaborate and communicate effectively with stakeholders to gather requirements, provide updates, and gather feedback.
• Project Management: Familiarity with project management methodologies (like Agile or Waterfall) to ensure timely product delivery.
• Strategic Thinking: Ability to align technological strategies with business goals and foresee potential technological challenges or opportunities.
• Cost Management: Understand the financial aspects, such as the costs of certain technological solutions, ROI, and TCO.
Position : Software Engineer (Java Backend Engineer)
Experience : 4+ Years
📍 Location : Bangalore, India (Hybrid)
Mandatory Skills : Java 8+ (Advanced Features), Spring Boot, Apache Spark (Spark Streaming), SQL & Cosmos DB, Git, Maven, CI/CD (Jenkins, GitHub), Azure Cloud, Agile Scrum.
About the Role :
We are seeking a highly skilled Backend Engineer with expertise in Java, Spark, and microservices architecture to join our dynamic team. The ideal candidate will have a strong background in object-oriented programming, experience with Spark Streaming, and a deep understanding of distributed systems and cloud technologies.
Key Responsibilities :
- Design, develop, and maintain highly scalable microservices and optimized RESTful APIs using Spring Boot and Java 8+.
- Implement and optimize Spark Streaming applications for real-time data processing.
- Utilize advanced Java 8 features, including:
- Functional interfaces & Lambda expressions
- Streams and Parallel Streams
- CompletableFuture & Concurrency API improvements
- Enhanced Collections APIs
- Work with relational (SQL) and NoSQL (Cosmos DB) databases, ensuring efficient data modeling and retrieval.
- Develop and manage CI/CD pipelines using Jenkins, GitHub, and related automation tools.
- Collaborate with cross-functional teams, including Product, Business, and Automation, to deliver end-to-end product features.
- Ensure adherence to Agile Scrum practices and participate in code reviews to maintain high-quality standards.
- Deploy and manage applications in Azure Cloud environments.
Minimum Qualifications:
- BS/MS in Computer Science or a related field.
- 4+ Years of experience developing backend applications with Spring Boot and Java 8+.
- 3+ Years of hands-on experience with Git for version control.
- Strong understanding of software design patterns and distributed computing principles.
- Experience with Maven for building and deploying artifacts.
- Proven ability to work in Agile Scrum environments with a collaborative team mindset.
- Prior experience with Azure Cloud Technologies.
Job Title : Tech Lead - Data Engineering (AWS, 7+ Years)
Location : Gurugram
Employment Type : Full-Time
Job Summary :
Seeking a Tech Lead - Data Engineering with expertise in AWS to design, build, and optimize scalable data pipelines and data architectures. The ideal candidate will have experience in ETL/ELT, data warehousing, and big data technologies.
Key Responsibilities :
- Build and optimize data pipelines using AWS (Glue, EMR, Redshift, S3, etc.).
- Maintain data lakes & warehouses for analytics.
- Ensure data integrity through quality checks.
- Collaborate with data scientists & engineers to deliver solutions.
Qualifications :
- 7+ Years in Data Engineering.
- Expertise in AWS services, SQL, Python, Spark, Kafka.
- Experience with CI/CD, DevOps practices.
- Strong problem-solving skills.
Preferred Skills :
- Experience with Snowflake, Databricks.
- Knowledge of BI tools (Tableau, Power BI).
- Healthcare/Insurance domain experience is a plus.
Dear Candidate,
We are urgently hiring QA Automation Engineers and Test Leads at Hyderabad and Bangalore
Exp: 6-10 yrs
Locations: Hyderabad, Bangalore
JD:
We are hiring Automation Testers with 6-10 years of automation testing experience using QA automation tools such as Java, UFT, Selenium, API testing, ETL, and others.
Must Haves:
· Experience in Financial Domain is a must
· Extensive hands-on experience designing, implementing, and maintaining automation frameworks using Java, UFT, ETL, Selenium, and related automation concepts.
· Experience with AWS concepts and framework design/testing.
· Experience in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
· Experience with Databricks, Python, Spark, Hive, Airflow, etc.
· Experience in validating and analyzing Kubernetes log files.
· API testing experience
· Backend testing skills with ability to write SQL queries in Databricks and in Oracle databases
· Experience in working with globally distributed Agile project teams
· Ability to work in a fast-paced, globally structured and team-based environment, as well as independently
· Experience in test management tools like Jira
· Good written and verbal communication skills
Good To have:
- Business and finance knowledge desirable
Best Regards,
Minakshi Soni
Executive - Talent Acquisition (L2)
Worldwide Locations: USA | HK | IN


We are looking for computer science/engineering final-year students or fresh graduates who have a solid understanding of computer science fundamentals (algorithms, data structures, object-oriented programming) and strong Java programming skills. You will get to work on machine learning algorithms as applied to online advertising or do data analytics. You will learn how to collaborate in small, agile teams, do rapid development and testing, and get to taste the invigorating feel of a start-up company.
Experience
None required
Required Skills
- Solid foundation in computer science, with strong competencies in data structures, algorithms, and software design
- Java / Python programming
- UI/UX: HTML5, CSS3, JavaScript
- MySQL, relational databases
- MVC frameworks, ReactJS
Optional Skills
- Familiarity with online advertising, web technologies
- Familiarity with Hadoop, Spark, Scala
Education
UG - B.Tech/B.E. - Computers; PG - M.Tech - Computers
Responsibilities:
· Analyze complex data sets to answer specific questions using MMIT’s market access data (MMIT) and Norstella claims data, third-party claims data (IQVIA LAAD, Symphony SHA). Applicant must have experience working with the aforementioned data sets exclusively.
· Deliver consultative services to clients related to MMIT RWD sets
· Produce complex analytical reports using data visualization tools such as Power BI or Tableau
· Define customized technical specifications to surface MMIT RWD in MMIT tools.
· Execute work in a timely fashion with high accuracy, while managing various competing priorities; Perform thorough troubleshooting and execute QA; Communicate with internal teams to obtain required data
· Ensure adherence to documentation requirements, process workflows, timelines, and escalation protocols
· And other duties as assigned.
Requirements:
· Bachelor’s Degree or relevant experience required
· 2-5 yrs. of professional experience in RWD analytics using SQL
· Fundamental understanding of Pharma and Market access space
· Strong analysis skills and proficiency with tools such as Tableau or PowerBI
· Excellent written and verbal communication skills.
· Analytical, critical thinking and creative problem-solving skills.
· Relationship building skills.
· Solid organizational skills including attention to detail and multitasking skills.
· Excellent time management and prioritization skills.

Role Objective:
The Big Data Engineer will be responsible for expanding and optimizing our data and database architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives, and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.
Roles & Responsibilities:
- Sound knowledge of Spark architecture, distributed computing, and Spark Streaming.
- Proficient in Spark, including RDD and DataFrame core functions, troubleshooting, and performance tuning (a brief sketch follows this list).
- SFDC (data modelling) experience would be given preference.
- Good understanding of object-oriented concepts and hands-on experience with Scala, with excellent programming logic and technique.
- Good grasp of functional programming and OOP concepts in Scala.
- Good experience in SQL – should be able to write complex queries.
- Manage a team of Associates and Senior Associates and ensure utilization is maintained across the project.
- Able to mentor new members during onboarding to the project.
- Understand client requirements and be able to design, develop from scratch, and deliver.
- AWS cloud experience would be preferable.
- Design, build, and operationalize large-scale enterprise data solutions and applications using one or more AWS data and analytics services - DynamoDB, Redshift, Kinesis, Lambda, S3, etc. (preferred)
- Hands-on experience utilizing AWS management tools (CloudWatch, CloudTrail) to proactively monitor large and complex deployments (preferred)
- Experience in analyzing, re-architecting, and re-platforming on-premises data warehouses to data platforms on AWS (preferred)
- Lead client calls to flag any delays, blockers, and escalations, and collate all requirements.
- Manage project timing, client expectations, and deadlines.
- Should have played project and team management roles.
- Facilitate meetings within the team on a regular basis.
- Understand business requirements, analyze different approaches, and plan deliverables and milestones for the project.
- Optimization, maintenance, and support of pipelines.
- Strong analytical and logical skills.
- Ability to comfortably tackle new challenges and learn.
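For illustration only, a short sketch contrasting the RDD and DataFrame core functions (with caching) referenced in the responsibilities above. The role emphasizes Scala; this sketch uses PySpark, and the paths and columns are illustrative assumptions.

```python
# Hypothetical sketch: DataFrame vs RDD usage with caching for a reused aggregate.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("rdd-vs-dataframe").getOrCreate()

# DataFrame API: declarative, optimized by Catalyst.
df = spark.read.parquet("/data/events")
daily = (
    df.filter(F.col("status") == "ok")
      .groupBy("event_date")
      .agg(F.count("*").alias("events"))
      .cache()                          # reused twice below, so keep it in memory
)
daily.show()
daily.orderBy(F.desc("events")).show(10)

# RDD API: lower-level transformations when row-by-row control is needed.
pairs = df.rdd.map(lambda row: (row["event_date"], 1)).reduceByKey(lambda a, b: a + b)
print(pairs.take(5))
```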

External Skills And Expertise
Must have Skills:
- Scala
- Spark
- SQL (Intermediate to advanced level)
- Spark Streaming
- AWS preferable/Any cloud
- Kafka /Kinesis/Any streaming services
- Object-Oriented Programming
- Hive, ETL/ELT design experience
- CI/CD experience (ETL pipeline deployment)
Good to Have Skills:
- AWS Certification
- Git/similar version control tool
- Knowledge in CI/CD, Microservices
Secondary Skills: Streaming, Archiving, AWS / Azure / Cloud
Role:
· Should have strong programming and support experience in Java, J2EE technologies
· Should have good experience in Core Java, JSP, Servlets, JDBC
· Good exposure to Hadoop development (HDFS, MapReduce, Hive, HBase, Spark)
· Should have 2+ years of Java experience and 1+ years of experience in Hadoop
· Should possess good communication skills
Proficient in software development, automated testing, and Big Data technologies. Designs, codes, tests, corrects, and documents large and/or complex programs and program modifications from supplied specifications using agreed standards and tools, to achieve a well-engineered result. Proficient and hands-on in Data Warehousing.
Experience with Agile development, Continuous Integration, and Continuous Delivery. Ability to effectively interpret technical and business objectives and provide solutions. Strong communication skills, with the ability to articulate technical solutions effectively across a diverse group of stakeholders.
Needs to be a fast learner, willing to adapt to the evolving needs of the developer community.
Thanks & Regards
snehalata verma
IT Recruiter --HrBizHub

- Responsible for designing, storing, processing, and maintaining large-scale data and related infrastructure.
- Can drive multiple projects from both an operational and technical standpoint.
- Ideate and build PoVs or PoCs for new products that can help drive more business.
- Responsible for defining, designing, and implementing data engineering best practices, strategies, and solutions.
- Is an Architect who can guide the customers, team, and overall organization on tools, technologies, and best practices around data engineering.
- Lead architecture discussions, align with business needs, security, and best practices.
- Has strong conceptual understanding of Data Warehousing and ETL, Data Governance and Security, Cloud Computing, and Batch & Real Time data processing
- Has strong execution knowledge of Data Modeling, Databases in general (SQL and NoSQL), software development lifecycle and practices, unit testing, functional programming, etc.
- Understanding of Medallion architecture pattern
- Has worked on at least one cloud platform.
- Has worked as a data architect and executed multiple end-to-end data engineering projects.
- Has extensive knowledge of different data architecture designs and data modelling concepts.
- Manages conversation with the client stakeholders to understand the requirement and translate it into technical outcomes.
Required Tech Stack
- Strong proficiency in SQL
- Experience working on any of the three major cloud platforms i.e., AWS/Azure/GCP
- Working knowledge of an ETL and/or orchestration tools like IICS, Talend, Matillion, Airflow, Azure Data Factory, AWS Glue, GCP Composer, etc.
- Working knowledge of one or more OLTP databases (Postgres, MySQL, SQL Server, etc.)
- Working knowledge of one or more Data Warehouses like Snowflake, Redshift, Azure Synapse, Hive, BigQuery, etc.
- Proficient in at least one programming language used in data engineering, such as Python (or Scala/Rust/Java)
- Has strong execution knowledge of Data Modeling (star schema, snowflake schema, fact vs dimension tables); a brief sketch follows this list.
- Proficient in Spark and related applications like Databricks, GCP DataProc, AWS Glue, EMR, etc.
- Has worked on Kafka and real-time streaming.
- Has strong execution knowledge of data architecture design patterns (lambda vs kappa architecture, data harmonization, customer data platforms, etc.)
- Has worked on code and SQL query optimization.
- Strong knowledge of version control systems like Git to manage source code repositories and designing CI/CD pipelines for continuous delivery.
- Has worked on data and networking security (RBAC, secret management, key vaults, vnets, subnets, certificates)
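For illustration, a minimal PySpark sketch of a star-schema query: one fact table joined to two dimension tables. Table and column names are illustrative assumptions for the modeling concepts listed above, not details from this posting.

```python
# Hypothetical sketch: star-schema query over fact and dimension tables with Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.read.parquet("/warehouse/fact_sales").createOrReplaceTempView("fact_sales")
spark.read.parquet("/warehouse/dim_customer").createOrReplaceTempView("dim_customer")
spark.read.parquet("/warehouse/dim_date").createOrReplaceTempView("dim_date")

# Fact rows carry foreign keys and measures; dimensions carry descriptive attributes.
spark.sql("""
    SELECT d.calendar_month,
           c.customer_segment,
           SUM(f.sales_amount) AS revenue
    FROM fact_sales f
    JOIN dim_customer c ON f.customer_key = c.customer_key
    JOIN dim_date     d ON f.date_key     = d.date_key
    GROUP BY d.calendar_month, c.customer_segment
""").show()
```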
The Sr. Analytics Engineer would provide technical expertise in needs identification, data modeling, data movement, and transformation mapping (source to target), automation and testing strategies, translating business needs into technical solutions with adherence to established data guidelines and approaches from a business unit or project perspective.
Understands and leverages best-fit technologies (e.g., traditional star schema structures, cloud, Hadoop, NoSQL, etc.) and approaches to address business and environmental challenges.
Provides data understanding and coordinates data-related activities with other data management groups such as master data management, data governance, and metadata management.
Actively participates with other consultants in problem-solving and approach development.
Responsibilities :
Provide a consultative approach with business users, asking questions to understand the business need and deriving the data flow, conceptual, logical, and physical data models based on those needs.
Perform data analysis to validate data models and to confirm the ability to meet business needs.
Assist with and support setting the data architecture direction, ensuring data architecture deliverables are developed, ensuring compliance to standards and guidelines, implementing the data architecture, and supporting technical developers at a project or business unit level.
Coordinate and consult with the Data Architect, project manager, client business staff, client technical staff and project developers in data architecture best practices and anything else that is data related at the project or business unit levels.
Work closely with Business Analysts and Solution Architects to design the data model satisfying the business needs and adhering to Enterprise Architecture.
Coordinate with Data Architects, Program Managers and participate in recurring meetings.
Help and mentor team members to understand the data model and subject areas.
Ensure that the team adheres to best practices and guidelines.
Requirements :
- At least 3 years of strong working knowledge of Spark, Java/Scala/PySpark, Kafka, Git, Unix/Linux, and ETL pipeline design.
- Experience with Spark optimization, tuning, and resource allocation (a brief sketch follows this list).
- Excellent understanding of in-memory distributed computing frameworks like Spark, including parameter tuning and writing optimized workflow sequences.
- Experience with relational databases (e.g., PostgreSQL, MySQL) and NoSQL or analytical databases (e.g., Redshift, BigQuery, Cassandra, etc.).
- Familiarity with Docker, Kubernetes, Azure Data Lake/Blob storage, AWS S3, Google Cloud storage, etc.
- Have a deep understanding of the various stacks and components of the Big Data ecosystem.
- Hands-on experience with Python is a huge plus
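For illustration only, a minimal sketch of the Spark tuning knobs referenced above (shuffle parallelism, adaptive execution, executor sizing). The values are illustrative; real settings depend on data volume and cluster size.

```python
# Hypothetical sketch: building a SparkSession with common performance-tuning configs.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuned-etl")
    .config("spark.sql.shuffle.partitions", "400")      # match shuffle parallelism to data size
    .config("spark.sql.adaptive.enabled", "true")        # let AQE coalesce and split skewed partitions
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "4")
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

df = spark.read.parquet("/data/events")
# explain() exposes the physical plan, a first stop when tuning joins and shuffles.
df.groupBy("event_date").count().explain()
```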

TVARIT GmbH develops and delivers solutions in the field of artificial intelligence (AI) for the manufacturing, automotive, and process industries. With its software products, TVARIT makes it possible for its customers to make intelligent and well-founded decisions, e.g., in forward-looking maintenance, increasing OEE, and predictive quality. We have renowned reference customers, competent technology, a good research team from renowned universities, and the award of a renowned AI prize (e.g., EU Horizon 2020), which makes TVARIT one of the most innovative AI companies in Germany and Europe.
We are looking for a self-motivated person with a positive "can-do" attitude and excellent oral and written communication skills in English.
We are seeking a skilled and motivated Senior Data Engineer from the manufacturing industry with over four years of experience to join our team. The Senior Data Engineer will oversee the department's data infrastructure, including developing a data model, integrating large amounts of data from different systems, building and enhancing a data lakehouse and the subsequent analytics environment, and writing scripts to facilitate data analysis. The ideal candidate will have a strong foundation in ETL pipelines and Python, with additional experience in Azure and Terraform being a plus. This role requires a proactive individual who can contribute to our data infrastructure and support our analytics and data science initiatives.
Skills Required:
- Experience in the manufacturing industry (metal industry is a plus)
- 4+ years of experience as a Data Engineer
- Experience in data cleaning & structuring and data manipulation
- Architect and optimize complex data pipelines, leading the design and implementation of scalable data infrastructure, and ensuring data quality and reliability at scale
- ETL Pipelines: Proven experience in designing, building, and maintaining ETL pipelines (a minimal Python sketch follows this list).
- Python: Strong proficiency in Python programming for data manipulation, transformation, and automation.
- Experience in SQL and data structures
- Knowledge of big data technologies such as Spark, Flink, Apache Hadoop, and NoSQL databases.
- Knowledge of cloud technologies (at least one) such as AWS, Azure, and Google Cloud Platform.
- Proficient in data management and data governance
- Strong analytical experience & skills that can extract actionable insights from raw data to help improve the business.
- Strong analytical and problem-solving skills.
- Excellent communication and teamwork abilities.
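For illustration only, a small Python ETL sketch (extract, clean/structure, load) in the spirit of the pipeline and data-cleaning skills above. File paths, columns, and thresholds are illustrative assumptions, not TVARIT specifics.

```python
# Hypothetical sketch: extract raw sensor readings, clean and structure them, load to Parquet.
import pandas as pd

# Extract: raw readings exported from a shop-floor system.
raw = pd.read_csv("raw/sensor_readings.csv")

# Transform: enforce types, drop duplicates and implausible values.
clean = (
    raw.drop_duplicates(subset=["machine_id", "timestamp"])
       .assign(timestamp=lambda d: pd.to_datetime(d["timestamp"], errors="coerce"))
       .dropna(subset=["timestamp", "temperature"])
       .query("0 <= temperature <= 300")
       .sort_values(["machine_id", "timestamp"])
)

# Derive a simple rolling feature per machine for downstream analytics.
clean["temp_rolling_mean"] = (
    clean.groupby("machine_id")["temperature"]
         .transform(lambda s: s.rolling(window=10, min_periods=1).mean())
)

# Load: write an analysis-ready Parquet file into the lakehouse landing area.
clean.to_parquet("silver/sensor_readings.parquet", index=False)
```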
Nice To Have:
- Azure: Experience with Azure data services (e.g., Azure Data Factory, Azure Databricks, Azure SQL Database).
- Terraform: Knowledge of Terraform for infrastructure as code (IaC) to manage cloud resources.
- Bachelor’s degree in computer science, Information Technology, Engineering, or a related field from top-tier Indian Institutes of Information Technology (IIITs).
Benefits And Perks:
- A culture that fosters innovation, creativity, continuous learning, and resilience
- Progressive leave policy promoting work-life balance
- Mentorship opportunities with highly qualified internal resources and industry-driven programs
- Multicultural peer groups and supportive workplace policies
- Annual workcation program allowing you to work from various scenic locations
- Experience the unique environment of a dynamic start-up
Why should you join TVARIT ?
Working at TVARIT, a deep-tech German IT startup, offers a unique blend of innovation, collaboration, and growth opportunities. We seek individuals eager to adapt and thrive in a rapidly evolving environment.
If this opportunity excites you and aligns with your career aspirations, we encourage you to apply today!
Nielsen, a global company specialising in audience measurement and analytics, is currently seeking a proficient leader in data engineering to join their team in Bangalore, Gurgaon, or Mumbai.
This is a manager-of-managers role that involves managing multiple scrum teams and overseeing an advanced data platform that analyses audience consumption patterns across various channels like OTT, TV, Radio, and Social Media worldwide. You will be responsible for building and supervising a top-performing data engineering team that delivers data for targeted campaigns. Moreover, you will work with AWS services (S3, Lambda, Kinesis) and other data engineering technologies such as Spark, Scala/Python, Kafka, etc. There may also be opportunities to establish deep integrations with OTT platforms like Netflix, Prime Video, and others.

Primary Skills
DynamoDB, Java, Kafka, Spark, Amazon Redshift, AWS Lake Formation, AWS Glue, Python
Skills:
Good work experience showing growth as a Data Engineer.
Hands On programming experience
Implementation Experience on Kafka, Kinesis, Spark, AWS Glue, AWS Lake Formation.
Excellent knowledge of: Python, Scala/Java, Spark, AWS (Lambda, Step Functions, DynamoDB, EMR), Terraform, UI (Angular), Git, Maven
Experience of performance optimization in Batch and Real time processing applications
Expertise in Data Governance and Data Security Implementation
Good hands-on design and programming skills for building reusable tools and products. Experience developing in AWS or similar cloud platforms. Preferred: ECS, EKS, S3, EMR, DynamoDB, Aurora, Redshift, QuickSight, or similar.
Familiarity with systems with very high volume of transactions, micro service design, or data processing pipelines (Spark).
Knowledge and hands-on experience with serverless technologies such as Lambda, MSK, MWAA, and Kinesis Analytics is a plus.
Expertise in practices like Agile, Peer reviews, Continuous Integration
Roles and responsibilities:
Determining project requirements and developing work schedules for the team.
Delegating tasks and achieving daily, weekly, and monthly goals.
Responsible for designing, building, testing, and deploying the software releases.
Salary: 25-40 LPA
Must have skills
3 to 6 years
Data Science
SQL, Excel, BigQuery - mandatory, 3+ years
Python/ML, Hadoop, Spark - 2+ years
Requirements
• 3+ years prior experience as a data analyst
• Detail-oriented, with structured thinking and an analytical mindset.
• Proven analytic skills, including data analysis and data validation.
• Technical writing experience in relevant areas, including queries, reports, and presentations.
• Strong SQL and Excel skills with the ability to learn other analytic tools
• Good communication skills (being precise and clear)
• Good to have prior knowledge of python and ML algorithms
Location: Pune
Required Skills : Scala, Python, Data Engineering, AWS, Cassandra/AstraDB, Athena, EMR, Spark/Snowflake
Job Description:
We are seeking a talented Machine Learning Engineer with expertise in software engineering to join our team. As a Machine Learning Engineer, your primary responsibility will be to develop machine learning (ML) solutions that focus on technology process improvements. Specifically, you will be working on projects involving ML and Generative AI solutions for technology and data management efficiencies, such as optimal cloud computing, knowledge bots, software code assistants, automatic data management, etc.
Responsibilities:
- Collaborate with cross-functional teams to identify opportunities for technology process improvements that can be solved using machine learning and generative AI.
- Define and build innovative ML and Generative AI systems, such as AI assistants for varied SDLC tasks, and improve data and infrastructure management.
- Design and develop ML Engineering Solutions, generative AI Applications & Fine-Tuning Large Language Models (LLMs) for above ensuring scalability, efficiency, and maintainability of such solutions.
- Implement prompt engineering techniques to fine-tune and enhance LLMs for better performance and application-specific needs (a brief sketch follows this list).
- Stay abreast of the latest advancements in the field of Generative AI and actively contribute to the research and development of new ML & Generative AI Solutions.
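For illustration only, a minimal sketch of prompt-driven text generation with an open-source model via Hugging Face transformers. The model and prompt are illustrative stand-ins; production assistants would add retrieval, guardrails, fine-tuning, and evaluation around a call like this.

```python
# Hypothetical sketch: a simple prompt sent to a local text-generation model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "You are a code assistant. Summarize what the following function does:\n"
    "def add(a, b):\n    return a + b\n\nSummary:"
)

result = generator(prompt, max_new_tokens=40, do_sample=False)
print(result[0]["generated_text"])
```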
Requirements:
- A Master's or Ph.D. degree in Computer Science, Statistics, Data Science, or a related field.
- Proven experience working as a Software Engineer, with a focus on ML Engineering and exposure to Generative AI applications such as ChatGPT.
- Strong proficiency in programming languages and platforms such as Java, Scala, Python, Google Cloud, BigQuery, Hadoop, and Spark.
- Solid knowledge of software engineering best practices, including version control systems (e.g., Git), code reviews, and testing methodologies.
- Familiarity with large language models (LLMs), prompt engineering techniques, vector DBs, embeddings, and various fine-tuning techniques.
- Strong communication skills to effectively collaborate and present findings to both technical and non-technical stakeholders.
- Proven ability to adapt and learn new technologies and frameworks quickly.
- A proactive mindset with a passion for continuous learning and research in the field of Generative AI.
If you are a skilled and innovative Data Scientist with a passion for Generative AI, and have a desire to contribute to technology process improvements, we would love to hear from you. Join our team and help shape the future of our AI Driven Technology Solutions.
Publicis Sapient Overview:
As a Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.
Job Summary:
As a Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.
The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration, wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.
Role & Responsibilities:
Your role is focused on Design, Development and delivery of solutions involving:
• Data Integration, Processing & Governance
• Data Storage and Computation Frameworks, Performance Optimizations
• Analytics & Visualizations
• Infrastructure & Cloud Computing
• Data Management Platforms
• Implement scalable architectural models for data processing and storage
• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode
• Build functionality for data analytics, search and aggregation
Experience Guidelines:
Mandatory Experience and Competencies:
# Competency
1. Overall 5+ years of IT experience with 3+ years in Data-related technologies
2. Minimum 2.5 years of experience in Big Data technologies and working exposure on at least one cloud platform to related data services (AWS / Azure / GCP)
3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines (a minimal pipeline sketch follows this list)
4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable
5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery etc.
6. Well-versed, with working knowledge of data platform-related services on at least one cloud platform, IAM and data security
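By way of illustration of the end-to-end pipeline competency above, the following is a minimal PySpark batch sketch, assuming a configured Hive metastore and HDFS; the paths, columns, and table name are hypothetical.

```python
# Minimal PySpark batch pipeline sketch: ingest raw CSV, transform, and publish a
# partitioned Parquet table. Paths, schema, and table name are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("orders-daily-batch")
         .enableHiveSupport()          # assumes a Hive metastore is configured
         .getOrCreate())

raw = (spark.read
       .option("header", True)
       .csv("hdfs:///landing/orders/"))          # hypothetical landing zone

cleaned = (raw
           .withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("amount", F.col("amount").cast("double"))
           .dropDuplicates(["order_id"])
           .withColumn("order_date", F.to_date("order_ts")))

(cleaned.write
 .mode("overwrite")
 .partitionBy("order_date")
 .format("parquet")
 .saveAsTable("analytics.orders"))    # registers the table in the metastore

spark.stop()
```

In practice the same shape is scheduled by an orchestrator (Oozie, Airflow) and parameterized per run date.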
Preferred Experience and Knowledge (Good to Have):
# Competency
1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience
2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc
3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures
4.Performance tuning and optimization of data pipelines
5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality
6.Cloud data specialty and other related Big data technology certifications
Personal Attributes:
• Strong written and verbal communication skills
• Articulation skills
• Good team player
• Self-starter who requires minimal oversight
• Ability to prioritize and manage multiple tasks
• Process orientation and the ability to define and set up processes

- As a data engineer, you will build systems that collect, manage, and convert raw data into usable information for data scientists and business analysts to interpret. Your ultimate goal is to make data accessible so that organizations can use it to optimize their performance.
- Work closely with PMs and business analysts to build and improve data pipelines, and identify and model business objects. Write scripts implementing data transformations, data structures, and metadata to bring structure to partially unstructured data and improve data quality (a minimal parsing sketch follows this list).
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL
- Own data pipelines - Monitoring, testing, validating and ensuring meaningful data exists in data warehouse with high level of data quality
- What we look for in the candidate is strong analytical skills with the ability to collect, organise, analyse, and disseminate significant amounts of information with attention to detail and accuracy
- Create long term and short-term design solutions through collaboration with colleagues
- Proactive to experiment with new tools
- Strong programming skills in Python
- Skillset: Python, SQL, ETL frameworks, PySpark and Snowflake
- Strong communication and interpersonal skills to interact with senior-level management regarding the implementation of changes
- Willingness to learn and eagerness to contribute to projects
- Designing the data warehouse and the most appropriate DB schema for the data product
- Positive attitude and proactive problem-solving mindset
- Experience in building data pipelines and connectors
- Knowledge of AWS cloud services is preferred
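As referenced above, a minimal sketch of bringing structure to partially unstructured data with PySpark; the event schema, field names, and S3 paths are assumptions made purely for illustration.

```python
# Sketch: parse semi-structured JSON event strings into typed columns with PySpark.
# The schema, field names, and paths are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("structure-events").getOrCreate()

event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_type", StringType()),
    StructField("value", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = spark.read.text("s3://raw-bucket/events/")            # one JSON document per line

structured = (raw
              .select(F.from_json("value", event_schema).alias("e"))
              .select("e.*")
              .filter(F.col("user_id").isNotNull()))         # basic quality filter

structured.write.mode("append").parquet("s3://curated-bucket/events/")
```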


Position Overview: We are seeking a talented Data Engineer with expertise in Power BI to join our team. The ideal candidate will be responsible for designing and implementing data pipelines, as well as developing insightful visualizations and reports using Power BI. Additionally, the candidate should have strong skills in Python, data analytics, PySpark, and Databricks. This role requires a blend of technical expertise, analytical thinking, and effective communication skills.
Key Responsibilities:
- Design, develop, and maintain data pipelines and architectures using PySpark and Databricks (a minimal sketch follows this list).
- Implement ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
- Collaborate with data analysts and business stakeholders to understand data requirements and translate them into actionable insights.
- Develop interactive dashboards, reports, and visualizations using Power BI to communicate key metrics and trends.
- Optimize and tune data pipelines for performance, scalability, and reliability.
- Monitor and troubleshoot data infrastructure to ensure data quality, integrity, and availability.
- Implement security measures and best practices to protect sensitive data.
- Stay updated with emerging technologies and best practices in data engineering and data visualization.
- Document processes, workflows, and configurations to maintain a comprehensive knowledge base.
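A minimal sketch of the pipeline responsibility referenced at the top of this list, assuming a Databricks runtime where the Delta format is available; the tables, columns, and mount paths are placeholders for whatever the Power BI reports would actually consume.

```python
# Sketch: aggregate curated data into a reporting table that Power BI can read.
# Assumes execution on Databricks (Delta Lake available); names are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sales-reporting").getOrCreate()  # provided by Databricks in practice

sales = spark.read.format("delta").load("/mnt/curated/sales")   # hypothetical curated layer

daily_kpis = (sales
              .groupBy(F.to_date("order_ts").alias("order_date"), "region")
              .agg(F.sum("amount").alias("revenue"),
                   F.countDistinct("order_id").alias("orders")))

(daily_kpis.write
 .format("delta")
 .mode("overwrite")
 .saveAsTable("reporting.daily_sales_kpis"))   # surfaced to Power BI via a SQL endpoint
```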
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or related field. (Master’s degree preferred)
- Proven experience as a Data Engineer with expertise in Power BI, Python, PySpark, and Databricks.
- Strong proficiency in Power BI, including data modeling, DAX calculations, and creating interactive reports and dashboards.
- Solid understanding of data analytics concepts and techniques.
- Experience working with Big Data technologies such as Hadoop, Spark, or Kafka.
- Proficiency in programming languages such as Python and SQL.
- Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
- Excellent analytical and problem-solving skills with attention to detail.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.
Preferred Qualifications:
- Advanced degree in Computer Science, Engineering, or related field.
- Certifications in Power BI or related technologies.
- Experience with data visualization tools other than Power BI (e.g., Tableau, QlikView).
- Knowledge of machine learning concepts and frameworks.
Publicis Sapient Overview:
As a Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.
Job Summary:
As a Senior Associate L1 in Data Engineering, you will own technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.
The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration, wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is preferable.
Role & Responsibilities:
Job Title: Senior Associate L1 – Data Engineering
Your role is focused on Design, Development and delivery of solutions involving:
• Data Ingestion, Integration and Transformation
• Data Storage and Computation Frameworks, Performance Optimizations
• Analytics & Visualizations
• Infrastructure & Cloud Computing
• Data Management Platforms
• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time
• Build functionality for data analytics, search and aggregation
Experience Guidelines:
Mandatory Experience and Competencies:
# Competency
1.Overall 3.5+ years of IT experience with 1.5+ years in Data related technologies
2.Minimum 1.5 years of experience in Big Data technologies
3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.
4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable
5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery etc.
Preferred Experience and Knowledge (Good to Have):
# Competency
1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience
2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc
3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures
4.Performance tuning and optimization of data pipelines
5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality
6. Working knowledge of data platform-related services on at least one cloud platform, IAM and data security
7.Cloud data specialty and other related Big data technology certifications
Personal Attributes:
• Strong written and verbal communication skills
• Articulation skills
• Good team player
• Self-starter who requires minimal oversight
• Ability to prioritize and manage multiple tasks
• Process orientation and the ability to define and set up processes

Title:- Senior Data Engineer
Experience: 4-6 yrs
Budget: 24-28 LPA
Location: Bangalore
Mode of Work: Work from office
Primary Skills: Databricks, Spark, PySpark, SQL, Python, AWS
Qualification: Any Engineering degree
Responsibilities:
∙Design and build reusable components, frameworks and libraries at scale to support analytics products.
∙Design and implement product features in collaboration with business and Technology
stakeholders.
∙Anticipate, identify and solve issues concerning data management to improve data quality.
∙Clean, prepare and optimize data at scale for ingestion and consumption.
∙Drive the implementation of new data management projects and re-structure of the current data architecture.
∙Implement complex automated workflows and routines using workflow scheduling tools.
∙Build continuous integration, test-driven development and production deployment
frameworks.
∙Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards.
∙Analyze and profile data for the purpose of designing scalable solutions.
∙Troubleshoot complex data issues and perform root cause analysis to proactively resolve
product and operational issues.
∙Mentor and develop other data engineers in adopting best practices.
Qualifications:
Primary skillset:
∙Experience working with distributed technology tools for developing Batch and Streaming pipelines using
o SQL, Spark, Python, PySpark [4+ years],
o Airflow [3+ years] (a minimal scheduling sketch follows this section),
o Scala [2+ years].
∙Able to write code which is optimized for performance.
∙Experience in Cloud platform, e.g., AWS, GCP, Azure, etc.
∙Able to quickly pick up new programming languages, technologies, and frameworks.
∙Strong skills building positive relationships across Product and Engineering.
∙Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders
∙Experience with creating/ configuring Jenkins pipeline for smooth CI/CD process for Managed Spark jobs, build Docker images, etc.
∙Working knowledge of Data warehousing, Data modelling, Governance and Data Architecture
Good to have:
∙Experience working with Data platforms, including EMR, Airflow, Databricks (Data
Engineering & Delta Lake components, and Lakehouse Medallion architecture), etc.
∙Experience working in Agile and Scrum development process.
∙Experience in EMR/ EC2, Databricks etc.
∙Experience working with Data warehousing tools, including SQL database, Presto, and
Snowflake
∙Experience architecting data product in Streaming, Serverless and Microservices Architecture and platform.
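As flagged in the Airflow item above, a minimal scheduling sketch assuming Apache Airflow with the apache-airflow-providers-apache-spark package installed; the DAG id, schedule, connection, and application path are illustrative only.

```python
# Minimal Airflow DAG sketch that schedules a managed Spark job daily.
# Assumes Airflow 2.x with the Spark provider installed; paths and ids are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_spark_job = SparkSubmitOperator(
        task_id="transform_sales",
        application="/opt/jobs/transform_sales.py",   # hypothetical PySpark application
        conn_id="spark_default",
        conf={"spark.executor.memory": "4g"},
    )
```

A Jenkins pipeline would typically build the Docker image and deploy this DAG and its Spark application as part of CI/CD.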
Job Title: Big Data Developer
Job Description
Bachelor's degree in Engineering or Computer Science or equivalent OR Master's in Computer Applications or equivalent.
Solid Experience of software development experience and leading teams of engineers and scrum teams.
4+ years of hands-on experience of working with Map-Reduce, Hive, Spark (core, SQL and PySpark).
Solid Datawarehousing concepts.
Knowledge of Financial reporting ecosystem will be a plus.
4+ years of experience within Data Engineering/ Data Warehousing using Big Data technologies will be an addon.
Expert in distributed computing ecosystems.
Hands-on experience with programming using Core Java or Python/Scala
Expert in Hadoop and Spark architecture and their working principles
Hands-on experience in writing and understanding complex SQL (Hive/PySpark DataFrames) and optimizing joins while processing huge amounts of data (a minimal join-optimization sketch follows this list).
Experience in UNIX shell scripting.
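A minimal sketch of the join-optimization requirement above: broadcasting a small dimension table so Spark avoids shuffling the large fact table. The table and column names are hypothetical.

```python
# Sketch: optimize a large-to-small join by broadcasting the small dimension table.
# Table and column names are illustrative placeholders; assumes a metastore is available.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-optimization").enableHiveSupport().getOrCreate()

transactions = spark.table("warehouse.transactions")      # large fact table
merchants = spark.table("warehouse.merchants")             # small dimension table

enriched = transactions.join(broadcast(merchants), on="merchant_id", how="left")

enriched.explain()   # confirm a BroadcastHashJoin is planned instead of a shuffle join
```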
Roles & Responsibilities
Ability to design and develop optimized Data pipelines for batch and real time data processing
Should have experience in analysis, design, development, testing, and implementation of system applications
Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.
Excellent technical and analytical aptitude
Good communication skills.
Excellent Project management skills.
Results-driven approach.
Mandatory Skills: Big Data, PySpark, Hive
Data Engineering : Senior Engineer / Manager
As a Senior Engineer / Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.
Must Have skills :
1. GCP
2. Spark Streaming: live data streaming experience is desired.
3. Any one coding language: Java/Python/Scala
Skills & Experience :
- Overall experience of minimum 5+ years, with a minimum of 4 years of relevant experience in Big Data technologies
- Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.
- Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable
- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery etc.
- Well-versed, with working knowledge of data platform-related services on GCP
- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position
Your Impact :
- Data Ingestion, Integration and Transformation
- Data Storage and Computation Frameworks, Performance Optimizations
- Analytics & Visualizations
- Infrastructure & Cloud Computing
- Data Management Platforms
- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time
- Build functionality for data analytics, search and aggregation

- 5+ years of experience designing, developing, validating, and automating ETL processes
- 3+ years of experience with traditional ETL tools such as Visual Studio, SQL Server Management Studio, SSIS, SSAS and SSRS
- 2+ years of experience with cloud technologies and platforms, such as Kubernetes, Spark, Kafka, Azure Data Factory, Snowflake, MLflow, Databricks, Airflow or similar
- Must have experience with designing and implementing data access layers
- Must be an expert with SQL/T-SQL and Python
- Must have experience in Kafka
- Define and implement data models with various database technologies like MongoDB, CosmosDB, Neo4j, MariaDB and SQL Server
- Ingest and publish data from sources and to destinations via an API
- Exposure to ETL/ELT using Kafka or Azure Event Hubs with Spark or Databricks is a plus
- Exposure to healthcare technologies and integrations for FHIR API, HL7 or other HIE protocols is a plus
Skills Required :
Designing, Developing, ETL, Visual Studio, Python, Spark, Kubernetes, Kafka, Azure Data Factory, SQL Server, Airflow, Databricks, T-SQL, MongoDB, CosmosDB, Snowflake, SSIS, SSAS, SSRS, FHIR API, HL7, HIE Protocols

DATA ENGINEERING CONSULTANT
About NutaNXT: NutaNXT is a next-gen Software Product Engineering services provider building ground-breaking products using AI/ML, Data Analytics, IoT, Cloud and other emerging technologies disrupting the global markets. Our mission is to help clients leverage our specialized Digital Product Engineering capabilities in Data Engineering, AI Automations, and Software Full-stack solutions and services to build best-in-class products and stay ahead of the curve. You will get a chance to work on multiple projects critical to NutaNXT's needs, with opportunities to learn, develop new skills, and switch teams and projects as you and our fast-paced business grow and evolve.
Location: Pune | Experience: 6 to 8 years
Job Description: NutaNXT is looking for a Data Engineering Consultant to support the planning and implementation of data design services, provide sizing and configuration assistance, and perform needs assessments, as well as to deliver architectures for transformations and modernizations of enterprise data solutions using Azure cloud data technologies. As a Data Engineering Consultant, you will collect, aggregate, store, and reconcile data in support of the customer's business decisions. You will design and build data pipelines, data streams, data service APIs, data generators and other end-user information portals and insight tools.
Mandatory Skills:
- Demonstrable experience in enterprise-level data platforms involving implementation of end-to-end data pipelines with Python or Scala
- Hands-on experience with at least one of the leading public cloud data platforms (ideally Azure)
- Experience with different databases (column-oriented databases, NoSQL databases, RDBMS)
- Experience in architecting data pipelines and solutions for both streaming and batch integrations using tools/frameworks like Azure Databricks, Azure Data Factory, Spark, Spark Streaming, etc.
- Understanding of data modeling, warehouse design and fact/dimension concepts
- Good communication
Good To Have:
Certifications for any of the cloud services (Ideally Azure)
• Experience working with code repositories and continuous integration
• Understanding of development and project methodologies
Why Join Us?
We offer innovative work in the AI & Data Engineering space, a unique and diverse workplace environment, and continuous learning and development opportunities. These are just some of the reasons we're consistently recognized as one of the best companies to work for, and why our people choose to grow their careers at NutaNXT. We also offer a highly flexible, self-driven, remote work culture that fosters innovation, creativity and work-life balance, along with industry-leading compensation, which we believe helps us consistently deliver for our clients and grow in the highly competitive, fast-evolving Digital Engineering space, with a strong focus on building advanced software products for clients in the US, Europe and APAC regions.
You will be responsible for designing, building, and maintaining data pipelines that handle Real-world data at Compile. You will be handling both inbound and outbound data deliveries at Compile for datasets including Claims, Remittances, EHR, SDOH, etc.
You will
- Work on building and maintaining data pipelines (specifically RWD).
- Build, enhance and maintain existing pipelines in PySpark and Python, and help build analytical insights and datasets.
- Scheduling and maintaining pipeline jobs for RWD.
- Develop, test, and implement data solutions based on the design.
- Design and implement quality checks on existing and new data pipelines (a minimal sketch follows this list).
- Ensure adherence to security and compliance that is required for the products.
- Maintain relationships with various data vendors and track changes and issues across vendors and deliveries.
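A minimal sketch of the quality-check responsibility above; the dataset, columns, and thresholds are illustrative placeholders, not Compile's actual checks.

```python
# Sketch: simple quality checks on a pipeline output before it is published.
# Dataset, columns, and thresholds are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("claims-quality-checks").getOrCreate()

claims = spark.read.parquet("/data/staging/claims")     # hypothetical staging output

total = claims.count()
null_ids = claims.filter(F.col("claim_id").isNull()).count()
dupes = total - claims.dropDuplicates(["claim_id"]).count()

# Fail the pipeline run if basic expectations are violated.
assert total > 0, "no rows produced"
assert null_ids == 0, f"{null_ids} rows have a null claim_id"
assert dupes / total < 0.01, f"duplicate rate too high: {dupes}/{total}"

claims.write.mode("overwrite").parquet("/data/published/claims")
```

Libraries such as Deequ or Great Expectations generalize this pattern into reusable expectation suites.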
You have
- Hands-on experience with ETL process (min of 5 years).
- Excellent communication skills and ability to work with multiple vendors.
- High proficiency with Spark, SQL.
- Proficiency in Data modeling, validation, quality check, and data engineering concepts.
- Experience working with big-data processing technologies such as Databricks, dbt, S3, Delta Lake, Deequ, Griffin, Snowflake and BigQuery.
- Familiarity with version control technologies, and CI/CD systems.
- Understanding of scheduling tools like Airflow/Prefect.
- Min of 3 years of experience managing data warehouses.
- Familiarity with healthcare datasets is a plus.
Compile embraces diversity and equal opportunity in a serious way. We are committed to building a team of people from many backgrounds, perspectives, and skills. We know the more inclusive we are, the better our work will be.
- Big data developer with 8+ years of professional IT experience with expertise in Hadoop ecosystem components in ingestion, Data modeling, querying, processing, storage, analysis, Data Integration and Implementing enterprise level systems spanning Big Data.
- A skilled developer with strong problem solving, debugging and analytical capabilities, who actively engages in understanding customer requirements.
- Expertise in Apache Hadoop ecosystem components like Spark, Hadoop Distributed File System (HDFS), MapReduce, Hive, Sqoop, HBase, ZooKeeper, YARN, Flume, Pig, NiFi, Scala and Oozie.
- Hands-on experience in creating real-time data streaming solutions using Apache Spark core, Spark SQL & DataFrames, Kafka, Spark Streaming and Apache Storm.
- Excellent knowledge of Hadoop architecture and the daemons of Hadoop clusters, which include NameNode, DataNode, Resource Manager, Node Manager and Job History Server.
- Worked on both Cloudera and Hortonworks Hadoop distributions. Experience in managing Hadoop clusters using the Cloudera Manager tool.
- Well versed in installation, Configuration, Managing of Big Data and underlying infrastructure of Hadoop Cluster.
- Hands on experience in coding MapReduce/Yarn Programs using Java, Scala and Python for analyzing Big Data.
- Exposure to Cloudera development environment and management using Cloudera Manager.
- Extensively worked on Spark using Scala on clusters for computational analytics, installed it on top of Hadoop, and performed advanced analytical applications by making use of Spark with Hive and SQL/Oracle.
- Implemented Spark using Python, utilizing DataFrames and the Spark SQL API for faster processing of data; handled importing data from different data sources into HDFS using Sqoop, performing transformations using Hive and MapReduce, and then loading data into HDFS.
- Used Spark Data Frames API over Cloudera platform to perform analytics on Hive data.
- Hands on experience in MLlib from Spark which are used for predictive intelligence, customer segmentation and for smooth maintenance in Spark streaming.
- Experience in using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
- Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Working on creating data pipeline for different events of ingestion, aggregation, and load consumer response data into Hive external tables in HDFS location to serve as feed for tableau dashboards.
- Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
- In-depth Understanding of Oozie to schedule all Hive/Sqoop/HBase jobs.
- Hands on expertise in real time analytics with Apache Spark.
- Experience in converting Hive/SQL queries into RDD transformations using Apache Spark, Scala and Python (a DataFrame-based sketch follows this list).
- Extensive experience in working with different ETL tool environments like SSIS, Informatica and reporting tool environments like SQL Server Reporting Services (SSRS).
- Experience in the Microsoft cloud and in setting up clusters on Amazon EC2 & S3, including automating the setup and extension of clusters in the AWS cloud.
- Extensively worked on Spark using Python on cluster for computational (analytics), installed it on top of Hadoop performed advanced analytical application by making use of Spark with Hive and SQL.
- Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka and Flume.
- Knowledge in installation, configuration, supporting and managing Hadoop Clusters using Apache, Cloudera (CDH3, CDH4) distributions and on Amazon web services (AWS).
- Experienced in writing Ad Hoc queries using Cloudera Impala, also used Impala analytical functions.
- Experience in creating DataFrames using PySpark and performing operations on the DataFrames using Python.
- In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS and MapReduce Programming Paradigm, High Availability and YARN architecture.
- Establishing multiple connections to different Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provide the access for pulling the information we need for analysis.
- Generated various kinds of knowledge reports using Power BI based on Business specification.
- Developed interactive Tableau dashboards to provide a clear understanding of industry specific KPIs using quick filters and parameters to handle them more efficiently.
- Well Experience in projects using JIRA, Testing, Maven and Jenkins build tools.
- Experienced in designing, building, deploying and utilizing almost all of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance and auto-scaling.
- Good experience with use-case development and with software methodologies like Agile and Waterfall.
- Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
- Good working experience in importing data using Sqoop and SFTP from various sources like RDBMS, Teradata, Mainframes, Oracle and Netezza to HDFS, and performing transformations on it using Hive, Pig and Spark.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems and generating data visualizations using Python and R.
- Proficient in NoSQL databases including HBase, Cassandra, MongoDB and its integration with Hadoop cluster.
- Hands on experience in Hadoop Big data technology working on MapReduce, Pig, Hive as Analysis tool, Sqoop and Flume data import/export tools.
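As a minimal illustration of converting a Hive/SQL query into Spark transformations (referenced in the profile above), the same aggregate is shown in SQL and in the DataFrame API; the table and columns are placeholders.

```python
# Sketch: the same aggregate expressed as a Hive SQL query and as DataFrame transformations.
# Table and column names are hypothetical; assumes a Hive metastore.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sql-to-dataframe").enableHiveSupport().getOrCreate()

# Hive/SQL version
sql_result = spark.sql("""
    SELECT department, AVG(salary) AS avg_salary
    FROM hr.employees
    WHERE status = 'ACTIVE'
    GROUP BY department
""")

# Equivalent DataFrame transformations
df_result = (spark.table("hr.employees")
             .filter(F.col("status") == "ACTIVE")
             .groupBy("department")
             .agg(F.avg("salary").alias("avg_salary")))

df_result.show()
```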
Analytics Job Description
We are hiring an Analytics Engineer to help drive our Business Intelligence efforts. You will
partner closely with leaders across the organization, working together to understand the how
and why of people, team and company challenges, workflows and culture. The team is
responsible for delivering data and insights that drive decision-making, execution, and
investments for our product initiatives.
You will work cross-functionally with product, marketing, sales, engineering, finance, and our
customer-facing teams enabling them with data and narratives about the customer journey.
You’ll also work closely with other data teams, such as data engineering and product analytics,
to ensure we are creating a strong data culture at Blend that enables our cross-functional partners
to be more data-informed.
Role: Data Engineer
Please find below the JD for the Data Engineer role.
Location: Guindy, Chennai
How you’ll contribute:
• Develop objectives and metrics, ensure priorities are data-driven, and balance short-
term and long-term goals
• Develop deep analytical insights to inform and influence product roadmaps and
business decisions and help improve the consumer experience
• Work closely with GTM and supporting operations teams to author and develop core
data sets that empower analyses
• Deeply understand the business and proactively spot risks and opportunities
• Develop dashboards and define metrics that drive key business decisions
• Build and maintain scalable ETL pipelines via solutions such as Fivetran, Hightouch,
and Workato
• Design our Analytics and Business Intelligence architecture, assessing and implementing new technologies that fit our needs
• Work with our engineering teams to continually make our data pipelines and tooling
more resilient
Who you are:
• Bachelor’s degree or equivalent required from an accredited institution with a
quantitative focus such as Economics, Operations Research, Statistics, Computer Science OR 1-3 Years of Experience as a Data Analyst, Data Engineer, Data Scientist
• Must have strong SQL and data modeling skills, with experience applying skills to
thoughtfully create data models in a warehouse environment.
• A proven track record of using analysis to drive key decisions and influence change
• Strong storyteller and ability to communicate effectively with managers and
executives
• Demonstrated ability to define metrics for product areas, understand the right
questions to ask and push back on stakeholders in the face of ambiguous, complex
problems, and work with diverse teams with different goals
• A passion for documentation.
• A solution-oriented growth mindset. You’ll need to be a self-starter and thrive in a
dynamic environment.
• A bias towards communication and collaboration with business and technical
stakeholders.
• Quantitative rigor and systems thinking.
• Prior startup experience is preferred, but not required.
• Interest or experience in machine learning techniques (such as clustering, decision
tree, and segmentation)
• Familiarity with a scientific computing language, such as Python, for data wrangling
and statistical analysis
• Experience with a SQL focused data transformation framework such as dbt
• Experience with a Business Intelligence Tool such as Mode/Tableau
Mandatory Skillset:
-Very Strong in SQL
-Spark OR PySpark OR Python
-Shell Scripting


Job Description:
As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:
Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.
Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.
Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.
Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.
Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.
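To make the caching and tuning techniques mentioned above concrete, here is a minimal PySpark sketch; the dataset names and paths are assumptions for illustration only.

```python
# Sketch: cache an intermediate DataFrame that several downstream steps reuse,
# and inspect the physical plan while tuning. Names are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("pipeline-tuning").getOrCreate()

events = spark.read.parquet("/mnt/curated/events")              # hypothetical input

recent = events.filter(F.col("event_date") >= "2024-01-01").cache()

daily_counts = recent.groupBy("event_date").count()              # first reuse
by_type = recent.groupBy("event_type").count()                   # second reuse (served from cache)

daily_counts.explain()   # review the plan (scans, shuffles) as part of tuning

daily_counts.write.mode("overwrite").parquet("/mnt/gold/daily_counts")
by_type.write.mode("overwrite").parquet("/mnt/gold/counts_by_type")

recent.unpersist()
```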
Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.
Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.
Skills and Qualifications:
Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.
Proficiency in designing and developing data pipelines and ETL processes.
Solid understanding of data modeling concepts and database design principles.
Familiarity with data integration and orchestration using Azure Data Factory.
Knowledge of data quality management and data governance practices.
Experience with performance tuning and optimization of data pipelines.
Strong problem-solving and troubleshooting skills related to data engineering.
Excellent collaboration and communication skills to work effectively in cross-functional teams.
Understanding of cloud computing principles and experience with Azure services.


Requirements
Experience
- 5+ years of professional experience in implementing MLOps frameworks to scale up ML in production.
- Hands-on experience with Kubernetes, Kubeflow, MLflow, SageMaker, and other ML model experiment management tools, including training, inference, and evaluation (a minimal MLflow tracking sketch follows this list).
- Experience in ML model serving (TorchServe, TensorFlow Serving, NVIDIA Triton inference server, etc.)
- Proficiency with ML model training frameworks (PyTorch, Pytorch Lightning, Tensorflow, etc.).
- Experience with GPU computing to do data and model training parallelism.
- Solid software engineering skills in developing systems for production.
- Strong expertise in Python.
- Building end-to-end data systems as an ML Engineer, Platform Engineer, or equivalent.
- Experience working with cloud data processing technologies (S3, ECR, Lambda, AWS, Spark, Dask, ElasticSearch, Presto, SQL, etc.).
- Having Geospatial / Remote sensing experience is a plus.
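As a minimal illustration of the experiment-tracking experience noted above, a short MLflow sketch; the experiment name, model, parameters, and metric are placeholders, and scikit-learn is assumed only for the sake of a runnable example.

```python
# Sketch: track an experiment run with MLflow (params, metric, model artifact).
# Assumes `pip install mlflow scikit-learn`; names and data are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("demo-classifier")          # hypothetical experiment name

with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 5}
    model = RandomForestClassifier(**params, random_state=42).fit(X_train, y_train)

    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, artifact_path="model")
```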
· Core responsibilities include analyzing business requirements and designs for accuracy and completeness, and developing and maintaining the relevant product.
· BlueYonder is seeking a Senior/Principal Architect in the Data Services department (under the Luminate Platform) to act as one of the key technology leaders to build and manage BlueYonder's technology assets in the Data Platform and Services.
· This individual will act as a trusted technical advisor and strategic thought leader to the Data Services department. The successful candidate will have the opportunity to lead, participate, guide, and mentor other people in the team on architecture and design in a hands-on manner. You are responsible for technical direction of Data Platform. This position reports to the Global Head, Data Services and will be based in Bangalore, India.
· Core responsibilities to include Architecting and designing (along with counterparts and distinguished Architects) a ground up cloud native (we use Azure) SaaS product in Order management and micro-fulfillment
· The team currently comprises 60+ global associates across the US, India (COE) and the UK, and is expected to grow rapidly. The incumbent will need leadership qualities to also mentor junior and mid-level software associates in our team. This person will lead the Data Platform architecture – Streaming and Bulk, with Snowflake/Elasticsearch/other tools.
Our current technical environment:
· Software: Java, Spring Boot, Gradle, Git, Hibernate, REST API, OAuth, Snowflake
· Application Architecture: scalable, resilient, event-driven, secure multi-tenant microservices architecture
· Cloud Architecture: MS Azure (ARM templates, AKS, HDInsight, Application Gateway, Virtual Networks, Event Hub, Azure AD)
· Frameworks/Others: Kubernetes, Kafka, Elasticsearch, Spark, NoSQL, RDBMS, Spring Boot, Gradle, Git, Ignite
Job Description
Position: Sr Data Engineer – Databricks & AWS
Experience: 4 - 5 Years
Company Profile:
Exponentia.ai is an AI tech organization with a presence across India, Singapore, the Middle East, and the UK. We are an innovative and disruptive organization, working on cutting-edge technology to help our clients transform into the enterprises of the future. We provide artificial intelligence-based products/platforms capable of automated cognitive decision-making to improve productivity, quality, and economics of the underlying business processes. Currently, we are transforming ourselves and rapidly expanding our business.
Exponentia.ai has developed long-term relationships with world-class clients such as PayPal, PayU, SBI Group, HDFC Life, Kotak Securities, Wockhardt and Adani Group amongst others.
One of the top partners of Cloudera (leading analytics player) and Qlik (leader in BI technologies), Exponentia.ai has recently been awarded the ‘Innovation Partner Award’ by Qlik in 2017.
Get to know more about us on our website: http://www.exponentia.ai/ and Life @Exponentia.
Role Overview:
· A Data Engineer understands the client requirements and develops and delivers the data engineering solutions as per the scope.
· The role requires good skills in the development of solutions using various services required for data architecture on Databricks Delta Lake, streaming, AWS, ETL Development, and data modeling.
Job Responsibilities
• Design of data solutions on Databricks including delta lake, data warehouse, data marts and other data solutions to support the analytics needs of the organization.
• Apply best practices during design in data modeling (logical, physical) and ETL pipelines (streaming and batch) using cloud-based services.
• Design, develop and manage the pipelining (collection, storage, access), data engineering (data quality, ETL, Data Modelling) and understanding (documentation, exploration) of the data.
• Interact with stakeholders regarding data landscape understanding, conducting discovery exercises, developing proof of concepts and demonstrating it to stakeholders.
Technical Skills
• Has more than 2 Years of experience in developing data lakes, and datamarts on the Databricks platform.
• Proven skill sets in AWS Data Lake services such as - AWS Glue, S3, Lambda, SNS, IAM, and skills in Spark, Python, and SQL.
• Experience in Pentaho
• Good understanding of developing a data warehouse, data marts etc.
• Has a good understanding of system architectures, and design patterns and should be able to design and develop applications using these principles.
Personality Traits
• Good collaboration and communication skills
• Excellent problem-solving skills to be able to structure the right analytical solutions.
• Strong sense of teamwork, ownership, and accountability
• Analytical and conceptual thinking
• Ability to work in a fast-paced environment with tight schedules.
• Good presentation skills with the ability to convey complex ideas to peers and management.
Education:
BE / ME / MS/MCA.
- KSQL
- Data Engineering spectrum (Java/Spark)
- Spark Scala / Kafka Streaming
- Confluent Kafka components
- Basic understanding of Hadoop
About Kloud9:
Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.
Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.
At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.
Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.
We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.
What we are looking for:
● 3+ years’ experience developing Data & Analytic solutions
● Experience building data lake solutions leveraging one or more of the following: AWS, EMR, S3, Hive & Spark
● Experience with relational SQL
● Experience with scripting languages such as Shell, Python
● Experience with source control tools such as GitHub and related dev process
● Experience with workflow scheduling tools such as Airflow
● In-depth knowledge of scalable cloud
● Has a passion for data solutions
● Strong understanding of data structures and algorithms
● Strong understanding of solution and technical design
● Has a strong problem-solving and analytical mindset
● Experience working with Agile Teams.
● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders
● Able to quickly pick up new programming languages, technologies, and frameworks
● Bachelor’s Degree in computer science
Why Explore a Career at Kloud9:
With job opportunities in prime locations in the US, London, Poland and Bengaluru, we help build your career path in cutting-edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.

Requirements:
● Understanding our data sets and how to bring them together.
● Working with our engineering team to support custom solutions offered to the product development.
● Filling the gap between development, engineering and data ops.
● Creating, maintaining and documenting scripts to support ongoing custom solutions.
● Excellent organizational skills, including attention to precise details
● Strong multitasking skills and ability to work in a fast-paced environment
● 5+ years experience with Python to develop scripts.
● Know your way around RESTful APIs (able to integrate; not necessarily to publish).
● You are familiar with pulling and pushing files from SFTP and AWS S3.
● Experience with any Cloud solutions including GCP / AWS / OCI / Azure.
● Familiarity with SQL programming to query and transform data from relational Databases.
● Familiarity to work with Linux (and Linux work environment).
● Excellent written and verbal communication skills
● Extracting, transforming, and loading data into internal databases and Hadoop
● Optimizing our new and existing data pipelines for speed and reliability
● Deploying product build and product improvements
● Documenting and managing multiple repositories of code
● Experience with SQL and NoSQL databases (Cassandra, MySQL)
● Hands-on experience in data pipelining and ETL (any of these frameworks/tools: Hadoop, BigQuery, RedShift, Athena)
● Hands-on experience in Airflow
● Understanding of best practices, common coding patterns and good practices around storing, partitioning, warehousing and indexing of data
● Experience in reading data from Kafka topics (both live stream and offline) (a minimal streaming sketch follows this list)
● Experience in PySpark and DataFrames
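As referenced in the Kafka item above, a minimal Spark Structured Streaming sketch that reads a topic into typed columns; the broker, topic, schema, and checkpoint paths are assumptions for illustration.

```python
# Sketch: read a Kafka topic with Spark Structured Streaming into typed columns.
# Broker, topic, schema, and paths are hypothetical; requires the spark-sql-kafka package.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

stream = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "orders")
          .option("startingOffsets", "latest")
          .load())

orders = (stream
          .select(F.from_json(F.col("value").cast("string"), schema).alias("o"))
          .select("o.*"))

query = (orders.writeStream
         .format("parquet")
         .option("path", "/data/streams/orders")
         .option("checkpointLocation", "/data/checkpoints/orders")
         .outputMode("append")
         .start())

query.awaitTermination()
```

Reading the same topic in batch mode only requires swapping readStream for read and setting starting/ending offsets.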
Responsibilities:
You’ll
● Collaborating across an agile team to continuously design, iterate, and develop big data systems.
● Extracting, transforming, and loading data into internal databases.
● Optimizing our new and existing data pipelines for speed and reliability.
● Deploying new products and product improvements.
● Documenting and managing multiple repositories of code.
About Telstra
Telstra is Australia’s leading telecommunications and technology company, with operations in more than 20 countries, including in India, where we’re building a new Innovation and Capability Centre (ICC) in Bangalore.
We’re growing, fast, and for you that means many exciting opportunities to develop your career at Telstra. Join us on this exciting journey, and together, we’ll reimagine the future.
Why Telstra?
- We're an iconic Australian company with a rich heritage that's been built over 100 years. Telstra is Australia's leading Telecommunications and Technology Company. We've been operating internationally for more than 70 years.
- International presence spanning over 20 countries.
- We are one of the 20 largest telecommunications providers globally
- At Telstra, the work is complex and stimulating, but with that comes a great sense of achievement. We are shaping tomorrow's modes of communication with our innovation-driven teams.
Telstra offers an opportunity to make a difference to lives of millions of people by providing the choice of flexibility in work and a rewarding career that you will be proud of!
About the team
Being part of Networks & IT means you'll be part of a team that focuses on extending our network superiority to enable the continued execution of our digital strategy.
With us, you'll be working with world-leading technology and change the way we do IT to ensure business needs drive priorities, accelerating our digitisation programme.
Focus of the role
Any new engineer who joins the data chapter will mostly be developing reusable data processing and storage frameworks that can be used across the data platform.
About you
To be successful in the role, you'll bring skills and experience in:-
Essential
- Hands-on experience in Spark Core, Spark SQL, SQL/Hive/Impala, Git/SVN/Any other VCS and Data warehousing
- Skilled in the Hadoop ecosystem (HDP/Cloudera/MapR/EMR, etc.)
- Azure Data Factory/Airflow/Control-M/Luigi
- PL/SQL
- Exposure to NoSQL (HBase/Cassandra/GraphDB (Neo4j)/MongoDB)
- File formats (Parquet/ORC/AVRO/Delta/Hudi etc.)
- Kafka/Kinesis/Eventhub
Highly Desirable
Experience and knowledgeable on the following:
- Spark Streaming
- Cloud exposure (Azure/AWS/GCP)
- Azure data offerings - ADF, ADLS2, Azure Databricks, Azure Synapse, Eventhubs, CosmosDB etc.
- Presto/Athena
- Azure DevOps
- Jenkins/ Bamboo/Any similar build tools
- Power BI
- Prior experience in building, or working in a team that builds, reusable frameworks
- Data modelling.
- Data Architecture and design principles. (Delta/Kappa/Lambda architecture)
- Exposure to CI/CD
- Code Quality - Static and Dynamic code scans
- Agile SDLC
If you've got a passion to innovate, succeed as part of a great team, and are looking for the next step in your career, we'd welcome you to apply!
___________________________
We’re committed to building a diverse and inclusive workforce in all its forms. We encourage applicants from diverse gender, cultural and linguistic backgrounds and applicants who may be living with a disability. We also offer flexibility in all our roles, to ensure everyone can participate.
To learn more about how we support our people, including accessibility adjustments we can provide you through the recruitment process, visit tel.st/thrive.
About Kloud9:
Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.
Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.
At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.
Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.
We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.
What we are looking for:
● 3+ years’ experience developing Big Data & Analytic solutions
● Experience building data lake solutions leveraging Google Data Products (e.g. Dataproc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep, etc.), Hive, Spark
● Experience with relational SQL/No SQL
● Experience with Spark (Scala/Python/Java) and Kafka
● Work experience with using Databricks (Data Engineering and Delta Lake components)
● Experience with source control tools such as GitHub and related dev process
● Experience with workflow scheduling tools such as Airflow
● In-depth knowledge of any scalable cloud vendor(GCP preferred)
● Has a passion for data solutions
● Strong understanding of data structures and algorithms
● Strong understanding of solution and technical design
● Has a strong problem solving and analytical mindset
● Experience working with Agile Teams.
● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders
● Able to quickly pick up new programming languages, technologies, and frameworks
● Bachelor’s Degree in computer science
Why Explore a Career at Kloud9:
With job opportunities in prime locations in the US, London, Poland and Bengaluru, we help build your career path in cutting-edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers!

About the Company :
Our client enables enterprises in their digital transformation journey by offering Consulting & Implementation Services related to Data Analytics & Enterprise Performance Management (EPM).
Our client delivers the best-suited solutions to customers across industries such as Retail & E-commerce, Consumer Goods, Pharmaceuticals & Life Sciences, Real Estate & Senior Housing, Hi-tech, and Media & Telecom, as well as Manufacturing and Automotive clientele.
Their in-house research and innovation lab has conceived multiple plug-n-play apps, toolkits and plugins to streamline implementation and enable faster time-to-market.
Job Title – AWS Developer
Notice period – Immediate to 60 days
Experience – 3-8 years
Location - Noida, Mumbai, Bangalore & Kolkata
Roles & Responsibilities
- Bachelor’s degree in Computer Science or a related analytical field or equivalent experience is preferred
- 3+ years’ experience in one or more architecture domains (e.g., business architecture, solutions architecture, application architecture)
- Must have 2 years of experience in design and implementation of cloud workloads in AWS.
- Minimum of 2 years of experience handling workloads in large-scale environments. Experience in managing large operational cloud environments spanning multiple tenants through techniques such as Multi-Account management, AWS Well Architected Best Practices.
- Minimum 3 years of microservice architectural experience.
- Minimum of 3 years of experience working exclusively designing and implementing cloud-native workloads.
- Experience with analysing and defining technical requirements & design specifications.
- Experience with database design with both relational and document-based database systems.
- Experience with integrating complex multi-tier applications.
- Experience with API design and development.
- Experience with cloud networking and network security, including virtual networks, network security groups, cloud-native firewalls, etc.
- Proven ability to write programs using an object-oriented or functional programming language, with experience in tools such as Spark, Python, AWS Glue and AWS Lambda
Job Specification
*Strong and innovative approach to problem solving and finding solutions.
*Excellent communicator (written and verbal, formal and informal).
*Flexible and proactive/self-motivated working style with strong personal ownership of problem resolution.
*Ability to multi-task under pressure and work independently with minimal supervision.
Regards
Team Merito