50+ Spark Jobs in India

Apply to 50+ Spark Jobs on CutShort.io. Find your next job, effortlessly. Browse Spark Jobs and apply today!

Tecblic Pvt Ltd
Posted by Himanshu Chavla
Ahmedabad
3 - 5 yrs
₹9L - ₹12L / yr
Data engineering
Hadoop
Spark
AWS CloudFormation
Microsoft Windows Azure
+1 more

About Job


We are seeking an experienced Data Engineer to join our data team. As a Senior Data Engineer, you will work on various data engineering tasks, including designing and optimizing data pipelines, data modelling, and troubleshooting data issues. You will collaborate with other data team members, stakeholders, and data scientists to provide data-driven insights and solutions to the organization. Required experience: 3+ years.


Responsibilities:

Design and optimize data pipelines for various data sources

Design and implement efficient data storage and retrieval mechanisms

Develop data modelling solutions and data validation mechanisms

Troubleshoot data-related issues and recommend process improvements

Collaborate with data scientists and stakeholders to provide data-driven insights and solutions

Coach and mentor junior data engineers in the team


Skills Required:

3+ years of experience in data engineering or related field

Strong experience in designing and optimizing data pipelines, and data modelling

Strong proficiency in Python programming

Experience with big data technologies like Hadoop, Spark, and Hive

Experience with cloud data services such as AWS, Azure, and GCP

Strong experience with database technologies like SQL, NoSQL, and data warehousing

Knowledge of distributed computing and storage systems

Understanding of DevOps, Power Automate, and Microsoft Fabric will be an added advantage

Strong analytical and problem-solving skills

Excellent communication and collaboration skills

Qualifications


Bachelor's degree in Computer Science, Data Science, or a computer-related field (Master's degree preferred)

Smartavya Analytica

Agency job
via Pluginlive by Joslyn Gomes
Mumbai
12 - 15 yrs
₹30L - ₹35L / yr
Hadoop
Cloudera
HDFS
Apache Hive
Apache Impala
+3 more

Experience: 12-15 Years

Key Responsibilities: 

  • Client Engagement & Requirements Gathering: Independently engage with client stakeholders to understand data landscapes and requirements, translating them into functional and technical specifications.
  • Data Architecture & Solution Design: Architect and implement Hadoop-based Cloudera CDP solutions, including data integration, data warehousing, and data lakes.
  • Data Processes & Governance: Develop data ingestion and ETL/ELT frameworks, ensuring robust data governance and quality practices.
  • Performance Optimization: Provide SQL expertise and optimize Hadoop ecosystems (HDFS, Ozone, Kudu, Spark Streaming, etc.) for maximum performance.
  • Coding & Development: Hands-on coding in relevant technologies and frameworks, ensuring project deliverables meet stringent quality and performance standards.
  • API & Database Management: Integrate APIs and manage databases (e.g., PostgreSQL, Oracle) to support seamless data flows.
  • Leadership & Mentoring: Guide and mentor a team of data engineers and analysts, fostering collaboration and technical excellence.

Skills Required:

a. Technical Proficiency:
  • Extensive experience with Hadoop ecosystem tools and services (HDFS, YARN, Cloudera Manager, Impala, Kudu, Hive, Spark Streaming, etc.).
  • Proficiency in Spark and in programming languages such as Python and Scala, and a strong grasp of SQL performance tuning.
  • ETL tool expertise (e.g., Informatica, Talend, Apache NiFi) and data modelling knowledge.
  • API integration skills for effective data flow management.
b. Project Management & Communication:
  • Proven ability to lead large-scale data projects and manage project timelines.
  • Excellent communication, presentation, and critical thinking skills.
c. Client & Team Leadership:
  • Engage effectively with clients and partners, leading onsite and offshore teams.


Affine
Posted by Rishika Chadha
Remote only
5 - 8 yrs
Best in industry
Scala
ETL
Apache Kafka
Object Oriented Programming (OOPs)
CI/CD
+4 more

Role Objective:


The Big Data Engineer will be responsible for expanding and optimizing our data and database architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys building and optimizing data systems. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives and will ensure that an optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.


Roles & Responsibilities:

  • Sound knowledge of Spark architecture, distributed computing, and Spark Streaming.
  • Proficient in Spark, including RDD and DataFrame core functions, troubleshooting, and performance tuning.
  • SFDC (data modelling) experience will be given preference.
  • Good understanding of object-oriented concepts and hands-on experience with Scala, with excellent programming logic and technique.
  • Good at functional programming and OOP concepts in Scala.
  • Good experience in SQL; should be able to write complex queries.
  • Manage the team of Associates and Senior Associates and ensure utilization is maintained across the project.
  • Able to mentor new members during onboarding to the project.
  • Understand client requirements and be able to design, develop from scratch, and deliver.
  • AWS cloud experience is preferred.
  • Design, build, and operationalize large-scale enterprise data solutions and applications using one or more AWS data and analytics services: DynamoDB, Redshift, Kinesis, Lambda, S3, etc. (preferred)
  • Hands-on experience using AWS management tools (CloudWatch, CloudTrail) to proactively monitor large and complex deployments (preferred)
  • Experience in analyzing, re-architecting, and re-platforming on-premises data warehouses to data platforms on AWS (preferred)
  • Lead client calls to flag any delays, blockers, and escalations, and to collate all requirements.
  • Manage project timing and client expectations, and meet deadlines.
  • Should have played project and team management roles.
  • Facilitate meetings within the team on a regular basis.
  • Understand business requirements, analyze different approaches, and plan deliverables and milestones for the project.
  • Optimization, maintenance, and support of pipelines.
  • Strong analytical and logical skills.
  • Ability to comfortably tackle new challenges and learn.
Affine
Posted by Jeeba P
Remote only
3 - 8 yrs
Best in industry
Scala
Spark
Apache Kafka
SQL
Amazon Web Services (AWS)

Role Objective:


The Big Data Engineer will be responsible for expanding and optimizing our data and database architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys building and optimizing data systems. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives and will ensure that an optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.


Roles & Responsibilities:

  • Sound knowledge of Spark architecture, distributed computing, and Spark Streaming.
  • Proficient in Spark, including RDD and DataFrame core functions, troubleshooting, and performance tuning.
  • Good understanding of object-oriented concepts and hands-on experience with Scala, with excellent programming logic and technique.
  • Good at functional programming and OOP concepts in Scala.
  • Good experience in SQL; should be able to write complex queries.
  • Manage the team of Associates and Senior Associates and ensure utilization is maintained across the project.
  • Able to mentor new members during onboarding to the project.
  • Understand client requirements and be able to design, develop from scratch, and deliver.
  • AWS cloud experience is preferred.
  • Design, build, and operationalize large-scale enterprise data solutions and applications using one or more AWS data and analytics services: DynamoDB, Redshift, Kinesis, Lambda, S3, etc. (preferred)
  • Hands-on experience using AWS management tools (CloudWatch, CloudTrail) to proactively monitor large and complex deployments (preferred)
  • Experience in analyzing, re-architecting, and re-platforming on-premises data warehouses to data platforms on AWS (preferred)
  • Lead client calls to flag any delays, blockers, and escalations, and to collate all requirements.
  • Manage project timing and client expectations, and meet deadlines.
  • Should have played project and team management roles.
  • Facilitate meetings within the team on a regular basis.
  • Understand business requirements, analyze different approaches, and plan deliverables and milestones for the project.
  • Optimization, maintenance, and support of pipelines.
  • Strong analytical and logical skills.
  • Ability to comfortably tackle new challenges and learn.

External Skills And Expertise

Must have Skills:

  • Scala
  • Spark
  • SQL (Intermediate to advanced level)
  • Spark Streaming
  • AWS preferred, or any cloud
  • Kafka, Kinesis, or any streaming service
  • Object-Oriented Programming
  • Hive, ETL/ELT design experience
  • CI/CD experience (ETL pipeline deployment)

Good to Have Skills:

  • AWS Certification
  • Git/similar version control tool
  • Knowledge in CI/CD, Microservices


Solix Technologies
Posted by Sumathi Arramraju
Hyderabad
3 - 7 yrs
₹6L - ₹12L / yr
Hadoop
Java
HDFS
Spring
Spark
+1 more
Primary skills required: Java, J2EE, JSP, Servlets, JDBC, Tomcat, Hadoop (HDFS, MapReduce, Hive, HBase, Spark, Impala)
Secondary skills: Streaming, Archiving, AWS / Azure / Cloud

Role:
  • Should have strong programming and support experience in Java and J2EE technologies
  • Should have good experience in Core Java, JSP, Servlets, and JDBC
  • Good exposure to Hadoop development (HDFS, MapReduce, Hive, HBase, Spark)
  • Should have 2+ years of Java experience and 1+ years of experience in Hadoop
  • Should possess good communication skills
  • Web Services or Elastic \ Map Reduce
  • Familiarity with data-loading tools such as Sqoop
  • Good to know: Spark, Storm, Apache HBase
HrBizHub

Agency job
via HR BIZ HUB by Pooja shankla
Bengaluru (Bangalore), Gurugram
3 - 7 yrs
₹3L - ₹18L / yr
Bigdata
Hibernate (Java)
Spring Boot
Microservices
Spark

  • Software development and automated testing
  • Proficient in Big Data technologies
  • Designs, codes, tests, corrects, and documents large and/or complex programs and program modifications from supplied specifications using agreed standards and tools, to achieve a well-engineered result
  • Proficient and hands-on in data warehousing
  • Experience with Agile development, Continuous Integration, and Continuous Delivery
  • Ability to effectively interpret technical and business objectives and provide solutions
  • Strong communication skills, with the ability to articulate technical solutions effectively across a diverse group of stakeholders
  • Needs to be a fast learner willing to adapt to the evolving needs of the developer community




Thanks & Regards

snehalata verma

IT Recruiter --HrBizHub


MathCo
Posted by Nabhan Mustafa
Bengaluru (Bangalore)
2 - 8 yrs
Best in industry
Data Warehouse (DWH)
Microsoft Windows Azure
Data engineering
Python
Amazon Web Services (AWS)
+2 more
  • Responsible for designing, storing, processing, and maintaining large-scale data and related infrastructure.
  • Can drive multiple projects from both operational and technical standpoints.
  • Ideate and build PoVs or PoCs for new products that can help drive more business.
  • Responsible for defining, designing, and implementing data engineering best practices, strategies, and solutions.
  • Is an Architect who can guide the customers, team, and overall organization on tools, technologies, and best practices around data engineering.
  • Lead architecture discussions, align with business needs, security, and best practices.
  • Has strong conceptual understanding of Data Warehousing and ETL, Data Governance and Security, Cloud Computing, and Batch & Real Time data processing
  • Has strong execution knowledge of Data Modeling, Databases in general (SQL and NoSQL), software development lifecycle and practices, unit testing, functional programming, etc.
  • Understanding of Medallion architecture pattern
  • Has worked on at least one cloud platform.
  • Has worked as a data architect and executed multiple end-to-end data engineering projects.
  • Has extensive knowledge of different data architecture designs and data modelling concepts.
  • Manages conversation with the client stakeholders to understand the requirement and translate it into technical outcomes.


Required Tech Stack

 

  • Strong proficiency in SQL
  • Experience working on any of the three major cloud platforms i.e., AWS/Azure/GCP
  • Working knowledge of ETL and/or orchestration tools such as IICS, Talend, Matillion, Airflow, Azure Data Factory, AWS Glue, GCP Composer, etc.
  • Working knowledge of one or more OLTP databases (Postgres, MySQL, SQL Server, etc.)
  • Working knowledge of one or more data warehouses such as Snowflake, Redshift, Azure Synapse, Hive, BigQuery, etc.
  • Proficient in at least one programming language used in data engineering, such as Python (or Scala/Rust/Java)
  • Has strong execution knowledge of Data Modeling (star schema, snowflake schema, fact vs dimension tables)
  • Proficient in Spark and related applications like Databricks, GCP DataProc, AWS Glue, EMR, etc.
  • Has worked on Kafka and real-time streaming.
  • Has strong execution knowledge of data architecture design patterns (lambda vs kappa architecture, data harmonization, customer data platforms, etc.)
  • Has worked on code and SQL query optimization.
  • Strong knowledge of version control systems like Git to manage source code repositories and designing CI/CD pipelines for continuous delivery.
  • Has worked on data and networking security (RBAC, secret management, key vaults, vnets, subnets, certificates)
Hyderabad
3 - 6 yrs
₹10L - ₹16L / yr
SQL
Spark
Analytical Skills
Hadoop
Communication Skills
+4 more

The Sr. Analytics Engineer would provide technical expertise in needs identification, data modeling, data movement, and transformation mapping (source to target), automation and testing strategies, translating business needs into technical solutions with adherence to established data guidelines and approaches from a business unit or project perspective.


Understands and leverages best-fit technologies (e.g., traditional star schema structures, cloud, Hadoop, NoSQL, etc.) and approaches to address business and environmental challenges.


Provides data understanding and coordinates data-related activities with other data management groups such as master data management, data governance, and metadata management.


Actively participates with other consultants in problem-solving and approach development.


Responsibilities :


Provide a consultative approach with business users, asking questions to understand the business need and deriving the data flow, conceptual, logical, and physical data models based on those needs.


Perform data analysis to validate data models and to confirm the ability to meet business needs.


Assist with and support setting the data architecture direction, ensuring data architecture deliverables are developed, ensuring compliance to standards and guidelines, implementing the data architecture, and supporting technical developers at a project or business unit level.


Coordinate and consult with the Data Architect, project manager, client business staff, client technical staff and project developers in data architecture best practices and anything else that is data related at the project or business unit levels.


Work closely with Business Analysts and Solution Architects to design the data model satisfying the business needs and adhering to Enterprise Architecture.


Coordinate with Data Architects, Program Managers and participate in recurring meetings.


Help and mentor team members to understand the data model and subject areas.


Ensure that the team adheres to best practices and guidelines.


Requirements :


- At least 3 years of strong working knowledge of Spark, Java/Scala/PySpark, Kafka, Git, Unix/Linux, and ETL pipeline design.


- Experience with Spark optimization/tuning/resource allocations


- Excellent understanding of in-memory distributed computing frameworks like Spark, including parameter tuning and writing optimized workflow sequences.


- Experience with relational databases (e.g., PostgreSQL, MySQL), cloud data warehouses (e.g., Redshift, BigQuery), and NoSQL databases (e.g., Cassandra).


- Familiarity with Docker, Kubernetes, Azure Data Lake/Blob storage, AWS S3, Google Cloud storage, etc.


- Have a deep understanding of the various stacks and components of the Big Data ecosystem.


- Hands-on experience with Python is a huge plus

TVARIT GmbH
Posted by Shivani Kawade
Remote, Pune
2 - 6 yrs
₹8L - ₹25L / yr
SQL Azure
databricks
Python
SQL
ETL
+9 more

TVARIT GmbH develops and delivers artificial intelligence (AI) solutions for the manufacturing, automotive, and process industries. With its software products, TVARIT enables its customers to make intelligent, well-founded decisions, e.g., in forward-looking maintenance, increasing OEE, and predictive quality. Renowned reference customers, competent technology, a strong research team from renowned universities, and a renowned AI prize (e.g., EU Horizon 2020) make TVARIT one of the most innovative AI companies in Germany and Europe.


We are looking for a self-motivated person with a positive "can-do" attitude and excellent oral and written communication skills in English.


We are seeking a skilled and motivated Senior Data Engineer from the manufacturing industry with over four years of experience to join our team. The Senior Data Engineer will oversee the department’s data infrastructure, including developing a data model, integrating large amounts of data from different systems, building and enhancing a data lakehouse and the subsequent analytics environment, and writing scripts to facilitate data analysis. The ideal candidate will have a strong foundation in ETL pipelines and Python; additional experience in Azure and Terraform is a plus. This role requires a proactive individual who can contribute to our data infrastructure and support our analytics and data science initiatives.


Skills Required:


  • Experience in the manufacturing industry (metal industry is a plus)
  • 4+ years of experience as a Data Engineer
  • Experience in data cleaning & structuring and data manipulation
  • Architect and optimize complex data pipelines, leading the design and implementation of scalable data infrastructure, and ensuring data quality and reliability at scale
  • ETL Pipelines: Proven experience in designing, building, and maintaining ETL pipelines.
  • Python: Strong proficiency in Python programming for data manipulation, transformation, and automation.
  • Experience in SQL and data structures
  • Knowledge in big data technologies such as Spark, Flink, Hadoop, Apache, and NoSQL databases.
  • Knowledge of cloud technologies (at least one) such as AWS, Azure, and Google Cloud Platform.
  • Proficient in data management and data governance
  • Strong analytical experience & skills that can extract actionable insights from raw data to help improve the business.
  • Strong analytical and problem-solving skills.
  • Excellent communication and teamwork abilities.


Nice To Have:

  • Azure: Experience with Azure data services (e.g., Azure Data Factory, Azure Databricks, Azure SQL Database).
  • Terraform: Knowledge of Terraform for infrastructure as code (IaC) to manage cloud.
  • Bachelor’s degree in computer science, Information Technology, Engineering, or a related field from top-tier Indian Institutes of Information Technology (IIITs).

Benefits And Perks:

  • A culture that fosters innovation, creativity, continuous learning, and resilience
  • Progressive leave policy promoting work-life balance
  • Mentorship opportunities with highly qualified internal resources and industry-driven programs
  • Multicultural peer groups and supportive workplace policies
  • Annual workcation program allowing you to work from various scenic locations
  • Experience the unique environment of a dynamic start-up


Why should you join TVARIT?


Working at TVARIT, a deep-tech German IT startup, offers a unique blend of innovation, collaboration, and growth opportunities. We seek individuals eager to adapt and thrive in a rapidly evolving environment.


If this opportunity excites you and aligns with your career aspirations, we encourage you to apply today!

Smartavya

Agency job
via Pluginlive by Harsha Saggi
Mumbai
10 - 18 yrs
₹35L - ₹40L / yr
Hadoop
Architecture
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
PySpark
+13 more
Architectural Leadership:

  • Design and architect robust, scalable, and high-performance Hadoop solutions.
  • Define and implement data architecture strategies, standards, and processes.
  • Collaborate with senior leadership to align data strategies with business goals.

Technical Expertise:

  • Develop and maintain complex data processing systems using Hadoop and its ecosystem (HDFS, YARN, MapReduce, Hive, HBase, Pig, etc.).
  • Ensure optimal performance and scalability of Hadoop clusters.
  • Oversee the integration of Hadoop solutions with existing data systems and third-party applications.

Strategic Planning:

  • Develop long-term plans for data architecture, considering emerging technologies and future trends.
  • Evaluate and recommend new technologies and tools to enhance the Hadoop ecosystem.
  • Lead the adoption of big data best practices and methodologies.

Team Leadership and Collaboration:

  • Mentor and guide data engineers and developers, fostering a culture of continuous improvement.
  • Work closely with data scientists, analysts, and other stakeholders to understand requirements and deliver high-quality solutions.
  • Ensure effective communication and collaboration across all teams involved in data projects.

Project Management:

  • Lead large-scale data projects from inception to completion, ensuring timely delivery and high quality.
  • Manage project resources, budgets, and timelines effectively.
  • Monitor project progress and address any issues or risks promptly.

Data Governance and Security:

  • Implement robust data governance policies and procedures to ensure data quality and compliance.
  • Ensure data security and privacy by implementing appropriate measures and controls.
  • Conduct regular audits and reviews of data systems to ensure compliance with industry standards and regulations.
Nielsen
Posted by Dheeraj Sidana
Gurugram, Bengaluru (Bangalore), Mumbai
1 - 20 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark


Nielsen, a global company specialising in audience measurement and analytics, is currently seeking a proficient leader in data engineering to join their team in Bangalore, Gurgaon, or Mumbai.


This is a manager-of-managers role that involves managing multiple scrum teams and overseeing an advanced data platform that analyses audience consumption patterns across various channels like OTT, TV, Radio, and Social Media worldwide. You will be responsible for building and supervising a top-performing data engineering team that delivers data for targeted campaigns. Moreover, you will work with AWS services (S3, Lambda, Kinesis) and other data engineering technologies such as Spark, Scala/Python, Kafka, etc. There may also be opportunities to establish deep integrations with OTT platforms like Netflix, Prime Video, and others.


Molecular Connections
Posted by Molecular Connections
Bengaluru (Bangalore)
2 - 5 yrs
₹13L - ₹16L / yr
Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
+4 more

Responsibilities:

·       Analyze complex data sets to answer specific questions using MMIT’s market access data (MMIT) and Norstella claims data, third-party claims data (IQVIA LAAD, Symphony SHA). Applicant must have experience working with the aforementioned data sets exclusively.

·       Deliver consultative services to clients related to MMIT RWD sets

·       Produce complex analytical reports using data visualization tools such as Power BI or Tableau

·       Define customized technical specifications to surface MMIT RWD in MMIT tools. 

·       Execute work in a timely fashion with high accuracy, while managing various competing priorities; Perform thorough troubleshooting and execute QA; Communicate with internal teams to obtain required data

·       Ensure adherence to documentation requirements, process workflows, timelines, and escalation protocols

·       And other duties as assigned.

 

Requirements:

·       Bachelor’s Degree or relevant experience required

·       2-5 yrs. of professional experience in RWD analytics using SQL

·       Fundamental understanding of Pharma and Market access space

·       Strong analysis skills and proficiency with tools such as Tableau or PowerBI

·       Excellent written and verbal communication skills.

·       Analytical, critical thinking and creative problem-solving skills.

·       Relationship building skills.

·       Solid organizational skills including attention to detail and multitasking skills.

·       Excellent time management and prioritization skills.

 

Molecular Connections
Posted by Molecular Connections
Bengaluru (Bangalore)
4 - 9 yrs
₹8L - ₹12L / yr
Data Warehouse (DWH)
Informatica
ETL
Spark
Hadoop
+5 more

Job Description: Data Engineer


Experience: Over 4 years


Responsibilities:

-       Design, develop, and maintain scalable data pipelines for efficient data extraction, transformation, and loading (ETL) processes.

-       Architect and implement data storage solutions, including data warehouses, data lakes, and data marts, aligned with business needs.

-       Implement robust data quality checks and data cleansing techniques to ensure data accuracy and consistency.

-       Optimize data pipelines for performance, scalability, and cost-effectiveness.

-       Collaborate with data analysts and data scientists to understand data requirements and translate them into technical solutions.

-       Develop and maintain data security measures to ensure data privacy and regulatory compliance.

-       Automate data processing tasks using scripting languages (Python, Bash) and big data frameworks (Spark, Hadoop).

-       Monitor data pipelines and infrastructure for performance and troubleshoot any issues.

-       Stay up to date with the latest trends and technologies in data engineering, including cloud platforms (AWS, Azure, GCP).

-        Document data pipelines, processes, and data models for maintainability and knowledge sharing.

-       Contribute to the overall data governance strategy and best practices.

 

Qualifications:

-       Strong understanding of data architectures, data modelling principles, and ETL processes.

-       Proficiency in SQL (e.g., MySQL, PostgreSQL) and experience with big data querying languages (e.g., Hive, Spark SQL).

-       Experience with scripting languages (Python, Bash) for data manipulation and automation.

-       Experience with distributed data processing frameworks (Spark, Hadoop) (preferred).

-       Familiarity with cloud platforms (AWS, Azure, GCP) for data storage and processing (a plus).

-       Experience with data quality tools and techniques.

-       Excellent problem-solving, analytical, and critical thinking skills.

-       Strong communication, collaboration, and teamwork abilities.

Scremer
Posted by Sathish Dhawan
Pune, Mumbai
6 - 11 yrs
₹15L - ₹15L / yr
Amazon Web Services (AWS)
Python
Java
Spark


Primary Skills

DynamoDB, Java, Kafka, Spark, Amazon Redshift, AWS Lake Formation, AWS Glue, Python


Skills:

Good work experience showing growth as a Data Engineer.

Hands-on programming experience

Implementation Experience on Kafka, Kinesis, Spark, AWS Glue, AWS Lake Formation.

Excellent knowledge of Python, Scala/Java, Spark, AWS (Lambda, Step Functions, DynamoDB, EMR), Terraform, UI (Angular), Git, and Maven

Experience of performance optimization in Batch and Real time processing applications

Expertise in Data Governance and Data Security Implementation

Good hands-on design and programming skills for building reusable tools and products. Experience developing in AWS or similar cloud platforms. Preferred: ECS, EKS, S3, EMR, DynamoDB, Aurora, Redshift, QuickSight, or similar.

Familiarity with systems with very high volume of transactions, micro service design, or data processing pipelines (Spark).

Knowledge of and hands-on experience with serverless technologies such as Lambda, MSK, MWAA, and Kinesis Analytics is a plus.

Expertise in practices like Agile, Peer reviews, Continuous Integration


Roles and responsibilities:

Determining project requirements and developing work schedules for the team.

Delegating tasks and achieving daily, weekly, and monthly goals.

Responsible for designing, building, testing, and deploying the software releases.


Salary: 25LPA-40LPA

Sadup Softech
Posted by madhuri g
Bengaluru (Bangalore)
3 - 6 yrs
₹12L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Must have skills


3 to 6 years

Data Science

SQL, Excel, BigQuery - 3+ years (mandatory)

Python/ML, Hadoop, Spark - 2+ years


Requirements


• 3+ years prior experience as a data analyst

• Detail oriented, structural thinking and analytical mindset.

• Proven analytic skills, including data analysis and data validation.

• Technical writing experience in relevant areas, including queries, reports, and presentations.

• Strong SQL and Excel skills with the ability to learn other analytic tools

• Good communication skills (being precise and clear)

• Good to have prior knowledge of Python and ML algorithms

Wissen Technology
Posted by Tony Tom
Pune
6 - 12 yrs
₹2L - ₹30L / yr
Python
AWS
Spark

Location: Pune

Required Skills : Scala, Python, Data Engineering, AWS, Cassandra/AstraDB, Athena, EMR, Spark/Snowflake


Sadup Softech
Posted by madhuri g
Remote only
4 - 6 yrs
₹4L - ₹15L / yr
Google Cloud Platform (GCP)
big query
PySpark
Data engineering
Big Data
+2 more

Job Description:

We are seeking a talented Machine Learning Engineer with expertise in software engineering to join our team. As a Machine Learning Engineer, your primary responsibility will be to develop machine learning (ML) solutions that focus on technology process improvements. Specifically, you will work on projects involving ML and Generative AI solutions for technology and data management efficiencies, such as optimal cloud computing, knowledge bots, software code assistants, and automatic data management.

 

Responsibilities:

- Collaborate with cross-functional teams to identify opportunities for technology process improvements that can be solved using machine learning and generative AI.

- Define and build innovative ML and Generative AI systems, such as AI assistants for varied SDLC tasks, and improve data and infrastructure management.

- Design and develop ML engineering solutions and generative AI applications, and fine-tune Large Language Models (LLMs) for the above, ensuring the scalability, efficiency, and maintainability of such solutions.

- Implement prompt engineering techniques to fine-tune and enhance LLMs for better performance and application-specific needs.

- Stay abreast of the latest advancements in the field of Generative AI and actively contribute to the research and development of new ML & Generative AI Solutions.

 

Requirements:

- A Master's or Ph.D. degree in Computer Science, Statistics, Data Science, or a related field.

- Proven experience working as a Software Engineer, with a focus on ML Engineering and exposure to Generative AI applications such as ChatGPT.

- Strong proficiency in programming languages such as Java, Scala, and Python, and in platforms such as Google Cloud, BigQuery, Hadoop, and Spark.

- Solid knowledge of software engineering best practices, including version control systems (e.g., Git), code reviews, and testing methodologies.

- Familiarity with large language models (LLMs), prompt engineering techniques, vector DBs, embeddings, and various fine-tuning techniques.

- Strong communication skills to effectively collaborate and present findings to both technical and non-technical stakeholders.

- Proven ability to adapt and learn new technologies and frameworks quickly.

- A proactive mindset with a passion for continuous learning and research in the field of Generative AI.

 

If you are a skilled and innovative Data Scientist with a passion for Generative AI, and have a desire to contribute to technology process improvements, we would love to hear from you. Join our team and help shape the future of our AI Driven Technology Solutions.

Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and you will independently drive design discussions to ensure the overall health of the solution.

Job Summary:

As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and you will independently drive design discussions to ensure the overall health of the solution.

The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration and wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is also required.


Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 5+ years of IT experience with 3+ years in data-related technologies

2. Minimum 2.5 years of experience in Big Data technologies and working exposure to related data services on at least one cloud platform (AWS / Azure / GCP)

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines

4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery, etc.

6. Well-versed in, and working knowledge of, data platform related services on at least one cloud platform, IAM, and data security


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of, and hands-on experience with, traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres)

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Cloud data specialty and other related Big Data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


Wissen Technology

at Wissen Technology

4 recruiters
Gloria Dsouza
Posted by Gloria Dsouza
Bengaluru (Bangalore)
5 - 12 yrs
₹15L - ₹15L / yr
Snow flake schema
SQL
Python
Spark
Data Warehouse (DWH)
  • As a data engineer, you will build systems that collect, manage, and convert raw data into usable information for data scientists and business analysts to interpret. Your ultimate goal is to make data accessible so that organizations can optimize their performance.
  • Work closely with PMs and business analysts to build and improve data pipelines, and to identify and model business objects
  • Write scripts implementing data transformations, data structures, and metadata to bring structure to partially unstructured data and improve data quality
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL
  • Own data pipelines - Monitoring, testing, validating and ensuring meaningful data exists in data warehouse with high level of data quality 
  • What we look for in the candidate is strong analytical skills with the ability to collect, organise, analyse, and disseminate significant amounts of information with attention to detail and accuracy 
  • Create long term and short-term design solutions through collaboration with colleagues
  • Proactive to experiment with new tools
  • Strong programming skill in python
  • Skillset: Python, SQL, ETL frameworks, PySpark and Snowflake 
  • Strong communication and interpersonal skills to interact with senior-level management regarding the implementation of changes 
  • Willingness to learn and eagerness to contribute to projects 
  • Designing data warehouses and the most appropriate DB schema for the data product
  • Positive attitude and proactive problem-solving mindset
  • Experience in building data pipelines and connectors
  • Knowledge on AWS cloud services would be preferred


Mumbai
5 - 10 yrs
₹8L - ₹20L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm
+6 more


Data Scientist – Delivery & New Frontiers Manager 

Job Description:   

We are seeking a highly skilled and motivated data scientist to join our Data Science team. The successful candidate will play a pivotal role in our data-driven initiatives and be responsible for designing, developing, and deploying data science solutions that drive business value for stakeholders. This role involves mapping business problems to formal data science solutions, working with a wide range of structured and unstructured data, designing architecture, creating sophisticated models, setting up operations for the data science product with support from the MLOps team, and facilitating business workshops. In a nutshell, this person will represent data science and provide expertise across the full project cycle. Expectations of the successful candidate will be above those of a typical data scientist. Beyond technical expertise, problem solving in complex set-ups will be key to success in this role.

Responsibilities: 

  • Collaborate with cross-functional teams, including software engineers, product managers, and business stakeholders, to understand business needs and identify data science opportunities. 
  • Map complex business problems to data science problems and design data science solutions using the GCP/Azure Databricks platform.
  • Collect, clean, and preprocess large datasets from various internal and external sources.
  • Streamline the data science process by working with the Data Engineering and Technology teams.
  • Manage multiple analytics projects within a Function to deliver end-to-end data science solutions, create insights, and identify patterns.
  • Develop and maintain data pipelines and infrastructure to support the data science projects 
  • Communicate findings and recommendations to stakeholders through data visualizations and presentations. 
  • Stay up to date with the latest data science trends and technologies, specifically for GCP companies 

 

Education / Certifications:  

Bachelor’s or Master’s in Computer Science, Engineering, Computational Statistics, Mathematics. 

Job specific requirements:  

  • Brings 5+ years of deep data science experience
  • Strong knowledge of machine learning and statistical modeling techniques in a cloud-based environment such as GCP, Azure, or Amazon
  • Experience with programming languages such as Python, R, Spark 
  • Experience with data visualization tools such as Tableau, Power BI, and D3.js 
  • Strong understanding of data structures, algorithms, and software design principles 
  • Experience with GCP platforms and services such as Big Query, Cloud ML Engine, and Cloud Storage 
  • Experience in configuring and setting up the version control on Code, Data, and Machine Learning Models using GitHub. 
  • Self-driven, be able to work with cross-functional teams in a fast-paced environment, adaptability to the changing business needs. 
  • Strong analytical and problem-solving skills 
  • Excellent verbal and written communication skills 
  • Working knowledge with application architecture, data security and compliance team. 


one-to-one, one-to-many, and many-to-many

Agency job
via The Hub by Sridevi Viswanathan
Chennai
5 - 9 yrs
₹1L - ₹15L / yr
PowerBI
Python
Spark
Data Analytics
data brick

Position Overview: We are seeking a talented Data Engineer with expertise in Power BI to join our team. The ideal candidate will be responsible for designing and implementing data pipelines, as well as developing insightful visualizations and reports using Power BI. Additionally, the candidate should have strong skills in Python, data analytics, PySpark, and Databricks. This role requires a blend of technical expertise, analytical thinking, and effective communication skills.

Key Responsibilities:

  1. Design, develop, and maintain data pipelines and architectures using PySpark and Databricks.
  2. Implement ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
  3. Collaborate with data analysts and business stakeholders to understand data requirements and translate them into actionable insights.
  4. Develop interactive dashboards, reports, and visualizations using Power BI to communicate key metrics and trends.
  5. Optimize and tune data pipelines for performance, scalability, and reliability.
  6. Monitor and troubleshoot data infrastructure to ensure data quality, integrity, and availability.
  7. Implement security measures and best practices to protect sensitive data.
  8. Stay updated with emerging technologies and best practices in data engineering and data visualization.
  9. Document processes, workflows, and configurations to maintain a comprehensive knowledge base.

Requirements:

  1. Bachelor’s degree in Computer Science, Engineering, or related field. (Master’s degree preferred)
  2. Proven experience as a Data Engineer with expertise in Power BI, Python, PySpark, and Databricks.
  3. Strong proficiency in Power BI, including data modeling, DAX calculations, and creating interactive reports and dashboards.
  4. Solid understanding of data analytics concepts and techniques.
  5. Experience working with Big Data technologies such as Hadoop, Spark, or Kafka.
  6. Proficiency in programming languages such as Python and SQL.
  7. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
  8. Excellent analytical and problem-solving skills with attention to detail.
  9. Strong communication and collaboration skills to work effectively with cross-functional teams.
  10. Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.

Preferred Qualifications:

  • Advanced degree in Computer Science, Engineering, or related field.
  • Certifications in Power BI or related technologies.
  • Experience with data visualization tools other than Power BI (e.g., Tableau, QlikView).
  • Knowledge of machine learning concepts and frameworks.


Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and you will independently drive design discussions to ensure the overall health of the solution.

Job Summary:

As Senior Associate L1 in Data Engineering, you will produce technical designs and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and you will independently drive design discussions to ensure the overall health of the solution.

The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration and wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is preferred.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 3.5+ years of IT experience with 1.5+ years in data-related technologies

2. Minimum 1.5 years of experience in Big Data technologies

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery, etc.


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of, and hands-on experience with, traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres)

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Working knowledge of data platform related services on at least one cloud platform, IAM, and data security

7. Cloud data specialty and other related Big Data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Arting Digital
Pragati Bhardwaj
Posted by Pragati Bhardwaj
Bengaluru (Bangalore)
4 - 6 yrs
₹10L - ₹15L / yr
databricks
Python
Spark
SQL
AWS Lambda

Title:- Senior Data Engineer 


Experience: 4-6 yrs

Budget: 24-28 lpa

Location: Bangalore 

Mode of Work: Work from office

Primary Skills: Databricks, Spark, PySpark, SQL, Python, AWS

Qualification: Any Engineering degree


Responsibilities:

∙ Design and build reusable components, frameworks and libraries at scale to support analytics products.

∙ Design and implement product features in collaboration with business and Technology stakeholders.

∙ Anticipate, identify and solve issues concerning data management to improve data quality.

∙ Clean, prepare and optimize data at scale for ingestion and consumption.

∙ Drive the implementation of new data management projects and re-structure of the current data architecture.

∙ Implement complex automated workflows and routines using workflow scheduling tools.

∙ Build continuous integration, test-driven development and production deployment frameworks.

∙ Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards.

∙ Analyze and profile data for the purpose of designing scalable solutions.

∙ Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues.

∙ Mentor and develop other data engineers in adopting best practices.

 

Qualifications:

 

Primary skillset:


∙ Experience working with distributed technology tools for developing Batch and Streaming pipelines using


  o SQL, Spark, Python, PySpark [4+ years],

  o Airflow [3+ years],

  o Scala [2+ years].


∙ Able to write code which is optimized for performance.

∙ Experience in Cloud platform, e.g., AWS, GCP, Azure, etc.

∙ Able to quickly pick up new programming languages, technologies, and frameworks.

∙ Strong skills building positive relationships across Product and Engineering.

∙ Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.

∙ Experience with creating/configuring Jenkins pipelines for a smooth CI/CD process for Managed Spark jobs, building Docker images, etc.

∙ Working knowledge of Data warehousing, Data modelling, Governance and Data Architecture

 

Good to have:


∙ Experience working with Data platforms, including EMR, Airflow, Databricks (Data Engineering & Delta Lake components, and Lakehouse Medallion architecture), etc.

∙ Experience working in Agile and Scrum development process.

∙ Experience in EMR/EC2, Databricks etc.

∙ Experience working with Data warehousing tools, including SQL database, Presto, and Snowflake

∙ Experience architecting data product in Streaming, Serverless and Microservices Architecture and platform.

xyz

Agency job
via HR BIZ HUB by Pooja shankla
Bengaluru (Bangalore)
4 - 6 yrs
₹12L - ₹15L / yr
Java
Big Data
Apache Hive
Hadoop
Spark

Job Title: Big Data Developer

Job Description

Bachelor's degree in Engineering or Computer Science or equivalent OR Master's in Computer Applications or equivalent.

Solid software development experience, including leading teams of engineers and scrum teams.

4+ years of hands-on experience working with MapReduce, Hive, and Spark (core, SQL and PySpark).

Solid data warehousing concepts.

Knowledge of the financial reporting ecosystem will be a plus.

4+ years of experience in Data Engineering/Data Warehousing using Big Data technologies will be an added advantage.

Expert on Distributed ecosystem.

Hands-on experience with programming using Core Java or Python/Scala

Expert on Hadoop and Spark Architecture and its working principle

Hands-on experience writing and understanding complex SQL (Hive/PySpark DataFrames) and optimizing joins while processing large amounts of data.

Experience in UNIX shell scripting.

Roles & Responsibilities

Ability to design and develop optimized Data pipelines for batch and real time data processing

Should have experience in analysis, design, development, testing, and implementation of system applications

Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.

Excellent technical and analytical aptitude

Good communication skills.

Excellent Project management skills.

Results driven Approach.

Mandatory Skills: Big Data, PySpark, Hive

A LEADING US BASED MNC

Agency job
via Zeal Consultants by Zeal Consultants
Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
"DATA STREAMING"

Data Engineering : Senior Engineer / Manager


As Senior Engineer/Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions, and you will independently drive design discussions to ensure the overall health of the solution.


Must Have skills :


1. GCP


2. Spark streaming : Live data streaming experience is desired.


3. Any one coding language: Java/Python/Scala



Skills & Experience :


- Overall experience of MINIMUM 5+ years with Minimum 4 years of relevant experience in Big Data technologies


- Hands-on experience with the Hadoop stack - HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.


- Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred


- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery, etc.


- Well-versed and working knowledge with data platform related services on GCP


- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position


Your Impact :


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation

 is a software product company that provides

Agency job
via Dangi Digital Media LLP by jaibir dangi
Hyderabad
6 - 15 yrs
₹11L - ₹15L / yr
Python
Spark
SQL Azure
Apache Kafka
MongoDB
+4 more

  • 5+ years of experience designing, developing, validating, and automating ETL processes
  • 3+ years of experience with traditional ETL tools such as Visual Studio, SQL Server Management Studio, SSIS, SSAS and SSRS
  • 2+ years of experience with cloud technologies and platforms, such as Kubernetes, Spark, Kafka, Azure Data Factory, Snowflake, MLflow, Databricks, Airflow or similar
  • Must have experience with designing and implementing data access layers
  • Must be an expert with SQL/T-SQL and Python
  • Must have experience in Kafka
  • Define and implement data models with various database technologies like MongoDB, CosmosDB, Neo4j, MariaDB and SQL Server
  • Ingest and publish data from sources and to destinations via an API
  • Exposure to ETL/ELT using Kafka or Azure Event Hubs with Spark or Databricks is a plus
  • Exposure to healthcare technologies and integrations for FHIR API, HL7 or other HIE protocols is a plus


Skills Required :


Designing, Developing, ETL, Visual Studio, Python, Spark, Kubernetes, Kafka, Azure Data Factory, SQL Server, Airflow, Databricks, T-SQL, MongoDB, CosmosDB, Snowflake, SSIS, SSAS, SSRS, FHIR API, HL7, HIE Protocols

NutaNXT Technologies

at NutaNXT Technologies

1 recruiter
Jidnyasa S
Posted by Jidnyasa S
Pune
6 - 9 yrs
₹15L - ₹28L / yr
Spark
Scala
Databricks
NOSQL Databases

DATA ENGINEERING CONSULTANT


About NutaNXT: NutaNXT is a next-gen Software Product Engineering services provider building ground-breaking products using AI/ML, Data Analytics, IoT, Cloud and new emerging technologies disrupting the global markets. Our mission is to help clients leverage our specialized Digital Product Engineering capabilities in Data Engineering, AI Automations, and Software Full stack solutions and services to build best-in-class products and stay ahead of the curve. You will get a chance to work on multiple projects critical to NutaNXT needs, with opportunities to learn, develop new skills, and switch teams and projects as you and our fast-paced business grow and evolve.

Location: Pune

Experience: 6 to 8 years


Job Description: NutaNXT is looking for a consultant to support the planning and implementation of data design services, provide sizing and configuration assistance, and perform needs assessments. The role includes delivering architectures for transformations and modernizations of enterprise data solutions using Azure cloud data technologies. As a Data Engineering Consultant, you will collect, aggregate, store, and reconcile data in support of the Customer's business decisions. You will design and build data pipelines, data streams, data service APIs, data generators and other end-user information portals and insight tools.


Mandatory Skills: -


  1. Demonstrable experience in enterprise level data platforms involving implementation of end-to-end data pipelines with Python or Scala
  2. Hands-on experience with at least one of the leading public cloud data platforms (ideally Azure)
  3. Experience with different databases (like column-oriented database, NoSQL database, RDBMS)
  4. Experience in architecting data pipelines and solutions for both streaming and batch integrations using tools/frameworks like Azure Databricks, Azure Data Factory, Spark, Spark Streaming, etc.
  5. Understanding of data modeling, warehouse design and fact/dimension concepts
  6. Good communication


Good To Have:


• Certifications for any of the cloud services (ideally Azure)
• Experience working with code repositories and continuous integration
• Understanding of development and project methodologies


Why Join Us?


We offer innovative work in the AI and Data Engineering space, with a unique, diverse workplace environment and continuous learning and development opportunities. These are just some of the reasons we are consistently recognized as one of the best companies to work for, and why our people choose to grow their careers at NutaNXT. We also offer a highly flexible, self-driven, remote work culture that fosters innovation, creativity and work-life balance, along with industry-leading compensation. We believe this helps us consistently deliver to our clients and grow in the highly competitive, fast-evolving Digital Engineering space, with a strong focus on building advanced software products for clients in the US, Europe and APAC regions.

Compile

at Compile

16 recruiters
Sarumathi NH
Posted by Sarumathi NH
Bengaluru (Bangalore)
7 - 10 yrs
Best in industry
Data Warehouse (DWH)
Informatica
ETL
Spark

You will be responsible for designing, building, and maintaining data pipelines that handle Real-world data at Compile. You will be handling both inbound and outbound data deliveries at Compile for datasets including Claims, Remittances, EHR, SDOH, etc.

You will

  • Work on building and maintaining data pipelines (specifically RWD).
  • Build, enhance and maintain existing pipelines in PySpark and Python, and help build analytical insights and datasets.
  • Schedule and maintain pipeline jobs for RWD.
  • Develop, test, and implement data solutions based on the design.
  • Design and implement quality checks on existing and new data pipelines.
  • Ensure adherence to security and compliance that is required for the products.
  • Maintain relationships with various data vendors and track changes and issues across vendors and deliveries.

You have

  • Hands-on experience with ETL process (min of 5 years).
  • Excellent communication skills and ability to work with multiple vendors.
  • High proficiency with Spark, SQL.
  • Proficiency in Data modeling, validation, quality check, and data engineering concepts.
  • Experience working with big-data processing technologies such as Databricks, dbt, S3, Delta Lake, Deequ, Griffin, Snowflake, and BigQuery.
  • Familiarity with version control technologies, and CI/CD systems.
  • Understanding of scheduling tools like Airflow/Prefect.
  • Min of 3 years of experience managing data warehouses.
  • Familiarity with healthcare datasets is a plus.

Compile embraces diversity and equal opportunity in a serious way. We are committed to building a team of people from many backgrounds, perspectives, and skills. We know the more inclusive we are, the better our work will be.         

Molecular Connections

at Molecular Connections

4 recruiters
Molecular Connections
Posted by Molecular Connections
Bengaluru (Bangalore)
8 - 10 yrs
₹15L - ₹20L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more
  1. Big data developer with 8+ years of professional IT experience with expertise in Hadoop ecosystem components in ingestion, Data modeling, querying, processing, storage, analysis, Data Integration and Implementing enterprise level systems spanning Big Data.
  2. A skilled developer with strong problem solving, debugging and analytical capabilities, who actively engages in understanding customer requirements.
  3. Expertise in Apache Hadoop ecosystem components like Spark, Hadoop Distributed File System (HDFS), MapReduce, Hive, Sqoop, HBase, ZooKeeper, YARN, Flume, Pig, NiFi, Scala and Oozie.
  4. Hands-on experience in creating real-time data streaming solutions using Apache Spark Core, Spark SQL & DataFrames, Kafka, Spark Streaming and Apache Storm.
  5. Excellent knowledge of Hadoop architecture and the daemons of Hadoop clusters, which include NameNode, DataNode, ResourceManager, NodeManager and JobHistory Server.
  6. Worked on both Cloudera and Hortonworks Hadoop distributions. Experience in managing Hadoop clusters using the Cloudera Manager tool.
  7. Well versed in installation, Configuration, Managing of Big Data and underlying infrastructure of Hadoop Cluster.
  8. Hands on experience in coding MapReduce/Yarn Programs using Java, Scala and Python for analyzing Big Data.
  9. Exposure to Cloudera development environment and management using Cloudera Manager.
  10. Worked extensively with Spark using Scala on a cluster for computational analytics; installed it on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL/Oracle.
  11. Implemented Spark jobs in Python, using DataFrames and the Spark SQL API for faster data processing; imported data from different sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the results into HDFS.
  12. Used Spark Data Frames API over Cloudera platform to perform analytics on Hive data.
  13. Hands-on experience with Spark MLlib, used for predictive intelligence, customer segmentation and smooth maintenance in Spark Streaming.
  14. Experience in using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
  15. Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  16. Created data pipelines for ingestion and aggregation events, loading consumer response data into Hive external tables in HDFS to serve as a feed for Tableau dashboards.
  17. Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
  18. In-depth Understanding of Oozie to schedule all Hive/Sqoop/HBase jobs.
  19. Hands on expertise in real time analytics with Apache Spark.
  20. Experience in converting Hive/SQL queries into RDD transformations using Apache Spark, Scala and Python.
  21. Extensive experience in working with different ETL tool environments like SSIS, Informatica and reporting tool environments like SQL Server Reporting Services (SSRS).
  22. Experience in Microsoft cloud and setting cluster in Amazon EC2 & S3 including the automation of setting & extending the clusters in AWS Amazon cloud.
  23. Worked extensively with Spark using Python on a cluster for computational analytics; installed it on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL.
  24. Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka and Flume.
  25. Knowledge in installation, configuration, supporting and managing Hadoop Clusters using Apache, Cloudera (CDH3, CDH4) distributions and on Amazon web services (AWS).
  26. Experienced in writing Ad Hoc queries using Cloudera Impala, also used Impala analytical functions.
  27. Experience in creating Data frames using PySpark and performing operation on the Data frames using Python.
  28. In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS and MapReduce Programming Paradigm, High Availability and YARN architecture.
  29. Establishing multiple connections to different Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provide the access for pulling the information we need for analysis. 
  30. Generated various kinds of knowledge reports using Power BI based on Business specification. 
  31. Developed interactive Tableau dashboards to provide a clear understanding of industry specific KPIs using quick filters and parameters to handle them more efficiently.
  32. Well experienced in projects using JIRA, testing, Maven and Jenkins build tools.
  33. Experienced in designing, building, deploying and using almost all of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
  34. Good experience with use-case development, with Software methodologies like Agile and Waterfall.
  35. Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
  36. Good working experience in importing data using Sqoop and SFTP from various sources like RDBMS, Teradata, Mainframes, Oracle and Netezza to HDFS, and performing transformations on it using Hive, Pig and Spark.
  37. Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems and generating data visualizations using Python and R.
  38. Proficient in NoSQL databases including HBase, Cassandra and MongoDB, and their integration with Hadoop clusters.
  39. Hands on experience in Hadoop Big data technology working on MapReduce, Pig, Hive as Analysis tool, Sqoop and Flume data import/export tools.
A Product Based Client,Chennai

Agency job
via SangatHR by Anna Poorni
Chennai
4 - 8 yrs
₹10L - ₹15L / yr
Data Warehouse (DWH)
Informatica
ETL
Spark
PySpark
+2 more

Analytics Job Description

We are hiring an Analytics Engineer to help drive our Business Intelligence efforts. You will partner closely with leaders across the organization, working together to understand the how and why of people, team and company challenges, workflows and culture. The team is responsible for delivering data and insights that drive decision-making, execution, and investments for our product initiatives.

You will work cross-functionally with product, marketing, sales, engineering, finance, and our customer-facing teams enabling them with data and narratives about the customer journey. You’ll also work closely with other data teams, such as data engineering and product analytics, to ensure we are creating a strong data culture at Blend that enables our cross-functional partners to be more data-informed.


Role: Data Engineer

Please find below the JD for the Data Engineer role.

Location: Guindy, Chennai

How you’ll contribute:

• Develop objectives and metrics, ensure priorities are data-driven, and balance short-term and long-term goals

• Develop deep analytical insights to inform and influence product roadmaps and business decisions and help improve the consumer experience

• Work closely with GTM and supporting operations teams to author and develop core data sets that empower analyses

• Deeply understand the business and proactively spot risks and opportunities

• Develop dashboards and define metrics that drive key business decisions

• Build and maintain scalable ETL pipelines via solutions such as Fivetran, Hightouch, and Workato

• Design our Analytics and Business Intelligence architecture, assessing and implementing new technologies that fit

• Work with our engineering teams to continually make our data pipelines and tooling more resilient


Who you are:

• Bachelor’s degree or equivalent required from an accredited institution with a quantitative focus such as Economics, Operations Research, Statistics, Computer Science OR 1-3 Years of Experience as a Data Analyst, Data Engineer, Data Scientist

• Must have strong SQL and data modeling skills, with experience applying skills to thoughtfully create data models in a warehouse environment.

• A proven track record of using analysis to drive key decisions and influence change

• Strong storyteller and ability to communicate effectively with managers and executives

• Demonstrated ability to define metrics for product areas, understand the right questions to ask and push back on stakeholders in the face of ambiguous, complex problems, and work with diverse teams with different goals

• A passion for documentation.

• A solution-oriented growth mindset. You’ll need to be a self-starter and thrive in a dynamic environment.

• A bias towards communication and collaboration with business and technical stakeholders.

• Quantitative rigor and systems thinking.

• Prior startup experience is preferred, but not required.

• Interest or experience in machine learning techniques (such as clustering, decision tree, and segmentation)

• Familiarity with a scientific computing language, such as Python, for data wrangling and statistical analysis

• Experience with a SQL focused data transformation framework such as dbt

• Experience with a Business Intelligence Tool such as Mode/Tableau


Mandatory Skillset:


-Very Strong in SQL

-Spark OR pyspark OR Python

-Shell Scripting


Epik Solutions
Sakshi Sarraf
Posted by Sakshi Sarraf
Bengaluru (Bangalore), Noida
5 - 10 yrs
₹7L - ₹28L / yr
Python
SQL
databricks
Scala
Spark
+2 more

Job Description:


As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:


Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.


Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.


Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.


Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.


Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.


Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.


Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.


Skills and Qualifications:


Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.

Proficiency in designing and developing data pipelines and ETL processes.

Solid understanding of data modeling concepts and database design principles.

Familiarity with data integration and orchestration using Azure Data Factory.

Knowledge of data quality management and data governance practices.

Experience with performance tuning and optimization of data pipelines.

Strong problem-solving and troubleshooting skills related to data engineering.

Excellent collaboration and communication skills to work effectively in cross-functional teams.

Understanding of cloud computing principles and experience with Azure services.

Bengaluru (Bangalore)
5 - 9 yrs
₹10L - ₹18L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm
+10 more

Requirements

Experience

  • 5+ years of professional experience in implementing MLOps framework to scale up ML in production.
  • Hands-on experience with Kubernetes, Kubeflow, MLflow, SageMaker, and other ML model experiment management tools including training, inference, and evaluation.
  • Experience in ML model serving (TorchServe, TensorFlow Serving, NVIDIA Triton Inference Server, etc.)
  • Proficiency with ML model training frameworks (PyTorch, PyTorch Lightning, TensorFlow, etc.).
  • Experience with GPU computing to do data and model training parallelism.
  • Solid software engineering skills in developing systems for production.
  • Strong expertise in Python.
  • Building end-to-end data systems as an ML Engineer, Platform Engineer, or equivalent.
  • Experience working with cloud data processing technologies (S3, ECR, Lambda, AWS, Spark, Dask, ElasticSearch, Presto, SQL, etc.).
  • Having Geospatial / Remote sensing experience is a plus.
BlueYonder
Bengaluru (Bangalore), Hyderabad
10 - 14 yrs
Best in industry
Java
J2EE
Spring Boot
Hibernate (Java)
Gradle
+13 more

·      Core responsibilities include analyzing business requirements and designs for accuracy and completeness, and developing and maintaining the relevant product.

·      BlueYonder is seeking a Senior/Principal Architect in the Data Services department (under Luminate Platform) to act as one of the key technology leaders to build and manage BlueYonder’s technology assets in the Data Platform and Services.

·      This individual will act as a trusted technical advisor and strategic thought leader to the Data Services department. The successful candidate will have the opportunity to lead, participate, guide, and mentor other people in the team on architecture and design in a hands-on manner. You are responsible for technical direction of Data Platform. This position reports to the Global Head, Data Services and will be based in Bangalore, India.

·      Core responsibilities to include Architecting and designing (along with counterparts and distinguished Architects) a ground up cloud native (we use Azure) SaaS product in Order management and micro-fulfillment

·      The team currently comprises 60+ global associates across the US, India (COE) and the UK and is expected to grow rapidly. The incumbent will also need leadership qualities to mentor junior and mid-level software associates in our team. This person will lead the Data Platform architecture – Streaming, Bulk with Snowflake/Elasticsearch/other tools

Our current technical environment:

·      Software: Java, Spring Boot, Gradle, Git, Hibernate, REST API, OAuth, Snowflake

·      Application Architecture: Scalable, resilient, event-driven, secure multi-tenant microservices architecture

·      Cloud Architecture: MS Azure (ARM templates, AKS, HDInsight, Application Gateway, Virtual Networks, Event Hub, Azure AD)

·      Frameworks/Others: Kubernetes, Kafka, Elasticsearch, Spark, NoSQL, RDBMS, Spring Boot, Gradle, Git, Ignite

Exponentia.ai

Vipul Tiwari
Posted by Vipul Tiwari
Mumbai
4 - 6 yrs
₹12L - ₹19L / yr
ETL
Informatica
Data Warehouse (DWH)
databricks
Amazon Web Services (AWS)
+6 more

Job Description

Position: Sr Data Engineer – Databricks & AWS

Experience: 4 - 5 Years

 

Company Profile:


Exponentia.ai is an AI tech organization with a presence across India, Singapore, the Middle East, and the UK. We are an innovative and disruptive organization, working on cutting-edge technology to help our clients transform into the enterprises of the future. We provide artificial intelligence-based products/platforms capable of automated cognitive decision-making to improve productivity, quality, and economics of the underlying business processes. Currently, we are transforming ourselves and rapidly expanding our business.

Exponentia.ai has developed long-term relationships with world-class clients such as PayPal, PayU, SBI Group, HDFC Life, Kotak Securities, Wockhardt and Adani Group amongst others.

One of the top partners of Cloudera (leading analytics player) and Qlik (leader in BI technologies), Exponentia.ai has recently been awarded the ‘Innovation Partner Award’ by Qlik in 2017.

Get to know more about us on our website: http://www.exponentia.ai/ and Life @Exponentia.

 

​Role Overview: 


·         A Data Engineer understands the client requirements and develops and delivers the data engineering solutions as per the scope.

·         The role requires good skills in the development of solutions using various services required for data architecture on Databricks Delta Lake, streaming, AWS, ETL Development, and data modeling.

 

Job Responsibilities


•         Design of data solutions on Databricks including delta lake, data warehouse, data marts and other data solutions to support the analytics needs of the organization.

•         Apply best practices during design in data modeling (logical, physical) and ETL pipelines (streaming and batch) using cloud-based services.

•         Design, develop and manage the pipelining (collection, storage, access), data engineering (data quality, ETL, Data Modelling) and understanding (documentation, exploration) of the data.

•         Interact with stakeholders regarding data landscape understanding, conducting discovery exercises, developing proof of concepts and demonstrating it to stakeholders.

 

Technical Skills 


•         Has more than 2 years of experience in developing data lakes and data marts on the Databricks platform.

•         Proven skill sets in AWS Data Lake services such as - AWS Glue, S3, Lambda, SNS, IAM, and skills in Spark, Python, and SQL.

•         Experience in Pentaho

•         Good understanding of developing a data warehouse, data marts etc.

•         Has a good understanding of system architectures, and design patterns and should be able to design and develop applications using these principles.

 

Personality Traits


•         Good collaboration and communication skills

•         Excellent problem-solving skills to be able to structure the right analytical solutions.

•         Strong sense of teamwork, ownership, and accountability

•         Analytical and conceptual thinking 

•         Ability to work in a fast-paced environment with tight schedules.

•         Good presentation skills with the ability to convey complex ideas to peers and management.

 

Education:

 

BE / ME / MS/MCA.

    



iLink Systems

Ganesh Sooriyamoorthu
Posted by Ganesh Sooriyamoorthu
Chennai, Pune, Noida, Bengaluru (Bangalore)
5 - 15 yrs
₹10L - ₹15L / yr
Apache Kafka
Big Data
Java
Spark
Hadoop
+1 more
  • KSQL
  • Data Engineering spectrum (Java/Spark)
  • Spark Scala / Kafka Streaming
  • Confluent Kafka components
  • Basic understanding of Hadoop


Kloud9 Technologies
Bengaluru (Bangalore)
3 - 6 yrs
₹5L - ₹20L / yr
Amazon Web Services (AWS)
Amazon EMR
EMR
Spark
PySpark
+9 more

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to the retail industry, giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not only gives us a unique perspective on the cloud market but also ensures that we deliver the available cloud solutions that best meet our clients' requirements.


What we are looking for:

● 3+ years’ experience developing Data & Analytic solutions

● Experience building data lake solutions leveraging one or more of the following: AWS, EMR, S3, Hive & Spark

● Experience with relational SQL

● Experience with scripting languages such as Shell, Python

● Experience with source control tools such as GitHub and related dev process

● Experience with workflow scheduling tools such as Airflow

● In-depth knowledge of scalable cloud

● Has a passion for data solutions

● Strong understanding of data structures and algorithms

● Strong understanding of solution and technical design

● Has a strong problem-solving and analytical mindset

● Experience working with Agile Teams.

● Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders

● Able to quickly pick up new programming languages, technologies, and frameworks

● Bachelor’s Degree in computer science


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.

TensorGo Software Private Limited
Deepika Agarwal
Posted by Deepika Agarwal
Remote only
5 - 8 yrs
₹5L - ₹15L / yr
Python
PySpark
apache airflow
Spark
Hadoop
+4 more

Requirements:

● Understanding our data sets and how to bring them together.

● Working with our engineering team to support custom solutions offered to the product development.

● Filling the gap between development, engineering and data ops.

● Creating, maintaining and documenting scripts to support ongoing custom solutions.

● Excellent organizational skills, including attention to precise details

● Strong multitasking skills and ability to work in a fast-paced environment

● 5+ years experience with Python to develop scripts.

● Know your way around RESTful APIs (able to integrate; publishing is not necessary).

● You are familiar with pulling and pushing files from SFTP and AWS S3.

● Experience with any Cloud solutions including GCP / AWS / OCI / Azure.

● Familiarity with SQL programming to query and transform data from relational Databases.

● Familiarity with working in Linux (and the Linux work environment).

● Excellent written and verbal communication skills

● Extracting, transforming, and loading data into internal databases and Hadoop

● Optimizing our new and existing data pipelines for speed and reliability

● Deploying product build and product improvements

● Documenting and managing multiple repositories of code

● Experience with SQL and NoSQL databases (Cassandra, MySQL)

● Hands-on experience in data pipelining and ETL. (Any of these frameworks/tools: Hadoop, BigQuery, Redshift, Athena)

● Hands-on experience in Airflow

● Understanding of best practices and common coding patterns around storing, partitioning, warehousing and indexing of data

● Experience in reading the data from Kafka topic (both live stream and offline)

● Experience in PySpark and Data frames

Responsibilities:

You’ll

● Collaborating across an agile team to continuously design, iterate, and develop big data systems.

● Extracting, transforming, and loading data into internal databases.

● Optimizing our new and existing data pipelines for speed and reliability.

● Deploying new products and product improvements.

● Documenting and managing multiple repositories of code.

Telstra

Mahesh Balappa
Posted by Mahesh Balappa
Bengaluru (Bangalore), Hyderabad, Pune
3 - 7 yrs
Best in industry
Spark
Hadoop
NOSQL Databases
Apache Kafka

About Telstra

 

Telstra is Australia’s leading telecommunications and technology company, with operations in more than 20 countries, including India, where we’re building a new Innovation and Capability Centre (ICC) in Bangalore.

 

We’re growing, fast, and for you that means many exciting opportunities to develop your career at Telstra. Join us on this exciting journey, and together, we’ll reimagine the future.

 

Why Telstra?

 

  • We're an iconic Australian company with a rich heritage that's been built over 100 years. Telstra is Australia's leading Telecommunications and Technology Company. We've been operating internationally for more than 70 years.
  • International presence spanning over 20 countries.
  • We are one of the 20 largest telecommunications providers globally
  • At Telstra, the work is complex and stimulating, but with that comes a great sense of achievement. We are shaping tomorrow's modes of communication with our innovation-driven teams.

 

Telstra offers an opportunity to make a difference to lives of millions of people by providing the choice of flexibility in work and a rewarding career that you will be proud of!

 

About the team

Being part of Networks & IT means you'll be part of a team that focuses on extending our network superiority to enable the continued execution of our digital strategy.

With us, you'll be working with world-leading technology and change the way we do IT to ensure business needs drive priorities, accelerating our digitisation programme.

 

Focus of the role

Any new engineer joining the data chapter will mostly develop reusable data processing and storage frameworks that can be used across the data platform.

 

About you

To be successful in the role, you'll bring skills and experience in:-

 

Essential 

  • Hands-on experience in Spark Core, Spark SQL, SQL/Hive/Impala, Git/SVN/Any other VCS and Data warehousing
  • Skilled in the Hadoop ecosystem (HDP/Cloudera/MapR/EMR etc.)
  • Azure Data Factory/Airflow/Control-M/Luigi
  • PL/SQL
  • Exposure to NoSQL (HBase/Cassandra/GraphDB (Neo4j)/MongoDB)
  • File formats (Parquet/ORC/AVRO/Delta/Hudi etc.)
  • Kafka/Kinesis/Eventhub

 

Highly Desirable

Experience with and knowledge of the following:

  • Spark Streaming
  • Cloud exposure (Azure/AWS/GCP)
  • Azure data offerings - ADF, ADLS2, Azure Databricks, Azure Synapse, Eventhubs, CosmosDB etc.
  • Presto/Athena
  • Azure DevOps
  • Jenkins/ Bamboo/Any similar build tools
  • Power BI
  • Prior experience building, or working in a team that builds, reusable frameworks
  • Data modelling.
  • Data Architecture and design principles. (Delta/Kappa/Lambda architecture)
  • Exposure to CI/CD
  • Code Quality - Static and Dynamic code scans
  • Agile SDLC      

 

If you've got a passion to innovate and succeed as part of a great team, and are looking for the next step in your career, we'd welcome you to apply!

___________________________

 

We’re committed to building a diverse and inclusive workforce in all its forms. We encourage applicants from diverse gender, cultural and linguistic backgrounds and applicants who may be living with a disability. We also offer flexibility in all our roles, to ensure everyone can participate.

To learn more about how we support our people, including accessibility adjustments we can provide you through the recruitment process, visit tel.st/thrive.

Kloud9 Technologies
manjula komala
Posted by manjula komala
Bengaluru (Bangalore)
3 - 6 yrs
₹18L - ₹27L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

About Kloud9:


Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.


Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.


At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.


Our sole focus is to provide cloud expertise to the retail industry, giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.


We are a cloud vendor that is both platform and technology independent. Our vendor independence not only gives us a unique perspective on the cloud market but also ensures that we deliver the available cloud solutions that best meet our clients' requirements.



What we are looking for:


●       3+ years’ experience developing Big Data & Analytic solutions

●       Experience building data lake solutions leveraging Google Data Products (e.g. Dataproc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep, etc.), Hive, Spark

●       Experience with relational SQL/NoSQL

●       Experience with Spark (Scala/Python/Java) and Kafka

●       Work experience with using Databricks (Data Engineering and Delta Lake components)

●       Experience with source control tools such as GitHub and related dev process

●       Experience with workflow scheduling tools such as Airflow

●       In-depth knowledge of any scalable cloud vendor (GCP preferred)

●       Has a passion for data solutions

●       Strong understanding of data structures and algorithms

●       Strong understanding of solution and technical design

●       Has a strong problem solving and analytical mindset

●       Experience working with Agile Teams.

●       Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders

●       Able to quickly pick up new programming languages, technologies, and frameworks

●       Bachelor’s Degree in computer science


Why Explore a Career at Kloud9:


With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers!

Consulting & Implementation Services Data Analytic & EPM

Agency job
via Merito by Sana Patel
Noida
3 - 8 yrs
₹9L - ₹24L / yr
Amazon Web Services (AWS)
AWS Lambda
AWS CloudFormation
Spark
Python
Hi

About the Company :

Our client enables enterprises in their digital transformation journey by offering Consulting & Implementation Services related to Data Analytics & Enterprise Performance Management (EPM).

Our client delivers the best-suited solutions to customers across industries such as Retail & E-commerce, Consumer Goods, Pharmaceuticals & Life Sciences, Real Estate & Senior Housing, Hi-tech, and Media & Telecom, as well as to Manufacturing and Automotive clientele.

Our in-house research and innovation lab has conceived multiple plug-n-play apps, toolkits and plugins to streamline implementation and speed up time-to-market.

 

Job Title: AWS Developer

Notice period: Immediate to 60 days

Experience: 3-8 years

Location: Noida, Mumbai, Bangalore & Kolkata

 

Roles & Responsibilities

  • Bachelor’s degree in Computer Science or a related analytical field or equivalent experience is preferred
  • 3+ years’ experience in one or more architecture domains (e.g., business architecture, solutions architecture, application architecture) 
  • Must have 2 years of experience in design and implementation of cloud workloads in AWS.
  • Minimum of 2 years of experience handling workloads in large-scale environments. Experience in managing large operational cloud environments spanning multiple tenants through techniques such as multi-account management and AWS Well-Architected best practices.
  • Minimum 3 years of microservice architectural experience. 
  • Minimum of 3 years of experience working exclusively on designing and implementing cloud-native workloads.
  • Experience with analysing and defining technical requirements & design specifications. 
  • Experience with database design with both relational and document-based database systems.
  • Experience with integrating complex multi-tier applications. 
  • Experience with API design and development.
  • Experience with cloud networking and network security, including virtual networks, network security groups, cloud-native firewalls, etc.
  • Proven ability to write programs in an object-oriented or functional programming language such as Python, and with frameworks and services such as Spark, AWS Glue, and AWS Lambda

 

Job Specification

*Strong and innovative approach to problem solving and finding solutions.

*Excellent communicator (written and verbal, formal and informal).

*Flexible and proactive/self-motivated working style with strong personal ownership of problem resolution.

*Ability to multi-task under pressure and work independently with minimal supervision.

Regards
Team Merito

Pune
0 - 1 yrs
₹10L - ₹15L / yr
Java
J2EE
Spring Boot
Hibernate (Java)
SQL
+6 more
1. Work closely with senior engineers to design, implement and deploy applications that impact the business with an emphasis on mobile, payments, and product website development
2. Design software and make technology choices across the stack (from data storage to application to front-end)
3. Understand a range of tier-1 systems/services that power our product to make scalable changes to critical path code
4. Own the design and delivery of an integral piece of a tier-1 system or application
5. Work closely with product managers, UX designers, and end users and integrate software components into a fully functional system
6. Work on the management and execution of project plans and delivery commitments
7. Take end-to-end ownership of a product/feature through all phases from development to production
8. Ensure the developed features are scalable and highly available with no quality concerns
9. Work closely with senior engineers for refining and implementation
10. Manage and execute project plans and delivery commitments
11. Create and execute appropriate quality plans, project plans, test strategies, and processes for development activities in concert with business and project management efforts
One of the world's leading multinational investment banks

Agency job
via HiyaMee by Lithin Raj
Pune
5 - 9 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more
This role is for a developer with strong core application or system programming skills in Scala and Java, and good exposure to concepts and/or technology across the broader spectrum. Enterprise Risk Technology covers a variety of existing systems and green-field projects.
Full-stack Hadoop development experience with Scala development.
Full-stack Java development experience covering Core Java (including JDK 1.8) and a good understanding of design patterns.
Requirements:-
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies such as Spark and Scala, and experience with Avro.
• Participation in product feature design and documentation
• Requirement break-up, ownership and implementation.
• Product BAU deliveries and Level 3 production defect fixes.
Qualifications & Experience
• Degree holder in a numerate subject
• Hands-on experience with Hadoop, Spark, Scala, Impala, Avro and messaging systems like Kafka
• Experience with a core compiled language (Java)
• Proficiency in Java-related frameworks like Spring, Hibernate, JPA
• Hands-on experience in JDK 1.8 and a strong skill set covering Collections and Multithreading, with experience working on distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement
• Good exposure to computational concepts
• Good communication and interpersonal skills
• Working knowledge of risk and derivatives pricing (optional)
• Proficiency in SQL (PL/SQL), data modelling.
• Understanding of Hadoop architecture and the Scala programming language is good to have.
Multinational Company providing energy & Automation digital

Agency job
via Jobdost by Sathish Kumar
Hyderabad
4 - 7 yrs
₹14L - ₹25L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more

Roles and Responsibilities

Big Data Engineer + Spark

Responsibilities

• At least 3 to 4 years of relevant experience as a Big Data Engineer.
• Minimum 1 year of relevant hands-on experience with the Spark framework.
• Minimum 4 years of application development experience using a programming language such as Scala, Java, or Python.
• Hands-on experience with any major component of the Hadoop ecosystem, such as HDFS, MapReduce, Hive, or Impala.
• Strong programming experience building applications/platforms using Scala, Java, or Python.
• Experience implementing Spark RDD transformations and actions for business analysis.
• An efficient interpersonal communicator with sound analytical, problem-solving, and management capabilities.
• Strives to keep the slope of the learning curve high and adapts quickly to new environments and technologies.
• Good knowledge of the agile methodology of software development.
Multinational Company providing energy & Automation digital

Agency job
via Jobdost by Sathish Kumar
Hyderabad
3 - 5 yrs
₹10L - ₹14L / yr
Microservices
Java
Ansible
Spring Boot
Spring MVC
+8 more

Roles and Responsibilities

Java + Microservices Developer

Responsibilities

• Minimum 3-5 years of hands-on experience developing scalable and extensible systems using Java.
• Hands-on experience with microservices.
• Experience with frameworks such as Spring, Spring MVC, Spring Boot, and Hibernate.
• Good knowledge of, or at least 1 year of hands-on experience with, JavaScript.
• Good working exposure to Big Data technologies such as Hadoop, Spark, and Scala.
• Experience with Jenkins, Maven, and Git.
• Solid and fluent understanding of algorithms and data structures.
• Excellent software design, problem-solving, and analytical skills.
• Candidates who graduated from good schools such as IITs, NIITs, and IIITs are preferred.
• Excellent communication skills.
• Experience with database technologies such as SQL and NoSQL.
• Good understanding of Elasticsearch, Redis, and synchronous and asynchronous routines.
Multinational Company providing energy & Automation digital

Agency job
via Jobdost by Sathish Kumar
Hyderabad
7 - 12 yrs
₹12L - ₹24L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more

Skills

• Minimum 7 years of proficient experience with Hadoop.
• Minimum 2 years of hands-on experience with AWS (EMR, S3, and other AWS services and dashboards).
• Minimum 2 years of good experience with the Spark framework.
• Good understanding of the Hadoop ecosystem, including Hive, MR, Spark, and Zeppelin.
• Responsible for troubleshooting Spark and MR jobs and making recommendations.
• Should be able to use existing logs to debug issues.
• Responsible for implementation and ongoing administration of Hadoop infrastructure, including monitoring, tuning, and troubleshooting.
• Triage production issues with other operational teams when they occur.
• Hands-on experience troubleshooting incidents, formulating theories, testing hypotheses, and narrowing down possibilities to find the root cause.
Consulting

Agency job
via Michael Page by Pratanu Chakraborty
Pune, Mumbai
6 - 8 yrs
₹5L - ₹20L / yr
Python
Spark
SQL
6-8 years of hands-on development experience using core Python
Hands-on experience with Spark and SQL
Java knowledge is good to have
Chennai, Hyderabad
5 - 10 yrs
₹10L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Big Data with cloud:

 

Experience : 5-10 years

 

Location : Hyderabad/Chennai

 

Notice period : 15-20 days Max

 

1.  Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight

2.  Experience in developing Lambda functions with AWS Lambda

3.  Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark

4.  Should be able to code in Python and Scala.

5.  Snowflake experience will be a plus

Remote only
5 - 8 yrs
₹10L - ₹25L / yr
DevOps
Kubernetes
Docker
SAS
Apache Hive
+2 more

Must Have skills :

Experience in Linux Administration

Experience in building, deploying, and monitoring distributed apps using container systems (Docker) and container orchestration (Kubernetes, EKS)

Ability to read and understand code (Java / Python / R / Scala)

Experience with AWS and its tools

 

Nice to have skills:

Experience in SAS Viya administration

Experience managing large Big Data clusters

Experience with Big Data tools like Hue, Hive, Spark, Jupyter, SAS and RStudio

Encubate Tech Private Ltd

Agency job
via staff hire solutions by Purvaja Patidar
Mumbai
5 - 6 yrs
₹15L - ₹20L / yr
Amazon Web Services (AWS)
Amazon Redshift
Data modeling
ITL
Agile/Scrum
+7 more

Roles and Responsibilities

Seeking an AWS Cloud Engineer/Data Warehouse Developer for our Data CoE team to help us configure and develop new AWS environments for our Enterprise Data Lake and migrate traditional on-premise workloads to the cloud. Must have a sound understanding of BI best practices, relational structures, dimensional data modelling, structured query language (SQL) skills, data warehouse and reporting techniques.

• Extensive experience in providing AWS Cloud solutions to various business use cases.

• Creating star schema data models, performing ETLs and validating results with business representatives.

• Supporting implemented BI solutions by monitoring and tuning queries and data loads, addressing user questions concerning data integrity, monitoring performance and communicating functional and technical issues.

Job Description:

This position is responsible for the successful delivery of business intelligence information to the entire organization and is experienced in BI development and implementations, data architecture and data warehousing.

Requisite Qualification

Essential: AWS Certified Database Specialty or AWS Certified Data Analytics

Preferred: Any other Data Engineer Certification

Requisite Experience

Essential: 4-7 yrs of experience

Preferred: 2+ yrs of experience in ETL & data pipelines

Skills Required

Special Skills Required

• AWS: S3, DMS, Redshift, EC2, VPC, Lambda, Delta Lake, CloudWatch etc.

• Bigdata: Databricks, Spark, Glue and Athena

• Expertise in Lake Formation, Python programming, Spark, Shell scripting

• Minimum Bachelor’s degree with 5+ years of experience in designing, building, and maintaining AWS data components

• 3+ years of experience in data component configuration, related roles and access setup

• Expertise in Python programming

• Knowledge in all aspects of DevOps (source control, continuous integration, deployments, etc.)

• Comfortable working with DevOps: Jenkins, Bitbucket, CI/CD

• Hands on ETL development experience, preferably using or SSIS

• SQL Server experience required

• Strong analytical skills to solve and model complex business requirements

• Sound understanding of BI Best Practices/Methodologies, relational structures, dimensional data modelling, structured query language (SQL) skills, data warehouse and reporting techniques

Preferred Skills Required

• Experience working in the SCRUM Environment.

• Experience in Administration (Windows/Unix/Network/Database/Hadoop) is a plus.

• Experience in SQL Server, SSIS, SSAS, SSRS

• Comfortable with creating data models and visualization using Power BI

• Hands on experience in relational and multi-dimensional data modelling, including multiple source systems from databases and flat files, and the use of standard data modelling tools

• Ability to collaborate on a team with infrastructure, BI report development and business analyst resources, and clearly communicate solutions to both technical and non-technical team members

Hyderabad
5 - 15 yrs
₹4L - ₹14L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more
Big Data Engineer:

- Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight.

- Experience in developing Lambda functions with AWS Lambda.

- Expertise with Spark/PySpark: the candidate should be hands-on with PySpark code and should be able to do transformations with Spark.

- Should be able to code in Python and Scala.

- Snowflake experience will be a plus.
Get to hear about interesting companies hiring right now
Follow Cutshort on LinkedIn
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.
Find more jobs