32+ Big data Jobs in Delhi, NCR and Gurgaon | Big data Job openings in Delhi, NCR and Gurgaon
Apply to 32+ Big data Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest Big data Job opportunities across top companies like Google, Amazon & Adobe.
Skills:
Experience with Cassandra, including installing, configuring, and monitoring a Cassandra cluster.
Experience with Cassandra data modeling and CQL scripting. Experience with DataStax Enterprise Graph.
Experience with both Windows and Linux operating systems. Knowledge of the Microsoft .NET Framework (C#, .NET Core).
Ability to perform effectively in a team-oriented environment
- KSQL
- Data Engineering spectrum (Java/Spark)
- Spark Scala / Kafka Streaming
- Confluent Kafka components
- Basic understanding of Hadoop
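As a rough, hedged sketch of the streaming skills above: the tumbling-window aggregation that KSQL's `WINDOW TUMBLING` and Spark/Kafka streaming provide can be simulated in plain Python (the event values here are illustrative, not tied to any real topic):

```python
from collections import defaultdict

def tumbling_window_counts(events, window_ms):
    """Group (timestamp_ms, key) events into tumbling windows and count
    occurrences per key, mimicking a KSQL `WINDOW TUMBLING (SIZE ...)`
    aggregation on a stream."""
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_ms)  # align to window boundary
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(1000, "click"), (1500, "click"), (2500, "view"), (3200, "click")]
print(tumbling_window_counts(events, 1000))
# {(1000, 'click'): 2, (2000, 'view'): 1, (3000, 'click'): 1}
```

A real deployment would express this as a KSQL query or a Kafka Streams topology; the window-alignment arithmetic is the same.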
About us
Classplus is India's largest B2B ed-tech start-up, enabling 1 Lac+ educators and content creators to create their digital identity with their own branded apps. Founded in 2018, we have grown more than 10x in the last year into India's fastest-growing video learning platform.
Over the years, marquee investors like Tiger Global, Surge, GSV Ventures, Blume, Falcon Capital, RTP Global, and Chimera Ventures have supported our vision. Thanks to our awesome and dedicated team, we achieved a major milestone in March this year when we secured our Series-D funding.
Now as we go global, we are super excited to have new folks on board who can take the rocketship higher🚀. Do you think you have what it takes to help us achieve this? Find Out Below!
What will you do?
· Define the overall process, which includes building a team for DevOps activities and ensuring that infrastructure changes are reviewed from an architecture and security perspective
· Create standardized tooling and templates for development teams to create CI/CD pipelines
· Ensure infrastructure is created and maintained using Terraform
· Work with various stakeholders to design and implement infrastructure changes to support new feature sets in various product lines.
· Maintain transparency and clear visibility of costs associated with various product verticals, environments and work with stakeholders to plan for optimization and implementation
· Spearhead continuous experimenting and innovating initiatives to optimize the infrastructure in terms of uptime, availability, latency and costs
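As one hedged illustration of the Terraform responsibility above (all names, zones, and module paths are placeholders, not Classplus's actual setup), a minimal GCP resource definition might look like:

```hcl
# Hypothetical layout for illustration only.
module "service_vpc" {
  source      = "./modules/vpc"
  cidr_block  = "10.0.0.0/16"
  environment = var.environment   # e.g. "staging" or "production"
}

resource "google_compute_instance" "app" {
  name         = "app-${var.environment}"
  machine_type = "e2-medium"
  zone         = "asia-south1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }

  network_interface {
    network = "default"
  }
}
```

Keeping such definitions in version-controlled modules is what makes infrastructure reviewable from the architecture and security perspective described above.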
You should apply, if you
1. Are a seasoned Veteran: Have managed infrastructure at scale running web apps, microservices, and data pipelines using tools and languages like JavaScript(NodeJS), Go, Python, Java, Erlang, Elixir, C++ or Ruby (experience in any one of them is enough)
2. Are a Mr. Perfectionist: You have a strong bias for automation and take the time to think about the right way to solve a problem rather than reaching for quick fixes or band-aids.
3. Bring your A-Game: Have hands-on experience and ability to design/implement infrastructure with GCP services like Compute, Database, Storage, Load Balancers, API Gateway, Service Mesh, Firewalls, Message Brokers, Monitoring, Logging and experience in setting up backups, patching and DR planning
4. Are up with the times: Have expertise in one or more cloud platforms (Amazon Web Services, Google Cloud Platform or Microsoft Azure), and have experience in creating and managing infrastructure entirely through a tool like Terraform
5. Have it all on your fingertips: Have experience building CI/CD pipelines using Jenkins and Docker for applications majorly running on Kubernetes, and hands-on experience in managing and troubleshooting applications running on K8s
6. Have nailed the data storage game: Good knowledge of relational and NoSQL databases (MySQL, Mongo, BigQuery, Cassandra…)
7. Bring that extra zing: Have the ability to program/script and strong fundamentals in Linux and networking.
8. Know your toys: Have a good understanding of microservices architecture and Big Data technologies, plus experience with highly available distributed systems, scaling data store technologies, and creating multi-tenant and self-hosted environments (that's a plus)
Being Part of the Clan
At Classplus, you’re not an “employee” but a part of our “Clan”. So, you can forget about being bound by the clock as long as you’re crushing it workwise😎. Add to that some passionate people working with and around you, and what you get is the perfect work vibe you’ve been looking for!
It doesn’t matter how long your journey has been or your position in the hierarchy (we don’t do Sirs and Ma’ams); you’ll be heard, appreciated, and rewarded. One can say, we have a special place in our hearts for the Doers! ✊🏼❤️
Are you a go-getter with the chops to nail what you do? Then this is the place for you.
Consulting & implementation services in the areas of the Oil & Gas, Mining and Manufacturing industries
- Data Engineer
Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting languages - Python & PySpark
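To illustrate the file-based ingestion best practices implied above: one common convention is Hive-style partition paths in S3, which AWS Glue crawlers and Athena recognise as table partitions. A minimal sketch (the bucket prefix and table names are hypothetical):

```python
from datetime import datetime, timezone

def partitioned_key(prefix: str, table: str, ts: datetime, filename: str) -> str:
    """Build a Hive-style partitioned S3 key (year=/month=/day=), the layout
    Glue crawlers and Athena treat as partition columns."""
    return (
        f"{prefix}/{table}/"
        f"year={ts.year:04d}/month={ts.month:02d}/day={ts.day:02d}/{filename}"
    )

ts = datetime(2024, 3, 7, tzinfo=timezone.utc)
print(partitioned_key("raw", "orders", ts, "part-0001.parquet"))
# raw/orders/year=2024/month=03/day=07/part-0001.parquet
```

Writing Parquet files under such keys lets downstream queries prune partitions instead of scanning the whole data lake.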
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Data ingestion from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data from various proprietary systems; implement data ingestion and processing with the help of Big Data technologies
- Data processing/transformation using various technologies such as Spark and cloud services. You will need to understand your part of the business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform, and verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring in industry best practices
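The automated data quality checks in the responsibilities above can be sketched, under illustrative assumptions about row shape, as a simple gate that splits rows into accepted and rejected sets:

```python
def run_quality_checks(rows, required, non_null):
    """Minimal row-level data quality gate: verify required columns exist and
    listed columns are non-null; return (good_rows, rejected_rows)."""
    good, rejects = [], []
    for row in rows:
        missing = [c for c in required if c not in row]
        nulls = [c for c in non_null if row.get(c) is None]
        (rejects if missing or nulls else good).append(row)
    return good, rejects

rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},   # fails the non-null check
    {"amount": 5.0},             # missing a required column
]
good, bad = run_quality_checks(rows, required=["id", "amount"], non_null=["amount"])
print(len(good), len(bad))  # 1 2
```

In a real pipeline the rejected rows would typically land in a quarantine table or dead-letter location for inspection rather than being silently dropped.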
QUALIFICATIONS
- 5-7+ years' experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional products, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of one of these languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
- Data mining/programming tools (e.g. SAS, SQL, R, Python)
- Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
- Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence and other related tools
• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from requirements gathering to go-live
o Lead internal team meetings on solution architecture, effort estimation, manpower planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for skill development for all team members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team
Must have:
• Minimum 8 years of experience (hands-on as well as leadership) in software / data engineering across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge of Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT, and DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills
Good to have:
• Knowledge of PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel
Work Environment
• Customer Office (Mumbai) / Remote Work
Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
A GCP Data Analyst profile must have the below skill sets:
- Knowledge of programming languages like SQL, Oracle, R, MATLAB, Java and Python
- Data cleansing, data visualization, data wrangling
- Data modeling, data warehouse concepts
- Ability to adapt to Big Data platforms like Hadoop, Spark for stream & batch processing
- GCP (Cloud Dataproc, Cloud Dataflow, Cloud Datalab, Cloud Dataprep, BigQuery, Cloud Datastore, Cloud Datafusion, Auto ML etc)
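As a hedged example of the data cleansing and wrangling skills listed above (the record and column names are made up for illustration), a pre-load normalisation step might look like:

```python
import re

def clean_record(rec):
    """Typical cleansing steps before loading to a warehouse such as BigQuery:
    trim whitespace, normalise empty strings to None, coerce numeric strings."""
    out = {}
    for key, val in rec.items():
        if isinstance(val, str):
            val = val.strip()
            if val == "":
                val = None
            elif re.fullmatch(r"-?\d+(\.\d+)?", val):
                val = float(val) if "." in val else int(val)
        out[key] = val
    return out

print(clean_record({"city": "  Delhi ", "rides": "42", "fare": "99.5", "note": ""}))
# {'city': 'Delhi', 'rides': 42, 'fare': 99.5, 'note': None}
```

At scale the same transformations would run inside Dataflow or Dataprep rather than a Python loop, but the per-record logic is the same.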
A.P.T Portfolio is a high-frequency trading firm that specialises in quantitative trading & investment strategies. Founded in November 2009, it has been a major liquidity provider in global stock markets.
As a manager, you would be in charge of managing the DevOps team, and your remit shall include the following:
- Private Cloud - Design & maintain a high performance and reliable network architecture to support HPC applications
- Scheduling Tool - Implement and maintain an HPC scheduling technology like Kubernetes, Hadoop YARN, Mesos, HTCondor or Nomad for processing & scheduling analytical jobs. Implement controls which allow analytical jobs to seamlessly utilize idle capacity on the private cloud.
- Security - Implementing best security practices and implementing data isolation policy between different divisions internally.
- Capacity Sizing - Monitor private cloud usage and share details with different teams. Plan capacity enhancements on a quarterly basis.
- Storage solution - Optimize storage solutions like NetApp, EMC, Quobyte for analytical jobs. Monitor their performance on a daily basis to identify issues early.
- NFS - Implement and optimize latest version of NFS for our use case.
- Public Cloud - Drive AWS/Google-Cloud utilization in the firm for increasing efficiency, improving collaboration and for reducing cost. Maintain the environment for our existing use cases. Further explore potential areas of using public cloud within the firm.
- Backups - Identify and automate backup of all crucial data/binaries/code etc. in a secured manner, at such frequency as warranted by the use case. Ensure that recovery from backup is tested and seamless.
- Access Control - Maintain password-less access control and improve security over time. Minimize failures of automated jobs due to unsuccessful logins.
- Operating System - Plan, test and roll out new operating systems for all production, simulation and desktop environments. Work closely with developers to highlight new performance-enhancement capabilities of new versions.
- Configuration management - Work closely with the DevOps/development team to freeze configurations/playbooks for various teams & internal applications. Deploy and maintain standard tools such as Ansible, Puppet, Chef, etc. for the same.
- Data Storage & Security Planning - Maintain tight control of root access on various devices. Ensure root access is rolled back as soon as the desired objective is achieved.
- Audit access logs on devices. Use third party tools to put in a monitoring mechanism for early detection of any suspicious activity.
- Maintaining all third-party tools used for development and collaboration - This shall include maintaining a fault-tolerant environment for Git/Perforce, productivity tools such as Slack/Microsoft Teams, and build tools like Jenkins/Bamboo
Qualifications
- Bachelors or Masters Level Degree, preferably in CSE/IT
- 10+ years of relevant experience in sys-admin function
- Must have strong knowledge of IT infrastructure, Linux, networking and grid computing.
- Must have a strong grasp of automation & data management tools.
- Proficient in scripting languages, including Python
Desirables
- Professional attitude; a co-operative and mature approach to work; must be focused, structured and well-considered, with good troubleshooting skills.
- Exhibit a high level of individual initiative and ownership, effectively collaborate with other team members.
APT Portfolio is an equal opportunity employer
Job Title: Product Manager
Job Description
Bachelor or master’s degree in computer science or equivalent experience.
Worked as Product Owner before and took responsibility for a product or project delivery.
Well-versed with data warehouse modernization to Big Data and Cloud environments.
Good knowledge* of any of the Cloud (AWS/Azure/GCP) – Must Have
Practical experience with continuous integration and continuous delivery workflows.
Self-motivated with strong organizational/prioritization skills and ability to multi-task with close attention to detail.
Good communication skills
Experience in working within a distributed agile team
Experience in handling migration projects – Good to Have
*Data Ingestion, Processing, and Orchestration knowledge
Roles & Responsibilities
Responsible for coming up with innovative and novel ideas for the product.
Define product releases, features, and roadmap.
Collaborate with product teams on defining product objectives, including creating a product roadmap, delivery, market research, customer feedback, and stakeholder inputs.
Work with the Engineering teams to communicate release goals and be a part of the product lifecycle. Work closely with the UX and UI team to create the best user experience for the end customer.
Work with the Marketing team to define GTM activities.
Interface with Sales & Customer teams to identify customer needs and product gaps
Market and competition analysis activities.
Participate in the Agile ceremonies with the team, define epics, user stories, acceptance criteria
Ensure product usability from the end-user perspective
Mandatory Skills
Product Management, DWH, Big Data
Senior Software Engineer - Data Team
We are seeking a highly motivated Senior Software Engineer with hands-on experience to build scalable, extensible data solutions, identify and address performance bottlenecks, collaborate with other team members, and implement best practices for data engineering. Our engineering process is fully agile and has a really fast release cycle, which keeps our environment very energetic and fun.
What you'll do:
Design and development of scalable applications.
Work with Product Management teams to get maximum value out of existing data.
Contribute to continual improvement by suggesting improvements to the software system.
Ensure high scalability and performance
You will advocate for good, clean, well documented and performing code; follow standards and best practices.
We'd love for you to have:
Education: Bachelor/Master Degree in Computer Science.
Experience: 3-5 years of relevant experience in BI/DW with hands-on coding experience.
Mandatory Skills
Strong in problem-solving
Strong experience with Big Data technologies, Hive, Hadoop, Impala, Hbase, Kafka, Spark
Strong experience with orchestration frameworks like Apache Oozie, Airflow
Strong experience of Data Engineering
Strong experience with Database and Data Warehousing technologies and ability to understand complex design, system architecture
Experience with the full software development lifecycle, design, develop, review, debug, document, and deliver (especially in a multi-location organization)
Good knowledge of Java
Desired Skills
Experience with Python
Experience with reporting tools like Tableau, QlikView
Experience of Git and CI-CD pipeline
Awareness of cloud platform ex:- AWS
Excellent communication skills with team members, Business owners, across teams
Be able to work in a challenging, dynamic environment and meet tight deadlines
Greetings! We are looking for a Product Manager for our data modernization product. We need a resource with good knowledge of Big Data/DWH, who should have strong stakeholder management and presentation skills.
Hiring for a gaming company
Job Description:
● At least 4 to 8 years of experience in software quality assurance manual testing
● Proven record of test strategy and planning, STLC, deployment, defect tracking, and test harness.
● High degree of initiative with a passion for learning technology.
● Must have strong hands-on experience with any programming language: PHP, React Native, JavaScript, HTML, CSS.
● Must be competent enough on API testing using Rest Assured and Postman.
● Performance testing experience with relevant automation and monitoring tools (e.g., JMeter, LoadRunner).
● Responsible for coming up with and executing scale tests, performance, and longevity tests.
● Must have testing experience on cloud-based platforms and working knowledge on integration testing using CI/CD pipelines.
● Should have working experience with MySQL and at least one NoSQL DB, viz. Cosmos DB, Cassandra, MongoDB.
● Should have strong fundamental understanding of network topology and the TCP/IP stack and working knowledge of load balancing methods and deployment architecture
● Experience with GitHub, Maven, HP ALM, Jira, Jenkins, code coverage tools and CI/CD
● Exposure to sizing (estimating) projects and defining timelines
● Good to have knowledge of Platforms like Windows / UNIX / Mainframe.
● Work under the direction of the Game QA Lead to provide test support for two dozen experienced game designers, artists, and engineers.
● Perform daily smoke tests of game builds, using the build computer and the install scripts.
● Run through and write test plans for existing and new features. Become familiar with the project design by closely reading project documentation and reaching out regularly to our team of game developers.
● Verify that identified bugs are fixed in the database.
● Identify and log new issues that occur in the HMD and tablet interface.
● Build on your own fundamental QA skills by learning the specifics of game development QA for mobile devices, VR headsets, and experimental technology.
● Engage in continuous improvement of the QA process.
● Maintain quality service by establishing and enforcing organization standards.
● Maintain professional and technical knowledge by attending educational workshops, reviewing professional publications, establishing personal networks, benchmarking state-of-the-art practices, and participating in professional societies.
● Contribute to the team effort by accomplishing related results as needed
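For the load-balancing methods mentioned in the requirements above, a toy round-robin picker illustrates the core idea (a real deployment would use nginx, HAProxy or a cloud load balancer, not anything like this):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Minimal round-robin load balancer: hand out backends in rotation so
    each receives an equal share of requests."""
    def __init__(self, backends):
        self._pool = cycle(backends)

    def pick(self):
        return next(self._pool)

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([lb.pick() for _ in range(5)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1', '10.0.0.2']
```

Other common methods the question probes (least-connections, IP hash, weighted round-robin) differ only in how `pick` chooses the next backend.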
DevOps Engineer
Our engineering team is looking for Big-Data DevOps engineers to join the engineering team and help us automate the build, release, packaging and infrastructure provisioning and support processes. The candidate is expected to own the full life-cycle of provisioning, configuration management, monitoring, maintenance and support for cloud as well as on premise deployments.
Responsibilities
- 3-plus years of DevOps experience managing the Big Data application stack including HDFS, YARN, Spark, Hive and Hbase
- Deeper understanding of all the configurations required for installing and maintaining the infrastructure in the long run
- Experience setting up high availability, configuring resource allocation, setting up capacity schedulers, handling data recovery tasks
- Experience with middle-layer technologies including web servers (httpd, nginx), application servers (JBoss, Tomcat) and database systems (PostgreSQL, MySQL)
- Experience setting up enterprise security solutions including setting up active directories, firewalls, SSL certificates, Kerberos KDC servers, etc.
- Experience maintaining and hardening the infrastructure by regularly applying required security packages and patches
- Experience supporting on-premise solutions as well as on AWS cloud
- Experience working with and supporting Spark-based applications on YARN
- Experience with one or more automation tools such as Ansible, Terraform, etc
- Experience working with CI/CD tools like Jenkins and various test report and coverage Plugins
- Experience defining and automating the build, versioning and release processes for complex enterprise products
- Experience supporting clients remotely and on-site
- Experience working with and supporting Java- and Python-based tech stacks would be a Plus
Responsibilities:
- Designing and implementing fine-tuned production ready data/ML pipelines in Hadoop platform.
- Driving optimization, testing and tooling to improve quality.
- Reviewing and approving high-level & detailed designs to ensure that the solution delivers to the business needs and aligns to the data & analytics architecture principles and roadmap.
- Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
- Following proper SDLC (Code review, sprint process).
- Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
- Building robust and scalable data infrastructure (both batch processing and real-time) to support needs from internal and external users.
- Understanding various data security standards and using secure data security tools to apply and adhere to the required data controls for user access in the Hadoop platform.
- Supporting and contributing to development guidelines and standards for data ingestion.
- Working with a data scientist and business analytics team to assist in data ingestion and data related technical issues.
- Designing and documenting the development & deployment flow.
Requirements:
- Experience in developing REST API services using one of the Scala frameworks.
- Ability to troubleshoot and optimize complex queries on the Spark platform
- Expert in building and optimizing ‘big data’ data/ML pipelines, architectures and data sets.
- Knowledge in modelling unstructured to structured data design.
- Experience in Big Data access and storage techniques.
- Experience in doing cost estimation based on the design and development.
- Excellent debugging skills for the technical stack mentioned above which even includes analyzing server logs and application logs.
- Highly organized, self-motivated, proactive, and ability to propose best design solutions.
- Good time management and multitasking skills to work to deadlines by working independently and as a part of a team.
Job Description:
The data science team is responsible for solving business problems with complex data. Data complexity could be characterized in terms of volume, dimensionality and multiple touchpoints/sources. We understand the data, ask fundamental first-principles questions, and apply our analytical and machine learning skills to solve the problem in the best way possible.
Our ideal candidate
The role would be a client facing one, hence good communication skills are a must.
The candidate should have the ability to communicate complex models and analysis in a clear and precise manner.
The candidate would be responsible for:
- Comprehending business problems properly - what to predict, how to build DV, what value addition he/she is bringing to the client, etc.
- Understanding and analyzing large, complex, multi-dimensional datasets and build features relevant for business
- Understanding the math behind algorithms and choosing one over another
- Understanding approaches like stacking, ensemble and applying them correctly to increase accuracy
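The ensemble idea above can be illustrated with a minimal hard-voting combiner in plain Python (the per-model predictions are fabricated for the example; real stacking would instead train a meta-model on out-of-fold predictions):

```python
from collections import Counter

def majority_vote(predictions):
    """Hard-voting ensemble: combine each model's predicted labels by
    majority vote per sample (ties resolved by first-seen label)."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
print(majority_vote([model_a, model_b, model_c]))  # [1, 0, 1, 1]
```

Voting helps only when the base models make reasonably uncorrelated errors, which is part of the "choosing one algorithm over another" judgment the role calls for.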
Desired technical requirements
- Proficiency with Python and the ability to write production-ready code.
- Experience in pyspark, machine learning and deep learning
- Big data experience, e.g. familiarity with Spark, Hadoop, is highly preferred
- Familiarity with SQL or other databases.
Hiring for one of the MNC for India location
Key Responsibilities : ( Data Developer Python, Spark)
Exp : 2 to 9 Yrs
Development of data platforms, integration frameworks, processes, and code.
Develop and deliver APIs in Python or Scala for Business Intelligence applications built using a range of web languages
Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.
Elaborate stories in a collaborative agile environment (SCRUM or Kanban)
Familiarity with cloud platforms like GCP, AWS or Azure.
Experience with large data volumes.
Familiarity with writing rest-based services.
Experience with distributed processing and systems
Experience with Hadoop / Spark toolsets
Experience with relational database management systems (RDBMS)
Experience with Data Flow development
Knowledge of Agile and associated development techniques
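A tiny sketch of the automated-testing practice described above, using pytest-style plain assertions (the function under test is invented purely for illustration):

```python
def parse_amount(raw: str) -> float:
    """Function under test: parse '1,234.50' style amount strings into floats."""
    return float(raw.replace(",", ""))

# pytest-style unit tests: plain assert statements picked up by the test runner
def test_plain():
    assert parse_amount("42") == 42.0

def test_thousands_separator():
    assert parse_amount("1,234.50") == 1234.5
```

End-to-end and performance tests follow the same arrange/act/assert shape, just exercising a deployed service instead of a single function.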
- Experience of providing technical leadership in the Big Data space (Hadoop stack: Spark, M/R, HDFS, Pig, Hive, HBase, Flume, Sqoop, etc.). Should have contributed to open-source Big Data technologies.
- Expert-level proficiency in Python
- Experience in visualizing and evangelizing next-generation infrastructure in Big Data space (Batch, Near Real-time, Real-time technologies).
- Passionate for continuous learning, experimenting, applying, and contributing towards cutting edge open source technologies and software paradigms
- Strong understanding and experience in distributed computing frameworks, particularly Apache Hadoop 2.0 (YARN; MR & HDFS) and associated technologies.
- Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib)
- Operating knowledge of cloud computing platforms (AWS, especially EMR, EC2, S3, SWF services, and the AWS CLI)
- Experience working within a Linux computing environment and use of command-line tools, including knowledge of shell/Python scripting for automating common tasks
- Sr. Data Engineer:
Core Skills – Data Engineering, Big Data, PySpark, Spark SQL and Python
Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred
Major accountabilities:
- Responsible for Data Engineering, Foundry Data Pipeline Creation, Foundry Analysis & Reporting, Slate Application development, re-usable code development & management and Integrating Internal or External System with Foundry for data ingestion with high quality.
- Have a good understanding of the Foundry Platform landscape and its capabilities
- Performs data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
- Defines company data assets (data models) and PySpark / Spark SQL jobs to populate the data models.
- Designs data integrations and data quality framework.
- Design & Implement integration with Internal, External Systems, F1 AWS platform using Foundry Data Connector or Magritte Agent
- Collaboration with data scientists, data analysts and technology teams to document and leverage their understanding of the Foundry integration with different data sources
- Actively participate in agile work practices
- Coordinating with Quality Engineers to ensure that all quality controls, naming conventions & best practices have been followed
Desired Candidate Profile :
- Strong data engineering background
- Experience with Clinical Data Model is preferred
- Experience in
- SQL Server ,Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
- Java and Groovy for our back-end applications and data integration tools
- Python for data processing and analysis
- Cloud infrastructure based on AWS EC2 and S3
- 7+ years IT experience, 2+ years’ experience in Palantir Foundry Platform, 4+ years’ experience in Big Data platform
- 5+ years of Python and PySpark development experience
- Strong troubleshooting and problem solving skills
- BTech or master's degree in computer science or a related technical field
- Experience designing, building, and maintaining big data pipeline systems
- Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
- Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
- Hands-on in programming languages primarily Python, R, Java, Unix shell scripts
- Hands-on experience in AWS / Azure cloud platforms and stacks
- Strong in API-based architecture and concepts; able to do a quick PoC using API integration and development
- Knowledge of machine learning and AI
- Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.
Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision
Company Description
At Bungee Tech, we help retailers and brands meet customers everywhere, on every occasion they are in. We believe that accurate, high-quality data matched with compelling market insights empowers retailers and brands to keep their customers at the center of all the innovation and value they are delivering.
We provide a clear and complete omnichannel picture of their competitive landscape to retailers and brands. We collect billions of data points every day and multiple times in a day from publicly available sources. Using high-quality extraction, we uncover detailed information on products or services, which we automatically match, and then proactively track for price, promotion, and availability. Plus, anything we do not match helps to identify a new assortment opportunity.
Empowered with this unrivalled intelligence, we unlock compelling analytics and insights that, once blended with verified partner data from trusted sources such as Nielsen, paint a complete, consolidated picture of the competitive landscape.
We are looking for a Big Data Engineer who will work on the collecting, storing, processing, and analyzing of huge sets of data. The primary focus will be on choosing optimal solutions to use for these purposes, then maintaining, implementing, and monitoring them.
You will also be responsible for integrating them with the architecture used in the company.
We're working on the future. If you are seeking an environment where you can drive innovation, apply state-of-the-art software technologies to solve real-world problems, and enjoy the satisfaction of providing visible benefit to end users in an iterative, fast-paced environment, this is your opportunity.
Responsibilities
As an experienced member of the team, in this role, you will:
- Contribute to evolving the technical direction of analytical systems and play a critical role in their design and development
- Research, design, code, troubleshoot, and support. What you create is also what you own.
- Develop the next generation of automation tools for monitoring and measuring data quality, with associated user interfaces.
- Be able to broaden your technical skills and work in an environment that thrives on creativity, efficient execution, and product innovation.
BASIC QUALIFICATIONS
- Bachelor’s degree or higher in an analytical area such as Computer Science, Physics, Mathematics, Statistics, Engineering or similar.
- 5+ years of relevant professional experience in Data Engineering and Business Intelligence
- 5+ years with advanced SQL (analytical functions), ETL, and data warehousing
- Strong knowledge of data warehousing concepts, including data warehouse technical architectures, infrastructure components, ETL/ELT and reporting/analytic tools and environments, data structures, data modeling, and performance tuning.
- Ability to effectively communicate with both business and technical teams.
- Excellent coding skills in Java, Python, C++, or equivalent object-oriented programming language
- Understanding of relational and non-relational databases and basic SQL
- Proficiency with at least one of these scripting languages: Perl / Python / Ruby / shell script
PREFERRED QUALIFICATIONS
- Experience with building data pipelines from application databases.
- Experience with AWS services - S3, Redshift, Spectrum, EMR, Glue, Athena, ELK etc.
- Experience working with Data Lakes.
- Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
- Sharp problem-solving skills and the ability to resolve ambiguous requirements
- Experience working with Big Data
- Knowledge of and experience working with Hive and the Hadoop ecosystem
- Knowledge of Spark
- Experience working with Data Science teams
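As an illustrative sketch of the "advanced SQL (analytical functions)" the qualifications above call for, here is a window-function query runnable against Python's built-in `sqlite3` (which supports window functions in SQLite 3.25+). The table and data are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, month TEXT, revenue REAL);
INSERT INTO sales VALUES
  ('north', '2023-01', 100), ('north', '2023-02', 150),
  ('south', '2023-01', 80),  ('south', '2023-02', 120);
""")

# Rank months within each region by revenue: a typical analytical
# (window) function used in warehousing and BI workloads.
rows = conn.execute("""
SELECT region, month, revenue,
       RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS rnk
FROM sales
ORDER BY region, rnk
""").fetchall()
```

The same `RANK() OVER (PARTITION BY ...)` pattern carries over to Redshift, Athena, and Spark SQL, all named in the preferred qualifications.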
Position: Big Data Engineer
What You'll Do
Punchh is seeking to hire a Big Data Engineer at either a senior or tech lead level. Reporting to the Director of Big Data, this person will play a critical role in leading Punchh's big data innovations. By leveraging prior industry experience in big data, they will help create cutting-edge data and analytics products for Punchh's business partners.
This role requires close collaboration with the data, engineering, and product organizations. Job functions include:
- Work with large data sets and implement sophisticated data pipelines over both structured and unstructured data.
- Collaborate with stakeholders to design scalable solutions.
- Manage and optimize our internal data pipeline that supports marketing, customer success and data science to name a few.
- Act as a technical leader of Punchh’s big data platform that supports AI and BI products.
- Work with the infra and operations teams to monitor and optimize existing infrastructure.
- Occasional business travel is required.
What You'll Need
- 5+ years of experience as a Big Data engineering professional, developing scalable big data solutions.
- Advanced degree in computer science, engineering or other related fields.
- Demonstrated strength in data modeling, data warehousing and SQL.
- Extensive knowledge of cloud technologies, e.g. AWS and Azure.
- Excellent software engineering background, with strong familiarity with the software development life cycle and with tools such as GitHub and Airflow.
- Advanced knowledge of big data technologies: programming languages (Python, Java), relational databases (Postgres, MySQL), NoSQL (MongoDB), Hadoop (EMR), and streaming (Kafka, Spark).
- Strong problem-solving skills with demonstrated rigor in building and maintaining complex data pipelines.
- Exceptional communication skills and ability to articulate a complex concept with thoughtful, actionable recommendations.
Responsibilities:
- Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world.
- Verifying data quality, and/or ensuring it via data cleaning.
- Adapting and working fast to produce output that improves stakeholders' decision-making using ML.
- Designing and developing Machine Learning systems and schemes.
- Performing statistical analysis and fine-tuning models using test results.
- Training and retraining ML systems and models as and when necessary.
- Deploying ML models to production and managing the cost of cloud infrastructure.
- Developing Machine Learning apps according to client and data scientist requirements.
- Analysing the problem-solving capabilities and use cases of ML algorithms and ranking them by how well they meet the objective.
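The data-quality verification mentioned in the responsibilities above can be sketched as a simple record validator. This is a hypothetical example (the field names, ranges, and records are invented for illustration), not a production cleaning framework:

```python
def validate_record(rec):
    """Return a list of data-quality issues found in one record."""
    issues = []
    if rec.get("user_id") in (None, ""):
        issues.append("missing user_id")
    age = rec.get("age")
    if age is not None and not (0 <= age <= 120):
        issues.append(f"implausible age: {age}")
    return issues

# Hypothetical batch: record 1 lacks an ID, record 2 has an impossible age.
records = [
    {"user_id": "u1", "age": 34},
    {"user_id": "", "age": 34},
    {"user_id": "u3", "age": 999},
]
bad = {i: validate_record(r) for i, r in enumerate(records) if validate_record(r)}
```

In practice the flagged rows would either be cleaned (imputed, clipped) or dropped before the dataset feeds model training.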
Technical Knowledge:
- Experience solving real-time problems using ML and deep learning models deployed in production, with strong projects to showcase.
- Proficiency in Python and experience working with the Jupyter ecosystem, Google Colab, and cloud-hosted notebooks such as AWS SageMaker, Databricks, etc.
- Proficiency with scikit-learn, TensorFlow, OpenCV, PySpark, pandas, NumPy, and related libraries.
- Expert in visualising and manipulating complex datasets.
- Proficiency in working with visualisation libraries such as seaborn, plotly, matplotlib etc.
- Proficiency in Linear Algebra, statistics and probability required for Machine Learning.
- Proficiency in ML algorithms, for example gradient boosting, stacked models, classification algorithms, and deep learning algorithms. Hands-on experience tuning hyperparameters across various models and comparing the resulting algorithm performance is required.
- Big data Technologies such as Hadoop stack and Spark.
- Basic use of cloud compute (VMs, e.g. EC2).
- Brownie points for Kubernetes and Task Queues.
- Strong written and verbal communications.
- Experience working in an Agile environment.
- Experience: 3+ years
- Must be very strong in Java (2.5+ years)
- At least 1 year of working experience with one of Neo4j, Cassandra, or Elasticsearch
- Should have good DevOps working knowledge; knowledge of AWS, Ansible, etc. is a necessity
- Experience in TDD/BDD is required
- Minimum 1 year of working experience with Samza and Kafka
- Knowledge of Azure is an added advantage
- Understanding of Akka and the Play framework
- along with metrics to track their progress
- Managing available resources such as hardware, data, and personnel so that deadlines are met
- Analysing the ML algorithms that could be used to solve a given problem and ranking them by their success probability
- Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world
- Verifying data quality, and/or ensuring it via data cleaning
- Supervising the data acquisition process if more data is needed
- Defining validation strategies
- Defining the pre-processing or feature engineering to be done on a given dataset
- Defining data augmentation pipelines
- Training models and tuning their hyperparameters
- Analysing the errors of the model and designing strategies to overcome them
- Deploying models to production
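The "training models and tuning their hyperparameters" step above can be sketched as a tiny grid search. Everything here is a toy illustration: the model is just a threshold rule and the validation data is invented, but the select-by-validation-score loop is the same shape used with real models and libraries:

```python
# Toy model: predict positive when x >= threshold.
def accuracy(threshold, data):
    """Fraction of (x, label) pairs the threshold rule classifies correctly."""
    return sum((x >= threshold) == label for x, label in data) / len(data)

# Hypothetical validation set of (feature, label) pairs.
val = [(0.2, False), (0.4, False), (0.6, True), (0.9, True)]

# Grid search: evaluate each candidate hyperparameter on held-out data
# and keep the one with the best validation score.
grid = [0.1, 0.3, 0.5, 0.7]
best = max(grid, key=lambda t: accuracy(t, val))
```

With a real model the inner call would be "fit on train, score on validation" (e.g. scikit-learn's `GridSearchCV` automates exactly this loop with cross-validation).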
Hadoop Developer at Object Technology Solutions Inc. (OTSI)
Dear Candidate,
Greetings of the day!
As discussed, please find the job description below.
Job Title: Hadoop Developer
Experience: 3+ years
Job Location: New Delhi
Job Type: Permanent
Knowledge and Skills Required:
Brief Skills:
Hadoop, Spark, Scala and Spark SQL
Main Skills:
- Strong experience in Hadoop development
- Experience in Spark
- Experience in Scala
- Experience in Spark SQL
Why OTSI!
Working with OTSi gives you the assurance of a successful, fast-paced career.
Exposure to infinite opportunities to learn and grow, familiarization with cutting-edge technologies, cross-domain experience and a harmonious environment are some of the prime attractions for a career-driven workforce.
Join us today, as we assure you 2000+ friends and a great career; happiness begins at a great workplace!
Feel free to refer this opportunity to your friends and associates.
About OTSI (CMMI Level 3): Founded in 1999 and headquartered in Overland Park, Kansas, OTSI offers global reach and local delivery to companies of all sizes, from start-ups to Fortune 500s. Through offices across the US and around the world, we provide universal access to exceptional talent and innovative solutions in a variety of delivery models to reduce overall risk while optimizing outcomes and enabling our customers to thrive in a global economy.
OTSI's global presence, scalable and sustainable world-class infrastructure, business continuity processes, and ISO 9001:2000 and CMMI Level 3 certifications make us a preferred service provider for our clients. OTSI has expertise in different technologies, enhanced by our partnerships and alliances with industry giants like HP, Microsoft, IBM, Oracle, and SAP. Object Technology Solutions India Pvt Ltd is a leading global Information Technology (IT) services and solutions company offering a wide array of solutions for a range of key verticals. The company is headquartered in Overland Park, Kansas, and has a strong presence in the US, Europe, and Asia-Pacific, with a Global Delivery Center based in India. OTSI offers a broad range of IT application solutions and services, including e-Business solutions, Enterprise Resource Planning (ERP) implementation and post-implementation support, application development, application maintenance, and software customization services.
OTSI Partners & Practices
- SAP Partner
- Microsoft Silver Partner
- Oracle Gold Partner
- Microsoft CoE
- DevOps Consulting
- Cloud
- Mobile & IoT
- Digital Transformation
- Big data & Analytics
- Testing Solutions
OTSI Honor’s & Awards:
- #91 in the Inc. 5000
- Among the fastest-growing IT companies in the Inc. 5000
JD:
Required Skills:
- Intermediate to expert-level hands-on programming in one of the following languages: Java, Python, PySpark, or Scala.
- Strong practical knowledge of SQL.
- Hands-on experience with Spark/Spark SQL
- Data structures and algorithms
- Hands-on experience as an individual contributor in the design, development, testing, and deployment of applications based on Big Data technologies
- Experience with Big Data tools such as Hadoop, MapReduce, Spark, etc.
- Experience with NoSQL databases like HBase
- Experience with the Linux OS environment (shell scripting, AWK, SED)
- Intermediate RDBMS skills; able to write SQL queries with complex relations on top of a large RDBMS (100+ tables)
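The MapReduce model named in the requirements above can be sketched in a few lines of plain Python. This is a toy word count on invented documents, shown only to illustrate the map (emit key-value pairs) and reduce (aggregate per key) phases that Hadoop and Spark execute across a cluster:

```python
from collections import Counter
from itertools import chain

# Hypothetical "documents" standing in for files in HDFS.
docs = ["big data spark", "spark streaming", "big data hadoop"]

# Map phase: emit a (word, 1) pair for every word in every document.
mapped = chain.from_iterable(((w, 1) for w in d.split()) for d in docs)

# Reduce phase: sum the counts per word (what the shuffle+reduce does at scale).
counts = Counter()
for word, n in mapped:
    counts[word] += n
```

In Spark the same computation is a `flatMap` followed by `reduceByKey`; the single-process version above just makes the two phases explicit.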