Amazon EMR Jobs in Delhi, NCR and Gurgaon


Apply to 11+ Amazon EMR Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest Amazon EMR Job opportunities across top companies like Google, Amazon & Adobe.

codersbrain

at codersbrain

1 recruiter
Tanuj Uppal
Posted by Tanuj Uppal
Delhi
4 - 8 yrs
₹2L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more
  • Mandatory: hands-on experience in Python and PySpark.
  • Build PySpark applications using Spark DataFrames in Python, in Jupyter Notebook and PyCharm (IDE).
  • Experience optimizing Spark jobs that process huge volumes of data.
  • Hands-on experience with version-control tools such as Git.
  • Experience with Amazon's analytics services such as Amazon EMR and AWS Lambda.
  • Experience with Amazon's compute services such as AWS Lambda and Amazon EC2, storage services such as S3, and other services such as SNS.
  • Experience/knowledge of bash/shell scripting is a plus.
  • Experience working with fixed-width, delimited, and multi-record file formats.
  • Hands-on experience with tools like Jenkins to build, test, and deploy applications.
  • Awareness of DevOps concepts and ability to work in an automated release pipeline environment.
  • Excellent debugging skills.
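The fixed-width and delimited file-format experience asked for above can be illustrated with a short, dependency-free sketch; the field names and column widths here are hypothetical, not from any actual job:

```python
import csv
import io

# Hypothetical fixed-width layout: (field name, start offset, end offset).
LAYOUT = [("id", 0, 5), ("name", 5, 15), ("amount", 15, 23)]

def parse_fixed_width(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict, stripping pad characters."""
    return {name: line[start:end].strip() for name, start, end in layout}

def parse_delimited(text, delimiter="|"):
    """Delimited data is simpler: the csv module handles quoting and escapes."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

record = parse_fixed_width("00042Jane Doe  00123.45")
# record == {"id": "00042", "name": "Jane Doe", "amount": "00123.45"}
rows = parse_delimited("00042|Jane Doe|123.45")
```

Multi-record formats (where a record-type indicator selects the layout) extend the same idea with one layout table per record type.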
NCR (Delhi | Gurgaon | Noida)
2 - 6 yrs
₹10L - ₹25L / yr
Data Science
R Programming
Python
Machine Learning (ML)
Entity Framework
+2 more

Job Responsibilities

  • Design machine learning systems
  • Research and implement appropriate ML algorithms and tools
  • Develop machine learning applications according to requirements
  • Select appropriate datasets and data representation methods
  • Run machine learning tests and experiments
  • Perform statistical analysis and fine-tuning using test results
  • Train and retrain systems when necessary
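The "statistical analysis and fine-tuning using test results" step above can be sketched as a minimal, dependency-free grid search over one model parameter; the data and the candidate grid are made up for illustration:

```python
import random

random.seed(0)

# Synthetic data: y = 2x + noise, split into train and validation halves.
data = [(x, 2 * x + random.gauss(0, 0.5)) for x in range(40)]
train, valid = data[::2], data[1::2]

def mse(slope, points):
    """Mean squared error of the model y = slope * x on the given points."""
    return sum((y - slope * x) ** 2 for x, y in points) / len(points)

# Grid search: score each candidate slope on held-out validation data
# and keep the one with the lowest validation error.
grid = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
best = min(grid, key=lambda s: mse(s, valid))
print(best)  # the candidate closest to the true slope (2.0) should win
```

Real fine-tuning swaps the toy model for an actual estimator and the grid for a hyperparameter space, but the select-by-validation-error loop is the same.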

 

Requirements for the Job

 

  1. Bachelor's/Master's/PhD in Computer Science, Mathematics, Statistics, or an equivalent field from a tier-one college, with a minimum of 2 years of overall experience
  2. Minimum 1 year of experience working as a Data Scientist deploying ML at scale in production
  3. Experience with machine learning techniques (e.g. NLP, computer vision, BERT, LSTM) and frameworks (e.g. TensorFlow, PyTorch, scikit-learn)
  4. Working knowledge of deploying Python systems (using Flask, TensorFlow Serving)
  5. Previous experience in the following areas is preferred: Natural Language Processing (NLP) using LSTM and BERT; chatbots or dialogue systems; machine translation; text comprehension; text summarization
  6. Computer vision: deep neural networks/CNNs for object detection and image classification, transfer-learning pipelines, and object detection/instance segmentation (Mask R-CNN, YOLO, SSD)
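The deployment requirement above (serving a Python model over HTTP) names Flask and TensorFlow Serving; the same request/predict/respond shape can be sketched with only the standard library. The "model" here is a made-up linear scorer, and a real service would add input validation and error handling:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Stand-in 'model': a fixed linear scorer (weights are illustrative)."""
    weights = [0.4, 0.6]
    return sum(w * f for w, f in zip(weights, features))

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        features = json.loads(body)["features"]
        payload = json.dumps({"score": predict(features)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}",
    data=json.dumps({"features": [1.0, 2.0]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
server.shutdown()
```

In Flask the handler body becomes a route function; in TensorFlow Serving the model is exported and the server is managed for you, but the JSON-in/JSON-out contract is the same.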
Hyderabad, Bengaluru (Bangalore), Delhi
2 - 5 yrs
₹3L - ₹8L / yr
Artificial Intelligence (AI)
Machine Learning (ML)
Python
Agile/Scrum
Job Description

Artificial Intelligence (AI) Researchers and Developers

The successful candidate will be part of highly productive teams working on core AI algorithms, cryptography libraries, AI-enabled products, and intelligent 3D interfaces. Candidates will work on cutting-edge products and technologies in highly challenging domains, and will need the highest level of commitment and an interest in learning new technologies and domain-specific subject matter very quickly. Successful completion of projects may require travel and working at remote customer locations for extended periods.

Education Qualification: Bachelor's, Master's, or PhD degree in Computer Science, Mathematics, Electronics, or Information Systems from a reputed university, and/or equivalent knowledge and skills.

Location : Hyderabad, Bengaluru, Delhi, Client Location (as needed)

Skillset and Expertise
• Strong software development experience using Python
• Strong background in mathematical, numerical and scientific computing using Python.
• Knowledge in Artificial Intelligence/Machine learning
• Experience working with SCRUM software development methodology
• Strong experience implementing web services, web clients, and JSON-based protocols is required
• Experience with Python metaprogramming
• Strong analytical and problem-solving skills
• Design, develop, and debug enterprise-grade software products and systems
• Software systems testing methodology, including writing and executing test plans, debugging, and testing scripts and tools
• Excellent written and verbal communication skills; proficiency in English; verbal communication in Hindi and other local Indian languages is a plus
• Ability to effectively communicate product design, functionality and status to management, customers and other stakeholders
• Highest level of integrity and work ethic

Frameworks
1. Scikit-learn
2. Tensorflow
3. Keras
4. OpenCV
5. Django
6. CUDA
7. Apache Kafka

Mathematics
1. Advanced Calculus
2. Numerical Analysis
3. Complex Function Theory
4. Probability

Concepts (One or more of the below)
1. OpenGL based 3D programming
2. Cryptography
3. Artificial Intelligence (AI) algorithms: a) statistical modelling, b) DNNs, c) RNNs, d) LSTMs, e) GANs, f) CNNs
leading pharmacy provider
Agency job
via Econolytics by Jyotsna Econolytics
Noida, NCR (Delhi | Gurgaon | Noida)
4 - 10 yrs
₹18L - ₹24L / yr
Data Science
Python
R Programming
Algorithms
Predictive modelling
Job Description:

• Help build a Data Science team which will be engaged in researching, designing, implementing, and deploying full-stack, scalable data analytics and machine learning solutions to address various business issues.
• Model complex algorithms, discover insights, and identify business opportunities through the use of algorithmic, statistical, visualization, and mining techniques.
• Translate business requirements into quick prototypes and enable the development of big-data capabilities driving business outcomes.
• Responsible for data governance and defining data collection and collation guidelines.
• Must be able to advise, guide, and train junior data engineers in their jobs.

Must Have:

• 4+ years of experience in a leadership role as a Data Scientist
• Preferably from the retail, manufacturing, or healthcare industry (not mandatory)
• Willing to start from scratch and build up a team of Data Scientists
• Open to taking up challenges with end-to-end ownership
• Confident, with excellent communication skills and good decision-making ability
Infogain
Agency job
via Technogen India PvtLtd by RAHUL BATTA
Bengaluru (Bangalore), Pune, Noida, NCR (Delhi | Gurgaon | Noida)
7 - 10 yrs
₹20L - ₹25L / yr
Data engineering
Python
SQL
Spark
PySpark
+10 more
  1. Sr. Data Engineer:

Core Skills – Data Engineering, Big Data, PySpark, Spark SQL, and Python

Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred

Major accountabilities:

  • Responsible for data engineering, Foundry data pipeline creation, Foundry analysis & reporting, Slate application development, reusable code development & management, and integrating internal or external systems with Foundry for high-quality data ingestion.
  • Good understanding of the Foundry platform landscape and its capabilities.
  • Performs data analysis required to troubleshoot data-related issues and assists in their resolution.
  • Defines company data assets (data models) and the PySpark/Spark SQL jobs that populate them.
  • Designs data integrations and a data-quality framework.
  • Designs and implements integration with internal and external systems and the F1 AWS platform using Foundry Data Connector or Magritte agent.
  • Collaborates with data scientists, data analysts, and technology teams to document and leverage their understanding of Foundry's integration with different data sources; actively participates in agile work practices.
  • Coordinates with quality engineers to ensure all quality controls, naming conventions, and best practices are followed.

Desired Candidate Profile :

  • Strong data engineering background
  • Experience with Clinical Data Model is preferred
  • Experience in
    • SQL Server ,Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
    • Java and Groovy for our back-end applications and data integration tools
    • Python for data processing and analysis
    • Cloud infrastructure based on AWS EC2 and S3
  • 7+ years of IT experience, including 2+ years' experience on the Palantir Foundry platform and 4+ years' experience on big-data platforms
  • 5+ years of Python and PySpark development experience
  • Strong troubleshooting and problem-solving skills
  • BTech or master's degree in computer science or a related technical field
  • Experience designing, building, and maintaining big-data pipeline systems
  • Hands-on experience with the Palantir Foundry platform and Foundry custom app development
  • Able to design and implement data integration between Palantir Foundry and external apps based on the Foundry data connector framework
  • Hands-on in programming languages, primarily Python, R, Java, and Unix shell scripts
  • Hands-on experience with the AWS/Azure cloud platform and stack
  • Strong in API-based architecture and concepts; able to do quick PoCs using API integration and development
  • Knowledge of machine learning and AI
  • Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.

  • Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision

SmartJoules

at SmartJoules

1 video
9 recruiters
Saksham Dutta
Posted by Saksham Dutta
Remote, NCR (Delhi | Gurgaon | Noida)
3 - 5 yrs
₹8L - ₹12L / yr
Machine Learning (ML)
Python
Big Data
Apache Spark
Deep Learning

Responsibilities:

  • Explore and visualize data to gain an understanding of it, then identify differences in data distribution that could affect performance when deploying the model in the real world.
  • Verify data quality, and/or ensure it via data cleaning.
  • Adapt and work fast to produce output that improves stakeholders' decision-making using ML.
  • Design and develop machine learning systems and schemes.
  • Perform statistical analysis and fine-tune models using test results.
  • Train and retrain ML systems and models as and when necessary.
  • Deploy ML models to production and manage the cost of cloud infrastructure.
  • Develop machine learning apps according to client and data-scientist requirements.
  • Analyze the problem-solving capabilities and use cases of ML algorithms, and rank them by how successfully they meet the objective.


Technical Knowledge:


  • Has worked on real-time problems, solved them using ML and deep learning models deployed in real time, and has strong projects to showcase.
  • Proficiency in Python and experience with the Jupyter framework, Google Colab, and cloud-hosted notebooks such as AWS SageMaker, Databricks, etc.
  • Proficiency with libraries such as scikit-learn, TensorFlow, OpenCV, PySpark, pandas, NumPy, and related libraries.
  • Expert in visualizing and manipulating complex datasets.
  • Proficiency with visualization libraries such as seaborn, plotly, matplotlib, etc.
  • Proficiency in the linear algebra, statistics, and probability required for machine learning.
  • Proficiency in ML algorithms, for example gradient boosting, stacked machine learning, classification algorithms, and deep learning algorithms; experience hyperparameter-tuning various models and comparing algorithm performance.
  • Big-data technologies such as the Hadoop stack and Spark.
  • Basic use of cloud VMs (e.g. EC2).
  • Brownie points for Kubernetes and task queues.
  • Strong written and verbal communication.
  • Experience working in an Agile environment.
Oil & Energy Industry
NCR (Delhi | Gurgaon | Noida)
1 - 3 yrs
₹8L - ₹12L / yr
Machine Learning (ML)
Data Science
Deep Learning
Digital Signal Processing
Statistical signal processing
+6 more
  • Understanding business objectives and developing models that help to achieve them, along with metrics to track their progress
  • Managing available resources such as hardware, data, and personnel so that deadlines are met
  • Analysing the ML algorithms that could be used to solve a given problem and ranking them by their success probability
  • Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world
  • Verifying data quality, and/or ensuring it via data cleaning
  • Supervising the data acquisition process if more data is needed
  • Defining validation strategies
  • Defining the pre-processing or feature engineering to be done on a given dataset
  • Defining data augmentation pipelines
  • Training models and tuning their hyperparameters
  • Analysing the errors of the model and designing strategies to overcome them
  • Deploying models to production
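Several of the steps listed above (a holdout validation strategy, feature pre-processing, training, and measuring error) can be sketched end-to-end with a toy nearest-neighbour classifier; the dataset is synthetic and no ML framework is assumed:

```python
import random
import statistics

random.seed(1)

# Toy dataset: two clusters, label 0 around (0, 0) and label 1 around (5, 5).
points = [([random.gauss(c, 1.0), random.gauss(c, 1.0)], label)
          for label, c in [(0, 0.0), (1, 5.0)] for _ in range(30)]
random.shuffle(points)
train, valid = points[:40], points[40:]          # validation strategy: holdout

# Pre-processing: z-score each feature using training-set statistics only,
# so no information leaks from the validation split.
def scaler(rows):
    cols = list(zip(*[f for f, _ in rows]))
    stats = [(statistics.mean(c), statistics.stdev(c)) for c in cols]
    return lambda f: [(x - m) / s for x, (m, s) in zip(f, stats)]

scale = scaler(train)

def predict(features):
    """1-nearest-neighbour on the scaled training set."""
    sf = scale(features)
    _, label = min(train, key=lambda p: sum((a - b) ** 2
                                            for a, b in zip(scale(p[0]), sf)))
    return label

accuracy = sum(predict(f) == y for f, y in valid) / len(valid)
print(accuracy)  # well-separated clusters, so accuracy should be near 1.0
```

Error analysis then means inspecting the validation points the model got wrong, which in a real pipeline drives the feature-engineering and augmentation steps above.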
Cemtics

at Cemtics

1 recruiter
Tapan Sahani
Posted by Tapan Sahani
Remote, NCR (Delhi | Gurgaon | Noida)
4 - 6 yrs
₹5L - ₹12L / yr
Big Data
Spark
Hadoop
SQL
Python
+1 more

JD:

Required Skills:

  • Intermediate to expert-level hands-on programming in one of the following languages: Java, Python, PySpark, or Scala.
  • Strong practical knowledge of SQL.
  • Hands-on experience with Spark/Spark SQL.
  • Data structures and algorithms.
  • Hands-on experience as an individual contributor in the design, development, testing, and deployment of Big Data applications.
  • Experience with Big Data tools such as Hadoop, MapReduce, and Spark.
  • Experience with NoSQL databases such as HBase.
  • Experience with the Linux OS environment (shell scripting, AWK, SED).
  • Intermediate RDBMS skills: able to write SQL queries with complex relations on top of a large RDBMS (100+ tables).
PayU

at PayU

1 video
6 recruiters
Deeksha Srivastava
Posted by Deeksha Srivastava
Gurgaon, NCR (Delhi | Gurgaon | Noida)
1 - 3 yrs
₹7L - ₹15L / yr
Python
R Programming
Data Analytics
R

What you will be doing:

As a part of the Global Credit Risk and Data Analytics team, this person will be responsible for carrying out the following analytical initiatives:

  • Dive into the data and identify patterns
  • Development of end-to-end Credit models and credit policy for our existing credit products
  • Leverage alternate data to develop best-in-class underwriting models
  • Working on Big Data to develop risk analytical solutions
  • Development of Fraud models and fraud rule engine
  • Collaborate with various stakeholders (e.g. tech, product) to understand and design best solutions which can be implemented
  • Working on cutting-edge techniques e.g. machine learning and deep learning models

Example of projects done in past:

  • LazyPay credit risk model using the CatBoost modelling technique; end-to-end pipeline for feature engineering and model deployment in production using Python
  • Fraud model development, deployment and rules for EMEA region

 

Basic Requirements:

  • 1-3 years of work experience as a Data Scientist (in the credit domain)
  • 2016 or 2017 batch from a premium college (e.g. B.Tech. from IITs or NITs, Economics from DSE/ISI, etc.)
  • Strong problem-solving skills; able to understand and execute complex analyses
  • Experience in at least one of R/Python/SAS, plus SQL
  • Experience in the credit industry (fintech/bank)
  • Familiarity with the best practices of Data Science

 

Add-on Skills : 

  • Experience in working with big data
  • Solid coding practices
  • Passion for building new tools/algorithms
  • Experience in developing Machine Learning models
LimeTray

at LimeTray

5 recruiters
tanika monga
Posted by tanika monga
NCR (Delhi | Gurgaon | Noida)
4 - 6 yrs
₹15L - ₹18L / yr
Machine Learning (ML)
Python
Cassandra
MySQL
Apache Kafka
+2 more
Requirements:

  • Minimum 4 years' work experience building, managing, and maintaining analytics applications
  • B.Tech/BE in CS/IT from Tier 1/2 institutes
  • Strong fundamentals of data structures and algorithms
  • Good analytical & problem-solving skills
  • Strong hands-on experience in Python
  • In-depth knowledge of queueing systems (Kafka/ActiveMQ/RabbitMQ)
  • Experience building data pipelines & real-time analytics systems
  • Experience with SQL (MySQL) & NoSQL (Mongo/Cassandra) databases is a plus
  • Understanding of service-oriented architecture
  • Delivered high-quality work with significant contributions
  • Expert in git, unit tests, technical documentation, and other development best practices
  • Experience handling small teams
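The queueing-system knowledge asked for here (Kafka/ActiveMQ/RabbitMQ) centres on the produce/consume pattern; real brokers add persistence, partitioning, and acknowledgements, but the core pattern can be sketched with Python's standard-library `queue` module (the message schema below is made up):

```python
import queue
import threading

q = queue.Queue(maxsize=100)   # bounded, like a broker applying backpressure
results = []

def producer():
    for i in range(5):
        q.put({"event_id": i, "payload": i * i})   # hypothetical message schema
    q.put(None)                                    # sentinel: end of stream

def consumer():
    while (msg := q.get()) is not None:
        results.append(msg["payload"])             # "process" the message
        q.task_done()

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [0, 1, 4, 9, 16]
```

The bounded queue is the key design point: when the consumer falls behind, `put` blocks the producer instead of letting memory grow, which is the in-process analogue of broker backpressure.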
YCH Logistics

at YCH Logistics

1 recruiter
Sanatan Upmanyu
Posted by Sanatan Upmanyu
NCR (Delhi | Gurgaon | Noida)
0 - 5 yrs
₹2L - ₹5L / yr
Python
Deep Learning
MySQL
Job Description: Data Science Analyst / Data Science Senior Analyst

KSTYCH is seeking a Data Science Analyst to join our Data Science team. Individuals in this role are expected to be comfortable working as both a software engineer and a quantitative researcher, and should have a significant theoretical foundation in mathematical statistics. The ideal candidate will have a keen interest in the study of the pharma sector, network biology, text mining, and machine learning, and a passion for identifying and answering questions that help us build the best consulting resource and provide continuous support to other teams.

Responsibilities:

  • Work closely with product, scientific, medical, business development, and commercial teams to identify and answer important healthcare/pharma/biology questions.
  • Answer questions by applying appropriate statistical techniques and tools to available data.
  • Communicate findings to project managers and team managers.
  • Drive the collection of new data and the refinement of existing data sources.
  • Analyze and interpret the results of experiments.
  • Develop best practices for instrumentation and experimentation and communicate them to other teams.

Requirements:

  • B.Tech, M.Tech, M.S., or Ph.D. in a relevant technical field, or 1+ years' experience in a relevant role
  • Extensive experience solving analytical problems using quantitative approaches
  • Comfort manipulating and analyzing complex, high-volume, high-dimensionality data from varying sources
  • A strong passion for empirical research and for answering hard questions with data
  • A flexible analytic approach that allows for results at varying levels of precision
  • Ability to communicate complex quantitative analysis in a clear, precise, and actionable manner
  • Fluency with at least one scripting language such as Python or PHP
  • Familiarity with relational databases and SQL
  • Experience working with large data sets; experience with distributed computing tools (KNIME, MapReduce, Hadoop, Hive, etc.) is a plus