Amazon EMR Jobs in Delhi, NCR and Gurgaon

Apply to 11+ Amazon EMR Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest Amazon EMR Job opportunities across top companies like Google, Amazon & Adobe.

Python developer

at codersbrain

1 recruiter

Posted by Tanuj Uppal

Delhi

4 - 8 yrs

₹2L - ₹15L / yr

Spark

Hadoop

Big Data

Data engineering

PySpark

+5 more

Mandatory: hands-on experience in Python and PySpark.

Build PySpark applications using Spark DataFrames in Python, working in Jupyter Notebook and the PyCharm IDE.

Experience optimizing Spark jobs that process huge volumes of data.

Hands-on experience with version control tools like Git.

Experience with Amazon’s analytics services such as Amazon EMR, along with Lambda functions.

Experience with Amazon’s compute services such as AWS Lambda and Amazon EC2, storage services such as Amazon S3, and a few other services such as Amazon SNS.

Experience with or knowledge of Bash/shell scripting is a plus.

Experience working with fixed-width, delimited, and multi-record file formats.

Hands-on experience with tools like Jenkins to build, test, and deploy applications.

Awareness of DevOps concepts and the ability to work in an automated release pipeline environment.

Excellent debugging skills.

Data Engineer

MNC Company - Product Based

Agency job

via Bharat Headhunters by Ranjini C. N

Bengaluru (Bangalore), Chennai, Hyderabad, Pune, Delhi, Gurugram, Noida, Ghaziabad, Faridabad

5 - 9 yrs

₹10L - ₹15L / yr

Data Warehouse (DWH)

Informatica

ETL

Python

Google Cloud Platform (GCP)

+2 more

Job Responsibilities

Design, build & test ETL processes using Python & SQL for the corporate data warehouse
Inform, influence, support, and execute our product decisions
Maintain advertising data integrity by working closely with R&D to organize and store data in a format that provides accurate data and allows the business to quickly identify issues.
Evaluate and prototype new technologies in the area of data processing
Think quickly, communicate clearly and work collaboratively with product, data, engineering, QA and operations teams
High energy level, strong team player and good work ethic
Data analysis, understanding of business requirements and translation into logical pipelines & processes
Identification, analysis & resolution of production & development bugs
Support the release process including completing & reviewing documentation
Configure data mappings & transformations to orchestrate data integration & validation
Provide subject matter expertise
Document solutions, tools & processes
Create & support test plans with hands-on testing
Peer reviews of work developed by other data engineers within the team
Establish good working relationships & communication channels with relevant departments

Skills and Qualifications we look for

University degree 2.1 or higher (or equivalent) in a relevant subject. Master’s degree in any data subject will be a strong advantage.
4-6 years of experience in data engineering.
Strong coding ability and software development experience in Python.
Strong hands-on experience with SQL and Data Processing.
Google Cloud Platform (Cloud Composer, Dataflow, Cloud Functions, BigQuery, Cloud Storage, Dataproc).
Good working experience with at least one ETL tool (Airflow preferred).
Should possess strong analytical and problem-solving skills.
Good-to-have skills: PySpark, CircleCI, Terraform.
Motivated, self-directed, able to work with ambiguity and interested in emerging technologies, agile and collaborative processes.
Understanding and experience of Agile/Scrum delivery methodology.

Azure Data Engineer

at Epik Solutions

Posted by Sakshi Sarraf

Bengaluru (Bangalore), Noida

4 - 13 yrs

₹7L - ₹18L / yr

Python

SQL

Databricks

Scala

Spark

+2 more

Job Description:

As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:

Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.

Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.

Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.

Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.

Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.

Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.

Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.

Skills and Qualifications:

Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.

Proficiency in designing and developing data pipelines and ETL processes.

Solid understanding of data modeling concepts and database design principles.

Familiarity with data integration and orchestration using Azure Data Factory.

Knowledge of data quality management and data governance practices.

Experience with performance tuning and optimization of data pipelines.

Strong problem-solving and troubleshooting skills related to data engineering.

Excellent collaboration and communication skills to work effectively in cross-functional teams.

Understanding of cloud computing principles and experience with Azure services.