ETL Engineer - Data Pipeline

at Data ToBiz

Posted by PS Dhillon

2 - 6 yrs

₹7L - ₹15L / yr

Chandigarh, Delhi, Gurugram, Noida

Skills

ETL

Amazon Web Services (AWS)

Amazon Redshift

Python

Job Responsibilities:
Develop new data pipelines and ETL jobs that process millions of records and scale with growth.
Optimise pipelines to handle real-time, batch-update and historical data.
Establish scalable, efficient, automated processes for complex, large scale data analysis.
Write high quality code to gather and manage large data sets (both real time and batch data) from multiple sources, perform ETL and store it in a data warehouse.
Manipulate and analyse complex, high-volume, high-dimensional data from varying sources using a variety of tools and data analysis techniques.
Participate in data pipelines health monitoring and performance optimisations as well as quality documentation.
Interact with end users/clients and translate business language into technical requirements.
Act independently to expose and resolve problems.

Job Requirements:
2+ years of experience in software development and data pipeline development for enterprise analytics.
2+ years of experience with Python, with exposure to various warehousing tools.
In-depth experience with any commercial tool such as AWS Glue, Talend, Informatica, DataStage, etc.
Experience with various relational databases like MySQL, MS SQL Server, Oracle, etc. is a must.
Experience with analytics and reporting tools (Tableau, Power BI, SSRS, SSAS).
Experience in various DevOps practices helping the client to deploy and scale the systems as per requirement.
Strong verbal and written communication skills for working with other developers and business clients.
Knowledge of Logistics and/or Transportation Domain is a plus.
Hands-on experience with traditional databases and ERP systems like Sybase and PeopleSoft.

About Data ToBiz

Founded: 2017

Size: 20-100

Stage: Bootstrapped

About

With vision comes insight, and with insight comes faith. We deliver precise information and insights so that you can visualize the facts and make the required decision with confidence when it comes to a business move. Everything now is based on assurance, the assurance that comes from information derived from collected raw data. We at DataToBiz help frame and explore that raw data to bring forth the facts behind it. These facts can help you fuel your business to rise above all others.

Connect with the team

Ankush Sharma

PS Dhillon

Similar jobs

Data Scientist

at IT-Startup In Chennai

Agency job

via People First Consultants by Jayaraj E

Chennai

3 - 5 yrs

₹12L - ₹20L / yr

Data Science

Data Scientist

R Programming

Python

Machine Learning (ML)

3+ years experience in practical implementation and deployment of ML based systems preferred.
BE/B Tech or M Tech (preferred) in CS/Engineering with strong mathematical/statistical background
Strong mathematical and analytical skills, especially statistical and ML techniques, with familiarity with different supervised and unsupervised learning algorithms
Implementation experience with and deep knowledge of Classification, Time Series Analysis, Pattern Recognition, Reinforcement Learning, Deep Learning, Dynamic Programming and Optimisation
Experience modeling graph structures related to spatiotemporal systems
Programming skills in Python
Experience in developing and deploying on cloud (AWS or Google or Azure)
Good verbal and written communication skills
Familiarity with well-known ML frameworks such as Pandas, Keras, TensorFlow

Data Engineer

at TEKsystems

1 recruiter

Posted by priyanka kanwar

Gurugram

5 - 10 yrs

₹15L - ₹25L / yr

Apache Spark

Amazon Web Services (AWS)

Python

airflow

Algorithms

TOP 3 SKILLS

Python (Language)

Spark Framework

Spark Streaming

Docker/Jenkins/Spinnaker

AWS

Hive Queries

He/She should be a good coder.

Preferred: Airflow

Must-have experience:

Python

Spark framework and streaming

Exposure to the Machine Learning lifecycle is mandatory.

Project:

This is a search-domain project. For any search activity on the website, this team creates the model, building sorting/scoring models for each search. This is done by the data scientists. This team works more on the streaming side of data; the candidate would work extensively on Spark Streaming, and there will be a lot of work in Machine Learning.

INTERVIEW INFORMATION

3-4 rounds.

1st round based on data engineering batching experience.

2nd round based on data engineering streaming experience.

3rd round based on ML lifecycle (the 3rd round can be a techno-functional round based on previous feedback; otherwise, a 4th, functional round will be held if required).

Snowflake developer

at Archwell

Agency job

via AVI Consulting LLP by Sravanthi Puppala

Mysore

2 - 8 yrs

₹1L - ₹15L / yr

Snowflake

Python

SQL

Amazon Web Services (AWS)

Windows Azure

Title: Data Engineer – Snowflake

Location: Mysore (Hybrid model)

Experience: 2-8 yrs

Type: Full Time

Walk-in date: 25th Jan 2023 @Mysore

Job Role: We are looking for an experienced Snowflake developer to join our team as a Data Engineer, working as part of a team to help design and develop data-driven solutions that deliver insights to the business. The ideal candidate is a data pipeline builder and data wrangler who enjoys building data-driven systems from the ground up to drive analytical solutions. You will be responsible for building and optimizing our data as well as building automated processes for production jobs. You will support our software developers, database architects, data analysts and data scientists on data initiatives.

Key Roles & Responsibilities:

Use advanced Snowflake, Python and SQL to extract data from source systems for ingestion into a data pipeline.
Design, develop and deploy scalable and efficient data pipelines.
Analyze and assemble large, complex datasets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements. For example: automating manual processes, optimizing data delivery, re-designing data platform infrastructure for greater scalability.
Build required infrastructure for optimal extraction, loading, and transformation (ELT) of data from various data sources using AWS and Snowflake leveraging Python or SQL technologies.
Monitor cloud-based systems and components for availability, performance, reliability, security and efficiency
Create and configure appropriate cloud resources to meet the needs of the end users.
As needed, document topology, processes, and solution architecture.
Share your passion for staying on top of tech trends, experimenting with and learning new technologies

Qualification & Experience Requirements:

Bachelor's degree in computer science, computer engineering, or a related field.
2-8 years of experience working with Snowflake
2+ years of experience with the AWS services.
Candidate should be able to write stored procedures and functions in Snowflake.
At least 2 years' experience as a Snowflake developer.
Strong SQL knowledge.
Data ingestion into Snowflake using Snowflake procedures.
ETL experience is a must (any tool).
Candidate should be familiar with the Snowflake architecture.
Experience working on a migration project.
DW concepts (optional).
Experience with cloud data storage and compute components including lambda functions, EC2s, containers.
Experience with data pipeline and workflow management tools: Airflow, etc.
Experience cleaning, testing, and evaluating data quality from a wide variety of ingestible data sources
Experience working with Linux and UNIX environments.
Experience with profiling data, with and without data definition documentation

Familiar with Git
Familiar with issue tracking systems like JIRA (Project Management Tool) or Trello.
Experience working in an agile environment.

Desired Skills:

Experience in Snowflake. Must be willing to be Snowflake certified in the first 3 months of employment.
Experience with a stream-processing system: Snowpipe
Working knowledge of AWS or Azure
Experience in migrating from on-prem to cloud systems

Principal Data Engineer

at Monarch Tractors India

Posted by Sowmya Adepu

Hyderabad

5 - 8 yrs

Best in industry

Python

Amazon Web Services (AWS)

PostgreSQL

Ubuntu

Web Service Definition Language (WSDL)

Designation: Principal Data Engineer

Experience: Experienced

Position Type: Full Time Position

Location: Hyderabad

Office Timings: 9AM to 6PM

Compensation: As Per Industry standards

About Monarch:

At Monarch, we’re leading the digital transformation of farming. Monarch Tractor augments both muscle and mind with fully loaded hardware, software, and service machinery that will spur future generations of farming technologies. With our farmer-first mentality, we are building a smart tractor that will enhance (not replace) the existing farm ecosystem, alleviate labor availability and cost issues, and provide an avenue for competitive organic and beyond farming by providing mechanical solutions to replace harmful chemical solutions. Despite all the cutting-edge technology we will incorporate, our tractor will still plow, till, and haul better than any other tractor in its class. We have all the necessary ingredients to develop, build and scale the Monarch Tractor and digitally transform farming around the world.

Description:

Monarch Tractor would like to invite an experienced Python data engineer to lead our internal data engineering team in India. This is a unique opportunity to work on computer vision AI data pipelines for electric tractors. You will be dealing with data from a farm environment, such as videos, images, tractor logs, GPS coordinates and map polygons. You will be responsible for collecting data for research and development. For example, this includes setting up ETL data pipelines to extract data from tractors, loading this data into the cloud and recording AI training results.

This role includes, but is not limited to, the following tasks:

● Lead data engineering team

● Own and contribute to more than 50% of the data engineering code base

● Scope out new project requirements

● Cost data pipeline solutions

● Create data engineering tooling

● Design custom data structures for efficient processing of data

Data engineering skills we are looking for:

● Able to work with large amounts of text log data, image data, and video data

● Fluently use AWS cloud solutions like S3, Lambda, and EC2

● Able to work with data from Robot Operating System

Required Experience:

● 3 to 5 years of experience using Python

● 3 to 5 years of experience using PostgreSQL

● 3 to 5 years of experience using AWS EC2, S3, Lambda

● 3 to 5 years of experience using Ubuntu OS or WSL

Good to have experience:

● Ray

● Robot Operating System

What you will get:

At Monarch Tractor, you’ll play a key role on a capable, dedicated, high-performing team of rock stars. Our compensation package includes a competitive salary, excellent health, dental and vision benefits, and company equity commensurate with the role you’ll play in our success.

BI Developer

at Agiletech Info Solutions pvt ltd

Posted by Premkumar S

Chennai

4 - 7 yrs

₹7L - ₹16L / yr

SQL server

SSIS

ETL

ETL QA

ADF

Proficient with SQL Server/T-SQL programming in creation and optimization of complex Stored Procedures, UDF, CTE and Triggers
Overall Experience should be between 4 to 7 years
Experience working in a data warehouse environment and a strong understanding of dimensional data modeling concepts. Experience in SQL Server, DW principles and SSIS.
Strong experience in building data transformations with SSIS, including importing data from files and moving data from source to destination.
Experience creating new SSIS packages or modifying existing SSIS packages using SQL Server.
Ability to debug and fine-tune SSIS processes to ensure accurate and efficient movement of data. Experience with ETL testing & data validation.
1+ years of experience with Azure services like Azure Data Factory, Data Flow, Azure Blob Storage, etc.
1+ years of experience developing Azure Data Factory objects: ADF pipelines, configuration, parameters, variables, Integration services runtime.
Must be able to build Business Intelligence solutions in a collaborative, agile development environment.
Reporting experience with Power BI or SSRS is a plus.
Experience working on an Agile/Scrum team preferred.
Proven strong problem-solving skills, troubleshooting, and root cause analysis.
Excellent written and verbal communication skills.

Data Engineering Manager

at Network Science

Posted by Leena Shirsale

Mumbai, Navi Mumbai

5 - 8 yrs

₹20L - ₹25L / yr

ETL

Informatica

Data Warehouse (DWH)

Data engineering

Data Science

Collaborate with the business teams to understand the data environment in the organization; develop and lead the Data Scientists team to test and scale new algorithms through pilots and subsequent scaling up of the solutions
Influence, build and maintain the large-scale data infrastructure required for the AI projects, and integrate with external IT infrastructure/service
Act as the single point of contact for all data-related queries; maintain a strong understanding of internal and external data sources; provide inputs in deciding data schemas
Design, develop and maintain the framework for the analytics solutions pipeline
Provide inputs to the organization’s initiatives on data quality and help implement frameworks and tools for the various related initiatives
Work in cross-functional teams of software/machine learning engineers, data scientists, product managers, and others to build the AI ecosystem
Collaborate with the external organizations including vendors, where required, in respect of all data-related queries as well as implementation initiatives

DBA SDE

at Play Games24x7

2 recruiters

Agency job

via zyoin by Deepana Shahabadi

Remote, Bengaluru (Bangalore)

4 - 8 yrs

₹15L - ₹30L / yr

Python

DBA

MongoDB

MySQL

Cassandra

Games24x7 was one of the first entrants in the gaming industry in 2006, when India started showing the first signs of promise for online gaming. We turned profitable by 2010 in just four years and grew 200x in the next decade. We are a technology powered analytics and data science company that happens to love games!
In 2018-19, the mobile games market in India generated over $600 million in revenues. With close to 450 people in its Mumbai and Bangalore offices, Games24x7 is India’s largest mobile games business today and is very well positioned to become the 800-pound gorilla of what will be a $2 billion market by 2022. While Games24x7 continues to invest aggressively in its India centric mobile games, it is also diversifying its business by investing in international gaming and other tech opportunities.

Summary of Role
Position/Role Description :
The candidate will be part of a team managing databases (MySQL, MongoDB, Cassandra) and will be involved in designing, configuring and maintaining databases.
Job Responsibilities:
• Complete involvement in the database requirement starting from the design phase for every project.
• Deploying required database assets on production (DDL, DML)
• Good understanding of MySQL Replication (Master-slave, Master-Master, GTID-based)
• Understanding of MySQL partitioning.
• Good understanding of MySQL logs and configuration.
• Ways to schedule backup and restoration.
• Good understanding of MySQL versions and their features.
• Good understanding of InnoDB-Engine.
• Exploring ways to optimize the current environment and also lay a good platform for new projects.
• Able to understand and resolve any database related production outages.

Job Requirements:
• BE/B.Tech from a reputed institute
• Experience in Python scripting.
• Experience in shell scripting.
• General understanding of system hardware.
• Experience in MySQL is a must.
• Experience in MongoDB, Cassandra and graph databases is preferred.
• Experience with Percona MySQL tools.
• 6 - 8 years of experience.

Job Location: Bengaluru

Data Scientist

at El Corte Inglés

Posted by Saradhi Reddy

Hyderabad

3 - 7 yrs

₹10L - ₹25L / yr

Data Science

R Programming

Python

Talend Developer

at Product based company

Agency job

via Crewmates by Gowtham V

Coimbatore

4 - 15 yrs

₹5L - ₹20L / yr

ETL

Talend

Hi Professionals,
Role : Talend developer
Location : Coimbatore
Experience : 4+ years
Skills : Talend, any DB
Notice period : Immediate to 15 Days

Data Engineer

at Codalyze Technologies

4 recruiters

Posted by Aishwarya Hire

Mumbai

3 - 7 yrs

₹7L - ₹20L / yr

Hadoop

Big Data

Scala

Spark

Amazon Web Services (AWS)

Job Overview :

Your mission is to help lead the team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission-critical problems, internally and externally. Your quest to embrace leading-edge technologies and methodologies inspires your team to follow suit.

Responsibilities and Duties :

- As a Data Engineer, you will be responsible for developing data pipelines for numerous applications handling all kinds of data: structured, semi-structured and unstructured. Big data knowledge, especially in Spark & Hive, is highly preferred.

- Work in a team and provide proactive technical oversight, advising development teams and fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions

Education level :

- Bachelor's degree in Computer Science or equivalent

Experience :

- Minimum 5 years of relevant, hands-on experience in end-to-end software development on production-grade projects

- Expertise in application, data and infrastructure architecture disciplines

- Expertise in designing data integrations using ETL and other data integration patterns

- Advanced knowledge of architecture, design and business processes

Proficiency in :

- Modern programming languages like Java, Python, Scala

- Big Data technologies Hadoop, Spark, HIVE, Kafka

- Writing decently optimized SQL queries

- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)

- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions

- Knowledge of system development lifecycle methodologies, such as waterfall and AGILE.

- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional modeling, and metadata modeling practices.

- Experience generating physical data models and the associated DDL from logical data models.

- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping, and data rationalization artifacts.

- Experience enforcing data modeling standards and procedures.

- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.

- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals

Skills :

Must Know :

- Core big-data concepts

- Spark - PySpark/Scala

- Data integration tool like Pentaho, NiFi, SSIS, etc. (at least 1)

- Handling of various file formats

- Cloud platform - AWS/Azure/GCP

- Orchestration tool - Airflow