
AWS Data Engineer

Posted by Disha Thakar
2 - 15 yrs
₹6L - ₹40L / yr
Remote only
Skills
Amazon Web Services (AWS)
PySpark
Amazon Athena
Data engineering

As an AWS Data Engineer, you are a full-stack data engineer who loves solving business problems. You work with business leads, analysts, and data scientists to understand the business domain, and you engage with fellow engineers to build data products that empower better decision-making. You are passionate about the quality of the data behind our business metrics and about building flexible solutions that scale to answer broader business questions.


If you love to solve problems using your skills, then come join Team Mactores. We have a casual and fun office environment that actively steers clear of rigid "corporate" culture, focuses on productivity and creativity, and allows you to be part of a world-class team while still being yourself.

What will you do?

  • Write efficient code in PySpark and AWS Glue (see the sketch after this list)
  • Write SQL queries in Amazon Athena and Amazon Redshift
  • Explore new technologies and learn new techniques to solve business problems creatively
  • Collaborate with engineering and business teams to build better data products and services
  • Deliver projects collaboratively with the team and keep customers updated on time
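
For context, here is a minimal sketch of the kind of Glue-based PySpark ETL job this role involves: read from the Glue Data Catalog, aggregate, and write partitioned Parquet that Athena can query. The database, table, and bucket names (sales_db, raw_orders, s3://my-bucket/...) are hypothetical, not Mactores' actual resources.

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (hypothetical database/table).
orders = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
).toDF()

# Aggregate completed orders per day.
daily = (
    orders.filter("status = 'COMPLETE'")
          .groupBy("order_date")
          .sum("amount")
          .withColumnRenamed("sum(amount)", "total_amount")
)

# Partitioned Parquet on S3 keeps Athena scans cheap.
daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://my-bucket/curated/daily_orders/"  # hypothetical bucket
)
job.commit()
```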


What are we looking for?

  • 1 to 3 years of experience in Apache Spark, PySpark, and AWS Glue
  • 2+ years of experience writing ETL jobs using PySpark and Spark SQL
  • 2+ years of experience with SQL queries and stored procedures
  • A deep understanding of the DataFrame API and the transformation functions supported by recent Spark releases (sketched below)
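
A small illustration of the DataFrame API transformation functions alongside the equivalent Spark SQL; the column names and data are invented for the example.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("transform-demo").getOrCreate()

df = spark.createDataFrame([("a", 3), ("a", 5), ("b", 2)], ["key", "value"])

# Typical DataFrame transformations: filter, derive a column, aggregate.
result = (
    df.filter(F.col("value") > 2)
      .withColumn("value_sq", F.col("value") ** 2)
      .groupBy("key")
      .agg(F.sum("value").alias("total"))
)

# The same aggregation expressed as Spark SQL over a temp view.
df.createOrReplaceTempView("t")
spark.sql(
    "SELECT key, SUM(value) AS total FROM t WHERE value > 2 GROUP BY key"
).show()
```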


You will be preferred if you have

  • Prior experience in working on AWS EMR, Apache Airflow
  • Certifications: AWS Certified Big Data – Specialty, Cloudera Certified Big Data Engineer, or Hortonworks Certified Big Data Engineer
  • Understanding of DataOps Engineering


Life at Mactores


We care about creating a culture that makes a real difference in the lives of every Mactorian. Our 10 Core Leadership Principles, which honor decision-making, leadership, collaboration, and curiosity, drive how we work:


1. Be one step ahead

2. Deliver the best

3. Be bold

4. Pay attention to detail

5. Enjoy the challenge

6. Be curious and take action

7. Take leadership

8. Own it

9. Deliver value

10. Be collaborative


You can read more about our work culture at https://mactores.com/careers


The Path to Joining the Mactores Team

At Mactores, our recruitment process is structured around three distinct stages:


Pre-Employment Assessment: You will be invited to participate in a series of pre-employment evaluations to assess your technical proficiency and suitability for the role.


Managerial Interview: The hiring manager will engage with you in multiple discussions, lasting anywhere from 30 minutes to an hour, to assess your technical skills, hands-on experience, leadership potential, and communication abilities.


HR Discussion: During this 30-minute session, you'll have the opportunity to discuss the offer and next steps with a member of the HR team.


At Mactores, we are committed to providing equal opportunities in all of our employment practices, and we do not discriminate based on race, religion, gender, national origin, age, disability, marital status, military status, genetic information, or any other category protected by federal, state, and local laws. This policy extends to all aspects of the employment relationship, including recruitment, compensation, promotions, transfers, disciplinary action, layoff, training, and social and recreational programs. All employment decisions will be made in compliance with these principles.


About Mactores Cognition Private Limited

Founded: 2008
Size: 20-100
Stage: Bootstrapped
About

Mactores is a global technology consulting and product company focused on delivering solutions across Cloud, Big Data, Deep Analytics, DevOps, IoT, and AI.


Similar jobs

ZeMoSo Technologies
Posted by HR Team
Remote only
3 - 6 yrs
Best in industry
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm
+5 more

Job Description: 

Machine Learning / AI Engineer (with 3+ years of experience)


We are seeking a highly skilled and passionate Machine Learning / AI Engineer to join our newly established data science practice area. In this role, you will primarily focus on working with Large Language Models (LLMs) and contribute to building generative AI applications. This position offers an exciting opportunity to shape the future of AI technology while charting an interesting career path within our organization.


Responsibilities:


1. Develop and implement machine learning models: Utilize your expertise in machine learning and artificial intelligence to design, develop, and deploy cutting-edge models, with a particular emphasis on Large Language Models (LLMs). Apply your knowledge to solve complex problems and optimize performance.


2. Building generative AI applications: Collaborate with cross-functional teams to conceptualize, design, and build innovative generative AI applications. Work on projects that push the boundaries of AI technology and deliver impactful solutions to real-world problems.


3. Data preprocessing and analysis: Collect, clean, and preprocess large volumes of data for training and evaluation purposes. Conduct exploratory data analysis to gain insights and identify patterns that can enhance the performance of AI models.


4. Model training and evaluation: Develop robust training pipelines for machine learning models, incorporating best practices in model selection, feature engineering, and hyperparameter tuning. Evaluate model performance using appropriate metrics and iterate on the models to improve accuracy and efficiency (a minimal sketch follows this list).


5. Research and stay up to date: Keep abreast of the latest advancements in machine learning, natural language processing, and generative AI. Stay informed about industry trends, emerging techniques, and open-source libraries, and apply relevant findings to enhance the team's capabilities.


6. Collaborate and communicate effectively: Work closely with a multidisciplinary team of data scientists, software engineers, and domain experts to drive AI initiatives. Clearly communicate complex technical concepts and findings to both technical and non-technical stakeholders.


7. Experimentation and prototyping: Explore novel ideas, experiment with new algorithms, and prototype innovative solutions. Foster a culture of innovation and contribute to the continuous improvement of AI methodologies and practices within the organization.
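
To illustrate item 4, here is a minimal training-and-evaluation pipeline in scikit-learn. The dataset and model are stand-ins chosen for brevity, not the team's actual stack.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Feature scaling + model in one pipeline; grid search handles
# hyperparameter tuning with cross-validation.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Evaluate on held-out data with appropriate metrics.
print(classification_report(y_test, search.predict(X_test)))
```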


Requirements:


1. Education: Bachelor's or Master's degree in Computer Science, Data Science, or a related field. Relevant certifications in machine learning, deep learning, or AI are a plus.


2. Experience: At least 3 years of professional experience as a Machine Learning / AI Engineer, with a proven track record of developing and deploying machine learning models in real-world applications.


3. Strong programming skills: Proficiency in Python and experience with machine learning frameworks (e.g., TensorFlow, PyTorch) and libraries (e.g., scikit-learn, pandas). Experience with cloud platforms (e.g., AWS, Azure, GCP) for model deployment is preferred.


4. Deep-learning expertise: Strong understanding of deep learning architectures (e.g., convolutional neural networks, recurrent neural networks, transformers) and familiarity with Large Language Models (LLMs) such as GPT-3, GPT-4, or equivalent.


5. Natural Language Processing (NLP) knowledge: Familiarity with NLP techniques, including tokenization, word embeddings, named entity recognition, sentiment analysis, text classification, and language generation (a toy sketch follows this list).


6. Data manipulation and preprocessing skills: Proficiency in data manipulation using SQL and experience with data preprocessing techniques (e.g., cleaning, normalization, feature engineering). Familiarity with big data tools (e.g., Spark) is a plus.


7. Problem-solving and analytical thinking: Strong analytical and problem-solving abilities, with a keen eye for detail. Demonstrated experience in translating complex business requirements into practical machine learning solutions.


8. Communication and collaboration: Excellent verbal and written communication skills, with the ability to explain complex technical concepts to diverse stakeholders.
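
A toy illustration of item 5: TF-IDF tokenization feeding a sentiment classifier in scikit-learn. The texts and labels are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great product, loved it", "terrible, would not buy", "works fine"]
labels = [1, 0, 1]  # toy sentiment labels: 1 = positive, 0 = negative

# TfidfVectorizer handles tokenization and term weighting; the
# classifier learns sentiment from the resulting features.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["awful experience"]))
```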


Consulting and Services company
Hyderabad, Ahmedabad
5 - 10 yrs
₹5L - ₹30L / yr
Amazon Web Services (AWS)
Apache
Python
PySpark

Data Engineer 

  

Mandatory Requirements  

  • Experience in AWS Glue
  • Experience in Apache Parquet
  • Proficiency in AWS S3 and data lakes
  • Knowledge of Snowflake
  • Understanding of file-based ingestion best practices
  • Scripting languages: Python and PySpark

 

CORE RESPONSIBILITIES 

  • Create and manage cloud resources in AWS
  • Ingest data from sources that expose it through different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data from various proprietary systems; implement data ingestion and processing with Big Data technologies
  • Process and transform data using technologies such as Spark and cloud services; understand your part of the business logic and implement it in the language supported by the base data platform
  • Develop automated data quality checks to make sure the right data enters the platform and to verify the results of calculations (a minimal sketch follows this list)
  • Develop an infrastructure to collect, transform, combine, and publish/distribute customer data
  • Define process improvement opportunities to optimize data collection, insights, and displays
  • Ensure data and results are accessible, scalable, efficient, accurate, complete, and flexible
  • Identify and interpret trends and patterns in complex data sets
  • Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders
  • Participate in regular Scrum ceremonies with the agile teams
  • Develop queries, write reports, and present findings
  • Mentor junior members and bring best industry practices
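
A minimal sketch of the automated data quality checks mentioned above, in PySpark. The S3 path, key column, and rules are hypothetical examples, not a prescribed implementation.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.parquet("s3://my-bucket/landing/transactions/")  # hypothetical path

# Rule 1: the primary key must be present and unique.
null_ids = df.filter(F.col("transaction_id").isNull()).count()
dup_ids = df.count() - df.dropDuplicates(["transaction_id"]).count()

# Rule 2: amounts must be non-negative.
bad_amounts = df.filter(F.col("amount") < 0).count()

failures = {"null_ids": null_ids, "dup_ids": dup_ids, "bad_amounts": bad_amounts}
if any(v > 0 for v in failures.values()):
    # Fail the pipeline before bad data enters the platform.
    raise ValueError(f"Data quality checks failed: {failures}")
```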

 

QUALIFICATIONS 

  • 5-7+ years of experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional products, and insurance sales)
  • Strong background in math, statistics, computer science, data science, or a related discipline
  • Advanced knowledge of at least one language: Java, Scala, Python, or C#
  • Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
  • Proficiency with:
      • Data mining/programming tools (e.g. SAS, SQL, R, Python)
      • Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
      • Data visualization tools (e.g. Tableau, Looker, MicroStrategy)
  • Comfort learning and deploying new technologies and tools
  • Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines
  • Good written and oral communication skills and the ability to present results to non-technical audiences
  • Knowledge of business intelligence and analytical tools, technologies, and techniques

 

Familiarity and experience in the following is a plus:  

  • AWS certification 
  • Spark Streaming  
  • Kafka Streaming / Kafka Connect  
  • ELK Stack  
  • Cassandra / MongoDB  
  • CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools
Chennai
5 - 14 yrs
₹13L - ₹21L / yr
Python
Java
PySpark
JavaScript
Hadoop

Python + Data Scientist:

  • Hands-on and sound knowledge of Python, PySpark, and JavaScript
  • Build data-driven models to understand the characteristics of engineering systems
  • Train, tune, validate, and monitor predictive models
  • Sound knowledge of statistics
  • Experience developing data processing tasks using PySpark, such as reading, merging, enrichment, and loading of data from external systems to target data destinations (a minimal sketch follows this list)
  • Working knowledge of Big Data and/or Hadoop environments
  • Experience creating CI/CD pipelines using Jenkins or similar tools
  • Practiced in eXtreme Programming (XP) disciplines
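
A minimal PySpark sketch of the read / merge / enrich / load flow described above. The paths, columns, and segmentation rule are invented for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("enrich-load").getOrCreate()

# Read from two external systems (hypothetical paths).
orders = spark.read.json("s3://my-bucket/raw/orders/")
customers = spark.read.parquet("s3://my-bucket/raw/customers/")

# Merge and enrich: join on customer_id and derive a segment column.
enriched = (
    orders.join(customers, "customer_id", "left")
          .withColumn(
              "segment",
              F.when(F.col("lifetime_value") > 1000, "premium")
               .otherwise("standard"),
          )
)

# Load to the target destination, partitioned by order date.
enriched.write.mode("append").partitionBy("order_date").parquet(
    "s3://my-bucket/curated/orders_enriched/"
)
```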

Srijan Technologies
Posted by PriyaSaini
Remote only
3 - 8 yrs
₹5L - ₹12L / yr
Data Analytics
Data modeling
Python
PySpark
ETL
+3 more

Role Description:

  • You will be part of the data delivery team and will have the opportunity to develop a deep understanding of the domain/function.
  • You will design and drive the work plan for the optimization, automation, and standardization of processes, incorporating best practices to achieve efficiency gains.
  • You will run data engineering pipelines, link raw client data with the data model, conduct data assessments, perform data quality checks, and transform data using ETL tools.
  • You will perform data transformation, modeling, and validation activities, and configure applications to the client context. You will also develop scripts to validate, transform, and load raw data using programming languages such as Python and/or PySpark (a minimal sketch follows this list).
  • In this role, you will determine database structural requirements by analyzing client operations, applications, and programming.
  • You will develop cross-site relationships to enhance idea generation and manage stakeholders.
  • Lastly, you will collaborate with the team to support ongoing business processes by delivering high-quality end products on time and performing quality checks wherever required.
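
A minimal validate-transform-load script in Python (pandas), of the kind mentioned in the role description. The file names and schema are hypothetical.

```python
import pandas as pd

# Read raw client data (hypothetical file and schema).
raw = pd.read_csv("client_extract.csv")

# Validate: required columns present, no null keys.
required = {"account_id", "balance", "as_of_date"}
missing = required - set(raw.columns)
assert not missing, f"missing columns: {missing}"
assert raw["account_id"].notna().all(), "null account_id found"

# Transform: normalize types to match the data model.
raw["as_of_date"] = pd.to_datetime(raw["as_of_date"])
raw["balance"] = raw["balance"].astype(float)

# Load: write the conformed output for the downstream pipeline.
raw.to_parquet("conformed/accounts.parquet", index=False)
```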

Job Requirement:

  • Bachelor’s degree in Engineering or Computer Science; Master’s degree is a plus
  • 3+ years of professional work experience with a reputed analytics firm
  • Expertise in handling large amounts of data through Python or PySpark
  • Ability to conduct data assessments, perform data quality checks, and transform data using SQL and ETL tools
  • Experience deploying ETL / data pipelines and workflows on cloud technologies and architectures such as Azure and Amazon Web Services will be valued
  • Comfort with data modeling principles (e.g. database structure, entity relationships, UIDs, etc.) and software development principles (e.g. modularization, testing, refactoring, etc.)
  • A thoughtful and comfortable communicator (verbal and written) with the ability to facilitate discussions and conduct training
  • Strong problem-solving, requirement-gathering, and leadership skills
  • Track record of completing projects successfully on time, within budget, and as per scope

Fragma Data Systems
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore)
3.5 - 8 yrs
₹5L - ₹18L / yr
PySpark
Data engineering
Data Warehouse (DWH)
SQL
Spark
+1 more
Must-Have Skills:
  • Good experience in PySpark, including DataFrame core functions and Spark SQL
  • Good experience in SQL databases; able to write queries of fair complexity
  • Excellent experience in Big Data programming for data transformation and aggregations
  • Good at ELT architecture: business-rules processing and data extraction from the data lake into data streams for business consumption (a minimal sketch follows this list)
  • Good customer communication skills
  • Good analytical skills
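
A minimal ELT sketch: a business rule applied with Spark SQL over data already sitting in the lake. The ADLS paths and the rule itself are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("elt-rules").getOrCreate()

# ELT: the data is already in the lake; transformations run in place
# with Spark SQL and the results feed a consumption layer.
spark.read.parquet(
    "abfss://lake@account.dfs.core.windows.net/raw/trades/"  # hypothetical
).createOrReplaceTempView("trades")

eligible = spark.sql("""
    SELECT trade_id, client_id, notional
    FROM trades
    WHERE status = 'SETTLED' AND notional > 10000   -- example business rule
""")

eligible.write.mode("overwrite").parquet(
    "abfss://lake@account.dfs.core.windows.net/curated/eligible_trades/"
)
```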
 
 
Technology Skills (Good to Have):
  • Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse / Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hub / IoT Hub
  • Experience migrating on-premise data warehouses to data platforms on the Azure cloud
  • Designing and implementing data engineering, ingestion, and transformation functions
  • Azure Synapse or Azure SQL Data Warehouse
  • Spark on Azure, as available in HDInsight and Databricks
 
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/StreamSets and Informatica
  • Experience with pre-sales activities (responding to RFPs, executing quick POCs)
  • Capacity planning and performance tuning on Azure Stack and Spark
Discite Analytics Private Limited
Posted by Uma Sravya B
Ahmedabad
4 - 7 yrs
₹12L - ₹20L / yr
Hadoop
Big Data
Data engineering
Spark
Apache Beam
+13 more
Responsibilities:
1. Communicate with the clients and understand their business requirements.
2. Build, train, and manage your own team of junior data engineers.
3. Assemble large, complex data sets that meet the client’s business requirements.
4. Identify, design and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources, including the cloud.
6. Assist clients with data-related technical issues and support their data infrastructure requirements.
7. Work with data scientists and analytics experts to strive for greater functionality.

Skills required (experience with at least most of these):
1. Experience with Big Data tools: Hadoop, Spark, Apache Beam, Kafka, etc.
2. Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
3. Experience in ETL and Data Warehousing.
4. Experience with, and a firm understanding of, relational and non-relational databases like MySQL, MS SQL Server, Postgres, MongoDB, Cassandra, etc.
5. Experience with cloud platforms like AWS, GCP, and Azure.
6. Experience with workflow management using tools like Apache Airflow (a minimal sketch follows this list).
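
A minimal Apache Airflow DAG for workflow management, as referenced in item 6. The DAG id and task bodies are placeholders for a hypothetical pipeline.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # placeholder: pull data from a source system

def load():
    ...  # placeholder: write data to the warehouse

with DAG(
    dag_id="daily_etl",            # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_load               # run extract before load
```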
Networking & Cybersecurity Solutions
Bengaluru (Bangalore)
4 - 8 yrs
₹40L - ₹60L / yr
Data Science
Data Scientist
R Programming
Python
Amazon Web Services (AWS)
+2 more
  • Research and develop statistical learning models for data analysis
  • Collaborate with product management and engineering departments to understand company needs and devise possible solutions
  • Keep up to date with the latest technology trends
  • Communicate results and ideas to key decision-makers
  • Implement new statistical or other mathematical methodologies as needed for specific models or analyses
  • Optimize joint development efforts through appropriate database use and project design

Qualifications/Requirements:

  • Master's or PhD in Computer Science, Electrical Engineering, Statistics, Applied Math, or an equivalent field, with a strong mathematical background
  • Excellent understanding of machine learning techniques and algorithms, including clustering, anomaly detection, optimization, neural networks, etc.
  • 3+ years of experience building data-science-driven solutions, including data collection, feature selection, model training, and post-deployment validation (a toy anomaly-detection sketch follows this list)
  • Strong hands-on coding skills (preferably in Python) for processing large-scale data sets and developing machine learning models
  • Familiarity with one or more machine learning or statistical modeling tools such as NumPy, scikit-learn, MLlib, and TensorFlow
  • Good team worker with excellent written, verbal, and presentation communication skills
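
A toy anomaly-detection example with scikit-learn's IsolationForest, illustrating the kind of model the role calls for. The data is synthetic.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(500, 4))   # baseline traffic features
outliers = rng.normal(6, 1, size=(5, 4))   # injected anomalies
X = np.vstack([normal, outliers])

# IsolationForest flags points that are easy to isolate; predict()
# returns -1 for anomalies and 1 for inliers.
model = IsolationForest(contamination=0.01, random_state=0).fit(X)
labels = model.predict(X)
print("anomalies found:", int((labels == -1).sum()))
```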

Desired Experience:

  • Experience with AWS, S3, Flink, Spark, Kafka, and Elasticsearch
  • Knowledge and experience with NLP technology
  • Previous work in a start-up environment
YourHRfolks
Posted by Bharat Saxena
Remote, Jaipur, NCR (Delhi | Gurgaon | Noida), Chennai, Bangarmau
5 - 10 yrs
₹15L - ₹30L / yr
Big Data
Hadoop
Spark
Apache Kafka
Amazon Web Services (AWS)
+2 more

Position: Big Data Engineer

What You'll Do

Punchh is seeking to hire a Big Data Engineer at either a senior or tech lead level. Reporting to the Director of Big Data, he/she will play a critical role in leading Punchh’s big data innovations. By leveraging prior industry experience in big data, he/she will help create cutting-edge data and analytics products for Punchh’s business partners.

This role requires close collaboration with the data, engineering, and product organizations. His/her job functions include:

  • Work with large data sets and implement sophisticated data pipelines with both structured and unstructured data.
  • Collaborate with stakeholders to design scalable solutions.
  • Manage and optimize our internal data pipeline that supports marketing, customer success, and data science, to name a few.
  • Act as a technical leader for Punchh’s big data platform, which supports AI and BI products.
  • Work with the infra and operations teams to monitor and optimize existing infrastructure.
  • Occasional business travel is required.

What You'll Need

  • 5+ years of experience as a Big Data engineering professional, developing scalable big data solutions.
  • Advanced degree in computer science, engineering, or another related field.
  • Demonstrated strength in data modeling, data warehousing, and SQL.
  • Extensive knowledge of cloud technologies, e.g. AWS and Azure.
  • Excellent software engineering background. High familiarity with the software development life cycle. Familiarity with GitHub/Airflow.
  • Advanced knowledge of big data technologies, such as programming languages (Python, Java), relational databases (Postgres, MySQL), NoSQL (MongoDB), Hadoop (EMR), and streaming (Kafka, Spark); a minimal Kafka-plus-Spark sketch follows this list.
  • Strong problem-solving skills with demonstrated rigor in building and maintaining complex data pipelines.
  • Exceptional communication skills and the ability to articulate complex concepts with thoughtful, actionable recommendations.
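
A minimal Spark Structured Streaming sketch tying together the Kafka and Spark items above. The broker, topic, and S3 paths are hypothetical, and the spark-sql-kafka package must be on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

# Read a Kafka topic as an unbounded stream (hypothetical broker/topic).
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "events")
         .load()
         .select(F.col("value").cast("string").alias("payload"))
)

# Continuously land the stream as Parquet; the checkpoint directory
# gives exactly-once file output across restarts.
query = (
    events.writeStream.format("parquet")
          .option("path", "s3://my-bucket/streams/events/")
          .option("checkpointLocation", "s3://my-bucket/checkpoints/events/")
          .start()
)
query.awaitTermination()
```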
NSEIT
Posted by Vishal Pednekar
Remote only
7 - 12 yrs
₹20L - ₹40L / yr
Data engineering
Big Data
Data Engineer
Amazon Web Services (AWS)
NOSQL Databases
+1 more
  • Design AWS data ingestion frameworks and pipelines based on the specific needs driven by the Product Owners and user stories…
  • Experience building data lakes using AWS, and hands-on experience with S3, EKS, ECS, AWS Glue, AWS KMS, AWS Firehose, and EMR
  • Experience in Apache Spark programming with Databricks
  • Experience working with NoSQL databases such as Cassandra, HBase, and Elasticsearch
  • Hands-on experience leveraging CI/CD to rapidly build and test application code
  • Expertise in data governance and data quality
  • Experience working with PCI data and with data scientists is a plus
  • 4+ years of experience with the following Big Data concerns: file formats (Parquet, Avro, ORC), resource management, distributed processing, and RDBMS (a short file-format sketch follows this list)
  • 5+ years of experience designing and developing data pipelines for data ingestion or transformation using AWS technologies
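
A short sketch of reading the listed file formats with Spark. Parquet and ORC support ships with Spark; Avro needs the external spark-avro package (e.g. --packages org.apache.spark:spark-avro_2.12:&lt;version&gt;). All paths are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("formats").getOrCreate()

# Built-in columnar formats: Parquet and ORC prune columns at read time.
parquet_df = spark.read.parquet("s3://my-bucket/data/parquet/")
orc_df = spark.read.orc("s3://my-bucket/data/orc/")

# Avro is row-oriented and needs the spark-avro package on the classpath.
avro_df = spark.read.format("avro").load("s3://my-bucket/data/avro/")

# Format choice trades off scan speed, compression, and schema evolution.
parquet_df.printSchema()
```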
MNC
Agency job via Fragma Data Systems, posted by Priyanka U
Remote, Bengaluru (Bangalore)
2 - 6 yrs
₹6L - ₹15L / yr
Spark
Apache Kafka
PySpark
Internet of Things (IoT)
Real-time media streaming

JD for IoT Data Engineer:

 

The role requires experience in Azure core technologies: IoT Hub / Event Hub, Stream Analytics, IoT Central, Azure Data Lake Storage, Azure Cosmos DB, Azure Data Factory, Azure SQL Database, Azure HDInsight / Databricks, and SQL Data Warehouse.

 

You Have:

  • Minimum 2 years of software development experience
  • Minimum 2 years of experience in IoT/streaming data pipeline solution development
  • Bachelor's and/or Master’s degree in computer science
  • Strong consulting skills in data management, including data governance, data quality, security, data integration, processing, and provisioning
  • Delivered data management projects with real-time/near-real-time data insights delivery on the Azure cloud
  • Translated complex analytical requirements into technical designs, including data models, ETLs, and dashboards/reports
  • Experience deploying dashboards and self-service analytics solutions on both relational and non-relational databases
  • Experience with different computing paradigms in databases, such as in-memory, distributed, and massively parallel processing
  • Successfully delivered large-scale IoT data management initiatives covering the plan, design, build, and deploy phases, leveraging different delivery methodologies including Agile
  • Experience handling telemetry data with Spark Streaming, Kafka, Flink, Scala, PySpark, and Spark SQL (a minimal MQTT ingestion sketch follows this list)
  • Hands-on experience with containers and Docker
  • Exposure to streaming protocols like MQTT and AMQP
  • Knowledge of OT network protocols like OPC UA, CAN bus, and similar protocols
  • Strong knowledge of continuous integration, static code analysis, and test-driven development
  • Experience delivering projects in a highly collaborative delivery model with teams onsite and offshore
  • Excellent analytical and problem-solving skills
  • Delivered change management initiatives focused on driving data platform adoption across the enterprise
  • Strong verbal and written communication skills are a must, as well as the ability to work effectively across internal and external organizations
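
A minimal telemetry-ingestion sketch using the paho-mqtt 1.x client API; the broker host and topic pattern are hypothetical, and a real pipeline would hand messages to a stream processor rather than print them.

```python
import paho.mqtt.client as mqtt

# Handle each telemetry message as it arrives from the broker.
def on_message(client, userdata, msg):
    print(msg.topic, msg.payload.decode("utf-8"))

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.example.com", 1883)   # hypothetical broker
client.subscribe("devices/+/telemetry")      # '+' matches any device id
client.loop_forever()
```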
     

Roles & Responsibilities
 

You Will:

  • Translate functional requirements into technical designs
  • Interact with clients and internal stakeholders to understand the data and platform requirements in detail and determine the core Azure services needed to fulfill the technical design
  • Design, develop, and deliver data integration interfaces in ADF and Azure Databricks
  • Design, develop, and deliver data provisioning interfaces to fulfill consumption needs
  • Deliver data models on the Azure platform; it could be on Azure Cosmos DB, SQL DW / Synapse, or SQL
  • Advise clients on ML engineering and deploying MLOps at scale on AKS
  • Automate core activities to minimize delivery lead times and improve overall quality
  • Optimize platform cost by selecting the right platform services and architecting the solution in a cost-effective manner
  • Deploy Azure DevOps and CI/CD processes
  • Deploy logging and monitoring across the different integration points for critical alerts

 
