
50+ PySpark Jobs in India

Apply to 50+ PySpark Jobs on CutShort.io. Find your next job, effortlessly. Browse PySpark Jobs and apply today!

mazosol
kirthick murali
Posted by kirthick murali
Mumbai
10 - 20 yrs
₹30L - ₹58L / yr
Python
R Programming
PySpark
Google Cloud Platform (GCP)
SQL Azure

Data Scientist – Program Embedded 

Job Description:   

We are seeking a highly skilled and motivated senior data scientist to support a big data program. The successful candidate will play a pivotal role in supporting multiple projects in this program, covering traditional tasks such as revenue management, demand forecasting, and improving customer experience, as well as testing and adopting new tools and platforms such as Copilot and Fabric for different purposes. The expected candidate will have deep expertise in machine learning methodology and applications, and should have completed multiple large-scale data science projects (full cycle, from ideation to BAU). Beyond technical expertise, problem solving in complex set-ups will be key to success in this role. This is a data science role embedded directly into the program and its projects, so stakeholder management and collaboration with partners are crucial to success (on top of the deep expertise).

What we are looking for: 

  1. Highly proficient in Python/PySpark/R. 
  2. Understands MLOps concepts, with working experience in product industrialization (from a data science point of view): building products for live deployment, with continuous development and continuous integration. 
  3. Familiar with cloud platforms such as Azure and GCP and the data management systems on those platforms. Familiar with Databricks and product deployment on Databricks. 
  4. Experience in ML projects involving techniques such as regression, time series, clustering, classification, dimension reduction, and anomaly detection, using both traditional ML and DL approaches. 
  5. Solid background in statistics: probability distributions, A/B testing validation, univariate/multivariate analysis, hypothesis testing for different purposes, data augmentation, etc. 
  6. Familiar with designing testing frameworks for different modelling practices/projects based on business needs. 
  7. Exposure to GenAI tools, with enthusiasm for experimenting and new ideas about what can be done. 
  8. Having improved an internal company process using an AI tool is a plus (e.g. process simplification, manual task automation, automated emails). 
  9. Ideally 10+ years of experience, including independent business-facing roles. 
  10. Experience as a data scientist in CPG or retail is nice to have, but not the top priority, especially for candidates who have navigated multiple industries. 
  11. Being proactive and collaborative is essential. 

 

Some project examples within the program: 

  1. Test new tools/platforms such as Copilot and Fabric for commercial reporting: testing, validation, and building trust. 
  2. Building algorithms for predicting trends in category and consumption to support dashboards. 
  3. Revenue growth management: create/understand the algorithms behind the tools we need to maintain or choose to improve (which may be built by third parties). Able to prioritize and build a product roadmap. Able to design new solutions and articulate/quantify the limitations of those solutions. 
  4. Demand forecasting: create localized forecasts to improve in-store availability, with proper model monitoring for early detection of potential issues in the forecast, focusing particularly on improving the end-user experience. 
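To make the demand-forecasting project concrete: a localized, per-store forecast can be sketched at its very simplest as a per-store moving-average baseline. This is only an illustrative stand-in, not the program's actual model, and the store identifiers and data shape are assumptions.

```python
from collections import defaultdict

def moving_average_forecast(sales, window=3):
    """Per-store baseline: forecast the next period as the mean of the
    last `window` observed sales values for that store."""
    history = defaultdict(list)
    for store, qty in sales:              # sales: ordered list of (store, qty)
        history[store].append(qty)
    forecast = {}
    for store, series in history.items():
        tail = series[-window:]           # most recent `window` periods
        forecast[store] = sum(tail) / len(tail)
    return forecast

# Example: two hypothetical stores with short sales histories
sales = [("S1", 10), ("S2", 5), ("S1", 12), ("S2", 7), ("S1", 14)]
print(moving_average_forecast(sales))     # S1 -> mean(10, 12, 14) = 12.0
```

A real implementation would replace the moving average with a proper time-series model and add the monitoring described above; the baseline mainly serves as the yardstick such monitoring compares against.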


Read more
DataGrokr

at DataGrokr

4 candid answers
5 recruiters
Reshika Mendiratta
Posted by Reshika Mendiratta
Bengaluru (Bangalore)
5yrs+
Up to ₹30L / yr (varies)
Data engineering
Python
SQL
ETL
Data Warehouse (DWH)
+12 more

Lightning Job By Cutshort⚡

 

As part of this feature, you can expect status updates about your application and replies within 72 hours (once the screening questions are answered).


About DataGrokr

DataGrokr (https://www.datagrokr.com) is a cloud native technology consulting organization providing the next generation of big data analytics, cloud and enterprise solutions. We solve complex technology problems for our global clients who rely on us for our deep technical knowledge and delivery excellence.

If you are unafraid of technology, believe in your learning ability and are looking to work amongst smart, driven colleagues whom you can look up to and learn from, you might want to check us out. 


About the role

We are looking for a Senior Data Engineer to join our growing engineering team. As a member of the team,

• You will work on enterprise data platforms, architect and implement data lakes both on-prem and in the cloud.

• You will be responsible for evolving technical architecture, design and implementation of data solutions using a variety of big data technologies. You will work extensively on all major public cloud platforms - AWS, Azure and GCP.

• You will work with senior technical architects on our client side to evolve an effective technology architecture and development strategy to implement the solution.

• You will work with extremely talented peers and follow modern engineering practices using agile methodologies.

• You will coach, mentor and lead other engineers and provide guidance to ensure the quality of and consistency of the solution.


Must-have skills and attitudes:

• Passion for data engineering, in-depth knowledge of some of the following technologies – SQL (expert level), Python (expert level), Spark (intermediate level), Big data stack of one of AWS/GCP.

• Hands on experience in data wrangling, data munging and ETL. Should be able to source data from anywhere and transform data to any shape using SQL, Python or Spark.

• Hands on experience working with variable data structures like XML/JSON/AVRO etc

• Ability to create data models and architect data warehouse components

• Experience with Version control (GIT/BIT BUCKET etc)

• Strong understanding of Agile, experience with CI/CD pipelines and processes

• Ability to communicate with technical as well as non-technical audience

• Collaborating with various stakeholders

• Have led scrum teams, participated in Sprint grooming and planning sessions, work / effort sizing and estimation
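The data-wrangling bullets above (sourcing data from anywhere, handling variable structures like XML/JSON/AVRO) often come down to flattening nested records into tabular rows. A minimal stdlib sketch, with illustrative field names:

```python
import json

def flatten(record, prefix=""):
    """Flatten nested dicts into a single level using dotted keys,
    so the result can be loaded into a tabular store or DataFrame."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat

raw = '{"id": 1, "user": {"name": "a", "geo": {"city": "Pune"}}}'
print(flatten(json.loads(raw)))
# {'id': 1, 'user.name': 'a', 'user.geo.city': 'Pune'}
```

In Spark the same shape change is usually expressed with nested-column selection or `explode`; the point here is only the transform, not the engine.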


Desired Skills & Experience:

• At least 5 years of industry experience

• Working knowledge of any of the following - AWS Big Data Stack (S3, Redshift, Athena, Glue, etc.), GCP Big Data Stack (Cloud Storage, Workflow, Dataflow, Cloud Functions, Big Query, Pub Sub, etc.).

• Working knowledge of traditional enterprise data warehouse architectures and migrating them to the Cloud.

• Experience with Data Visualization tool (Tableau / Power BI etc)

• Experience with JIRA / Azure DevOps etc


How will DataGrokr support you in your growth:

• You will be groomed and mentored by senior leaders to take on leadership positions in the company

• You will be actively encouraged to attain certifications, lead technical workshops and conduct meetups to grow your own technology acumen and personal brand

• You will work in an open culture that promotes commitment over compliance, individual responsibility over rules and bringing out the best in everyone.

Read more
one-to-one, one-to-many, and many-to-many
Chennai
5 - 10 yrs
₹1L - ₹15L / yr
AWS CloudFormation
Python
PySpark
AWS Lambda

5-7 years of experience in Data Engineering, with solid experience in the design, development and implementation of end-to-end data ingestion and data processing systems on the AWS platform.

2-3 years of experience in AWS Glue, Lambda, Appflow, EventBridge, Python, PySpark, Lake House, S3, Redshift, Postgres, API Gateway, CloudFormation, Kinesis, Athena, KMS, IAM.

Experience in modern data architecture, Lake House, Enterprise Data Lake, Data Warehouse, API interfaces, solution patterns, standards and optimizing data ingestion.

Experience in build of data pipelines from source systems like SAP Concur, Veeva Vault, Azure Cost, various social media platforms or similar source systems.

Expertise in analyzing source data and designing a robust and scalable data ingestion framework and pipelines adhering to client Enterprise Data Architecture guidelines.

Proficient in design and development of solutions for real-time (or near real time) stream data processing as well as batch processing on the AWS platform.

Work closely with business analysts, data architects, data engineers, and data analysts to ensure that the data ingestion solutions meet the needs of the business.

Troubleshoot and provide support for issues related to data quality and data ingestion solutions. This may involve debugging data pipeline processes, optimizing queries, or troubleshooting application performance issues.

Experience in working in Agile/Scrum methodologies, CI/CD tools and practices, coding standards, code reviews, source management (GITHUB), JIRA, JIRA Xray and Confluence.

Experience or exposure to design and development using Full Stack tools.

Strong analytical and problem-solving skills, excellent communication (written and oral), and interpersonal skills.

Bachelor's or master's degree in computer science or related field.

 

 

Read more
Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

As a Senior Associate in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration and wrangling, computation and analytics pipelines, and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.


Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 5+ years of IT experience, with 3+ years in data-related technologies

2. Minimum 2.5 years of experience in Big Data technologies and working exposure to related data services on at least one cloud platform (AWS / Azure / GCP)

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines

4. Strong experience in at least one of the programming languages Java, Scala, or Python (Java preferred)

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.

6. Well-versed, working knowledge of data-platform-related services on at least one cloud platform, IAM, and data security


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Cloud data specialty and other related Big data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


Read more
Publicis Sapient

at Publicis Sapient

10 recruiters
Mohit Singh
Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As Senior Associate L1 in Data Engineering, you will do technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration and wrangling, computation and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is preferable.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 3.5+ years of IT experience, with 1.5+ years in data-related technologies

2. Minimum 1.5 years of experience in Big Data technologies

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

4. Strong experience in at least one of the programming languages Java, Scala, or Python (Java preferred)

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – infra provisioning on cloud, automated build & deployment pipelines, code quality

6. Working knowledge of data-platform-related services on at least one cloud platform, IAM, and data security

7.Cloud data specialty and other related Big data technology certifications



Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Read more
Arting Digital
Pragati Bhardwaj
Posted by Pragati Bhardwaj
Bengaluru (Bangalore)
10 - 16 yrs
₹10L - ₹15L / yr
databricks
Data modeling
SQL
Python
AWS Lambda
+2 more

Title:- Lead Data Engineer 


Experience: 10+ years

Budget: 32-36 LPA

Location: Bangalore 

Mode of Work: Work from office

Primary Skills: Databricks, Spark, PySpark, SQL, Python, AWS

Qualification: Any Engineering degree


Roles and Responsibilities:


• 8-10+ years' experience in developing scalable Big Data applications or solutions on distributed platforms.

• Able to partner with others in solving complex problems by taking a broad perspective to identify innovative solutions.

• Strong skills building positive relationships across Product and Engineering.

• Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.

• Able to quickly pick up new programming languages, technologies, and frameworks.

• Experience working in Agile and Scrum development processes.

• Experience working in a fast-paced, results-oriented environment.

• Experience in Amazon Web Services (AWS), mainly S3, Managed Airflow, EMR/EC2, IAM, etc.

• Experience working with data warehousing tools, including SQL databases, Presto, and Snowflake.

• Experience architecting data products in streaming, serverless and microservices architectures and platforms.

• Experience working with data platforms, including EMR, Airflow, and Databricks (Data Engineering & Delta Lake components, and Lakehouse Medallion architecture), etc.

• Experience creating/configuring Jenkins pipelines for a smooth CI/CD process for managed Spark jobs, building Docker images, etc.

• Experience working with distributed technology tools, including Spark, Python, Scala.

• Working knowledge of data warehousing, data modelling, governance and data architecture.

• Working knowledge of reporting & analytical tools such as Tableau, QuickSight, etc.

• Demonstrated experience in learning new technologies and skills.

• Bachelor's degree in Computer Science, Information Systems, Business, or another relevant subject area.

Read more
Quinnox

at Quinnox

2 recruiters
MidhunKumar T
Posted by MidhunKumar T
Bengaluru (Bangalore), Mumbai
10 - 15 yrs
₹30L - ₹35L / yr
ADF
azure data lake services
SQL Azure
azure synapse
Spark
+4 more

Mandatory Skills: Azure Data Lake Storage, Azure SQL databases, Azure Synapse, Databricks (PySpark/Spark), Python, SQL, Azure Data Factory.


Good to have: Power BI, Azure IaaS services, Azure DevOps, Microsoft Fabric


• Very strong understanding of ETL and ELT

• Very strong understanding of Lakehouse architecture

• Very strong knowledge of PySpark and Spark architecture

• Good knowledge of Azure Data Lake architecture and access controls

• Good knowledge of Microsoft Fabric architecture

• Good knowledge of Azure SQL databases

• Good knowledge of T-SQL

• Good knowledge of the CI/CD process using Azure DevOps

• Power BI

Read more
Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
"DATA STREAMING"

Data Engineering : Senior Engineer / Manager


As a Senior Engineer/Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.


Must Have skills :


1. GCP


2. Spark streaming : Live data streaming experience is desired.


3. Any one coding language: Java / Python / Scala



Skills & Experience :


- Overall experience of minimum 5+ years, with at least 4 years of relevant experience in Big Data technologies

- Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

- Strong experience in at least one of the programming languages Java, Scala, or Python (Java preferred)

- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.

- Well-versed, working knowledge of data-platform-related services on GCP

- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position


Your Impact :


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation
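The live-data-streaming requirement above boils down to windowed aggregation over an unbounded event stream. A minimal plain-Python sketch of tumbling windows (the real job would use Spark Structured Streaming's `groupBy` + `window()`; the event shape here is an assumption):

```python
from collections import Counter

def tumbling_window_counts(events, window_sec=60):
    """Count events per key in fixed-width (tumbling) windows.
    `events` are (epoch_seconds, key) pairs; a Spark Structured
    Streaming job expresses the same aggregation declaratively."""
    windows = {}
    for ts, key in events:
        start = (ts // window_sec) * window_sec   # window the event falls in
        windows.setdefault(start, Counter())[key] += 1
    return windows

events = [(0, "click"), (30, "click"), (65, "view"), (70, "click")]
print(tumbling_window_counts(events))
# {0: Counter({'click': 2}), 60: Counter({'click': 1, 'view': 1})}
```

The sketch ignores late-arriving data; in production that is what watermarks in Spark Streaming (or Flink) are for.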

Read more
Wallero technologies
Hyderabad
8 - 20 yrs
₹15L - ₹35L / yr
PySpark

Please find the job specifications below,

 

Position: Data Engineer

Location: Hyderabad, Telangana, India

Job Type: Permanent (full-time)


Company Description:


We are a Seattle-based product engineering, software development and technology services firm with offices in the U.S., Canada, Bulgaria, and India (Manjeera Trinity Corporate, JNTU-Hitech City Road, beside LULU Mall, Hyderabad). Wallero is a Microsoft Gold Partner company. Please find a detailed overview at About Wallero: https://wallero.com/aboutus/ and Wallero Culture: https://wallero.com/careers/


Job Description:


  • Tech stack: Python, PySpark, Databricks.
  • Strong knowledge of the supply chain domain.
  • Technical expert in the field, with the ability to think out of the box.
  • Excellent communicator.
  • Works autonomously with minimal instruction from JNJ.
  • Should be able to guide the team on best practices (reusable, modularized code, design patterns, and so on).


If you believe you have the skills and experience necessary for this role and are excited about contributing to our team, we would love to hear from you.


Thank you,

 

Manu Nakka

Lead Technical Recruiter

Read more
Career Forge

at Career Forge

2 candid answers
Mohammad Faiz
Posted by Mohammad Faiz
Delhi, Gurugram, Noida, Ghaziabad, Faridabad
5 - 7 yrs
₹12L - ₹15L / yr
Python
Apache Spark
PySpark
Data engineering
ETL
+10 more

🚀 Exciting Opportunity: Data Engineer Position in Gurugram 🌐


Hello 


We are actively seeking a talented and experienced Data Engineer to join our dynamic team at Reality Motivational Venture in Gurugram (Gurgaon). If you're passionate about data, thrive in a collaborative environment, and possess the skills we're looking for, we want to hear from you!


Position: Data Engineer  

Location: Gurugram (Gurgaon)  

Experience: 5+ years 


Key Skills:

- Python

- Spark, Pyspark

- Data Governance

- Cloud (AWS/Azure/GCP)


Main Responsibilities:

- Define and set up analytics environments for "Big Data" applications in collaboration with domain experts.

- Implement ETL processes for telemetry-based and stationary test data.

- Support in defining data governance, including data lifecycle management.

- Develop large-scale data processing engines and real-time search and analytics based on time series data.

- Ensure technical, methodological, and quality aspects.

- Support CI/CD processes.

- Foster know-how development and transfer, continuous improvement of leading technologies within Data Engineering.

- Collaborate with solution architects on the development of complex on-premise, hybrid, and cloud solution architectures.
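The telemetry ETL responsibility above typically starts by resampling irregular sensor readings into fixed time buckets. A stdlib sketch of that step (in practice Pandas `resample` or Spark would do this; the reading shape is an assumption):

```python
from collections import defaultdict

def resample_mean(readings, bucket_sec=60):
    """Average irregular (timestamp, value) telemetry readings into
    fixed-width time buckets, analogous to a Pandas resample-then-mean."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[(ts // bucket_sec) * bucket_sec].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

# Readings at 5 s, 20 s, and 70 s fall into the 0 s and 60 s buckets
readings = [(5, 1.0), (20, 3.0), (70, 10.0)]
print(resample_mean(readings))  # {0: 2.0, 60: 10.0}
```

Downstream steps (gap filling, outlier handling, feature extraction) then operate on the regular grid this produces.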


Qualification Requirements:

- BSc, MSc, MEng, or PhD in Computer Science, Informatics/Telematics, Mathematics/Statistics, or a comparable engineering degree.

- Proficiency in Python and the PyData stack (Pandas/NumPy).

- Experience in high-level programming languages (C#/C++/Java).

- Familiarity with scalable processing environments like Dask (or Spark).

- Proficient in Linux and scripting languages (Bash scripts).

- Experience in containerization and orchestration of containerized services (Kubernetes).

- Education in database technologies (SQL/OLAP and NoSQL).

- Interest in Big Data storage technologies (Elastic, ClickHouse).

- Familiarity with Cloud technologies (Azure, AWS, GCP).

- Fluent English communication skills (speaking and writing).

- Ability to work constructively with a global team.

- Willingness to travel for business trips during development projects.


Preferable:

- Working knowledge of vehicle architectures, communication, and components.

- Experience in additional programming languages (C#/C++/Java, R, Scala, MATLAB).

- Experience in time-series processing.


How to Apply:

Interested candidates, please share your updated CV/resume with me.


Thank you for considering this exciting opportunity.

Read more
A fast growing Big Data company
Noida, Bengaluru (Bangalore), Chennai, Hyderabad
6 - 8 yrs
₹10L - ₹15L / yr
AWS Glue
SQL
Python
PySpark
Data engineering
+6 more

AWS Glue Developer 

Work Experience: 6 to 8 Years

Work Location:  Noida, Bangalore, Chennai & Hyderabad

Must-Have Skills: AWS Glue, DMS, SQL, Python, PySpark, data integrations and DataOps

Job Reference ID:BT/F21/IND


Job Description:

Design, build and configure applications to meet business process and application requirements.


Responsibilities:

7 years of work experience with ETL, data modelling, and data architecture. Proficient in ETL optimization, designing, coding, and tuning big data processes using PySpark. Extensive experience building data platforms on AWS using core AWS services (Step Functions, EMR, Lambda, Glue, Athena, Redshift, Postgres, RDS, etc.) and designing/developing data engineering solutions. Orchestration using Airflow.


Technical Experience:

Hands-on experience developing a data platform and its components: data lake, cloud data warehouse, APIs, and batch and streaming data pipelines. Experience building data pipelines and applications to stream and process large datasets at low latency.


➢ Enhancements, new development, defect resolution and production support of Big data ETL development using AWS native services.

➢ Create data pipeline architecture by designing and implementing data ingestion solutions.

➢ Integrate data sets using AWS services such as Glue, Lambda functions/ Airflow.

➢ Design and optimize data models on AWS Cloud using AWS data stores such as Redshift, RDS, S3, Athena.

➢ Author ETL processes using Python, Pyspark.

➢ Build Redshift Spectrum direct transformations and data modelling using data in S3.

➢ ETL process monitoring using CloudWatch events.

➢ You will be working in collaboration with other teams; good communication is a must.

➢ Must have experience using AWS service APIs, the AWS CLI, and SDKs
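The "Author ETL processes using Python, Pyspark" responsibility follows a read → transform → write shape. Below, just the transform stage is sketched in plain Python so it stands alone; in an actual Glue job this logic would sit between a Glue Catalog read and an S3/Redshift write as DynamicFrame/DataFrame operations, and the column names here are assumptions:

```python
def clean_orders(rows):
    """Typical Glue-job transform stage, in plain Python:
    drop duplicate order ids, cast amounts, and skip bad rows."""
    seen, out = set(), []
    for row in rows:
        if row.get("order_id") in seen or row.get("amount") is None:
            continue                       # dedupe + basic quality filter
        seen.add(row["order_id"])
        out.append({"order_id": row["order_id"],
                    "amount": float(row["amount"])})
    return out

rows = [{"order_id": 1, "amount": "9.5"},
        {"order_id": 1, "amount": "9.5"},      # duplicate, dropped
        {"order_id": 2, "amount": None}]       # bad row, dropped
print(clean_orders(rows))  # [{'order_id': 1, 'amount': 9.5}]
```

Keeping the transform pure like this makes it unit-testable outside the Glue runtime, which is a common practice for the CI/CD expectations listed above.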


Professional Attributes:

➢ Experience operating very large data warehouses or data lakes. Expert-level skills in writing and optimizing SQL. Extensive real-world experience designing technology components for enterprise solutions and defining solution architectures and reference architectures with a focus on cloud technology.

➢ Must have 6+ years of big data ETL experience using Python, S3, Lambda, Dynamo DB, Athena, Glue in AWS environment.

➢ Expertise in S3, RDS, Redshift, Kinesis, EC2 clusters highly desired.


Qualification:

➢ Degree in Computer Science, Computer Engineering or equivalent.


Salary: Commensurate with experience and demonstrated competence

Read more
dataeaze systems

at dataeaze systems

1 recruiter
Ankita Kale
Posted by Ankita Kale
Remote only
5 - 8 yrs
₹12L - ₹22L / yr
Amazon Web Services (AWS)
Python
PySpark
ETL

POST - SENIOR DATA ENGINEER WITH AWS


Experience : 5 years


Must-have:

• Highly skilled in Python and PySpark

• Expertise in writing Glue job ETL scripts on AWS

• Experience in working with Kafka

• Extensive SQL DB experience – Postgres

Good-to-have:

• Experience in working with data analytics and modelling

• Hands on Experience of PowerBI visualization tool

• Knowledge of and hands-on experience with a version control system (Git)

Common:

• Excellent communication and presentation skills (written and verbal) at all levels of an organization

• Should be results-oriented, with the ability to prioritize and drive multiple initiatives to complete work on time

• Proven ability to influence a diverse, geographically dispersed group of individuals to facilitate, moderate, and influence productive design and implementation discussions driving towards results


Shifts - Flexible (might have to work as per US shift timings for meetings).

Employment Type - Any

Read more
hopscotch
Bengaluru (Bangalore)
5 - 8 yrs
₹6L - ₹15L / yr
Python
Amazon Redshift
Amazon Web Services (AWS)
PySpark
Data engineering
+3 more

About the role:

 Hopscotch is looking for a passionate Data Engineer to join our team. You will work closely with other teams like data analytics, marketing, data science and individual product teams to specify, validate, prototype, scale, and deploy data pipelines features and data architecture.


Here’s what will be expected out of you:

➢ Ability to work with a fast-paced startup mindset. Should be able to manage all aspects of data extraction, transfer, and load activities.

➢ Develop data pipelines that make data available across platforms.

➢ Should be comfortable in executing ETL (Extract, Transform and Load) processes which include data ingestion, data cleaning and curation into a data warehouse, database, or data platform.

➢ Work on various aspects of the AI/ML ecosystem – data modeling, data and ML pipelines.

➢ Work closely with Devops and senior Architect to come up with scalable system and model architectures for enabling real-time and batch services.


What we want:

➢ 5+ years of experience as a data engineer or data scientist with a focus on data engineering and ETL jobs.

➢ Well-versed in the concepts of data warehousing, data modelling, and/or data analysis.

➢ Experience using and building pipelines and performing ETL with industry-standard best practices on Redshift (more than 2 years).

➢ Ability to troubleshoot and solve performance issues with data ingestion, data processing & query execution on Redshift.

➢ Good understanding of orchestration tools like Airflow.

 ➢ Strong Python and SQL coding skills.

➢ Strong experience in distributed systems like Spark.

➢ Experience with AWS data and ML technologies (AWS Glue, MWAA, Data Pipeline, EMR, Athena, Redshift, Lambda, etc.).

➢ Solid hands-on experience with various data extraction techniques (CDC or time/batch-based) and the related tools (Debezium, AWS DMS, Kafka Connect, etc.) for near-real-time and batch data extraction.
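The CDC requirement above reduces to applying an ordered stream of insert/update/delete change events onto a snapshot keyed by primary key. A minimal stdlib sketch; the event shape is an assumption, loosely modelled on the record images tools like Debezium or AWS DMS emit:

```python
def apply_cdc(snapshot, events):
    """Apply ordered CDC events ({'op', 'key', 'row'}) to a dict snapshot,
    the way a merge step downstream of Debezium/DMS upserts into a
    warehouse table keyed on the primary key."""
    for ev in events:
        if ev["op"] in ("insert", "update"):
            snapshot[ev["key"]] = ev["row"]    # upsert the latest row image
        elif ev["op"] == "delete":
            snapshot.pop(ev["key"], None)
    return snapshot

events = [{"op": "insert", "key": 1, "row": {"status": "new"}},
          {"op": "update", "key": 1, "row": {"status": "paid"}},
          {"op": "delete", "key": 2}]
print(apply_cdc({2: {"status": "old"}}, events))  # {1: {'status': 'paid'}}
```

On Redshift the same merge is usually a staged `MERGE`/delete-insert; the ordering guarantee is what makes either version correct.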


Note :

Experience at product-based or e-commerce companies is an added advantage

Staffbee Solutions INC
Remote only
6 - 10 yrs
₹1L - ₹1.5L / yr
Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
+11 more

Looking for freelance?

We are seeking a freelance Data Engineer with 7+ years of experience

 

Skills Required: Deep knowledge of any cloud (AWS, Azure, Google Cloud), Databricks, data lakes, data warehousing, Python/Scala, SQL, BI, and other analytics systems

 

What we are looking for

We are seeking an experienced Senior Data Engineer with experience in architecture, design, and development of highly scalable data integration and data engineering processes

 

  • The Senior Consultant must have a strong understanding and experience with data & analytics solution architecture, including data warehousing, data lakes, ETL/ELT workload patterns, and related BI & analytics systems
  • Strong in scripting languages like Python, Scala
  • 5+ years of hands-on experience with one or more of these data integration/ETL tools.
  • Experience building on-prem data warehousing solutions.
  • Experience with designing and developing ETLs, Data Marts, Star Schema
  • Designing a data warehouse solution using Synapse or Azure SQL DB
  • Experience building pipelines using Synapse or Azure Data Factory to ingest data from various sources
  • Understanding of integration run times available in Azure.
  • Advanced working SQL knowledge and experience with relational databases, including query authoring (SQL), as well as working familiarity with a variety of databases
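For illustration, the Data Marts / Star Schema point above comes down to fact rows carrying surrogate keys into dimension tables. A minimal in-memory sketch (table names and contents are invented):

```python
# Tiny star-schema join: a fact table references dimensions by surrogate key.

dim_product = {10: {"product": "shoes"}, 11: {"product": "socks"}}
dim_date = {7: {"month": "2024-01"}}

fact_sales = [
    {"product_key": 10, "date_key": 7, "amount": 120.0},
    {"product_key": 11, "date_key": 7, "amount": 30.0},
]

def denormalize(facts):
    """Join each fact row to its dimensions (an in-memory star join)."""
    return [
        {**dim_product[f["product_key"]], **dim_date[f["date_key"]],
         "amount": f["amount"]}
        for f in facts
    ]

rows = denormalize(fact_sales)
print(rows[0])  # {'product': 'shoes', 'month': '2024-01', 'amount': 120.0}
```

In a warehouse the same join is a SQL statement over fact and dimension tables; the surrogate-key indirection is the part the schema design buys you.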


Mitibase
Vaidehi Ghangurde
Posted by Vaidehi Ghangurde
Pune
2 - 4 yrs
₹6L - ₹8L / yr
Vue.js
AngularJS (1.x)
React.js
Angular (2+)
Javascript
+6 more

·      The Objective:

You will play a crucial role in designing, implementing, and maintaining our data infrastructure, running tests, and updating the systems


·      Job function and requirements

 

o  Expert in Python, Pandas, and NumPy, with knowledge of Python web frameworks such as Django and Flask.

o  Able to integrate multiple data sources and databases into one system.

o  Basic understanding of frontend technologies like HTML, CSS, JavaScript.

o  Able to build data pipelines.

o  Strong unit test and debugging skills.

o  Understanding of fundamental design principles behind a scalable application

o  Good understanding of RDBMS databases such as MySQL or PostgreSQL.

o  Able to analyze and transform raw data.
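As a small, hypothetical example of the "analyze and transform raw data" point, here is a standard-library sketch of the kind of cleanup step that Pandas would normally handle; the data and rules are invented:

```python
import csv
import io

# Parse raw CSV text, drop malformed rows, and normalize types.

raw = """user,age,city
alice,34,Pune
bob,,Mumbai
carol,28,Pune
"""

def clean_rows(text):
    """Return well-formed rows with `age` coerced to int; skip rows missing it."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if not row["age"]:          # drop rows missing required fields
            continue
        row["age"] = int(row["age"])
        rows.append(row)
    return rows

cleaned = clean_rows(raw)
print(len(cleaned))  # 2
```

With Pandas the same step would be a `read_csv` followed by `dropna` and `astype`; the logic is what matters, not the library.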

 

·      About us

Mitibase helps companies find warm prospects every month that are most relevant, and then helps their team to act on those with automation. We do so by automatically tracking key accounts and contacts for job changes and relationships triggers and surfaces them as warm leads in your sales pipeline.

A Product Based Client,Chennai
Chennai
4 - 8 yrs
₹10L - ₹15L / yr
Data Warehouse (DWH)
Informatica
ETL
Spark
PySpark
+2 more

Analytics Job Description

We are hiring an Analytics Engineer to help drive our Business Intelligence efforts. You will partner closely with leaders across the organization, working together to understand the how and why of people, team, and company challenges, workflows, and culture. The team is responsible for delivering data and insights that drive decision-making, execution, and investments for our product initiatives.

You will work cross-functionally with product, marketing, sales, engineering, finance, and our customer-facing teams, enabling them with data and narratives about the customer journey. You’ll also work closely with other data teams, such as data engineering and product analytics, to ensure we are creating a strong data culture at Blend that enables our cross-functional partners to be more data-informed.


Role: Data Engineer

Please find below the JD for the Data Engineer role.

Location: Guindy, Chennai

How you’ll contribute:

• Develop objectives and metrics, ensure priorities are data-driven, and balance short-term and long-term goals

• Develop deep analytical insights to inform and influence product roadmaps and business decisions and help improve the consumer experience

• Work closely with GTM and supporting operations teams to author and develop core data sets that empower analyses

• Deeply understand the business and proactively spot risks and opportunities

• Develop dashboards and define metrics that drive key business decisions

• Build and maintain scalable ETL pipelines via solutions such as Fivetran, Hightouch, and Workato

• Design our Analytics and Business Intelligence architecture, assessing and implementing new technologies that fit

• Work with our engineering teams to continually make our data pipelines and tooling more resilient


Who you are:

• Bachelor’s degree or equivalent required from an accredited institution with a quantitative focus such as Economics, Operations Research, Statistics, or Computer Science, OR 1-3 years of experience as a Data Analyst, Data Engineer, or Data Scientist

• Must have strong SQL and data modeling skills, with experience applying skills to thoughtfully create data models in a warehouse environment.

• A proven track record of using analysis to drive key decisions and influence change

• Strong storyteller with the ability to communicate effectively with managers and executives

• Demonstrated ability to define metrics for product areas, understand the right questions to ask, push back on stakeholders in the face of ambiguous, complex problems, and work with diverse teams with different goals

• A passion for documentation.

• A solution-oriented growth mindset. You’ll need to be a self-starter and thrive in a dynamic environment.

• A bias towards communication and collaboration with business and technical stakeholders.

• Quantitative rigor and systems thinking.

• Prior startup experience is preferred, but not required.

• Interest or experience in machine learning techniques (such as clustering, decision trees, and segmentation)

• Familiarity with a scientific computing language, such as Python, for data wrangling and statistical analysis

• Experience with a SQL-focused data transformation framework such as dbt

• Experience with a Business Intelligence tool such as Mode or Tableau


Mandatory Skillset:

- Very strong in SQL

- Spark OR PySpark OR Python

- Shell scripting
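To make the SQL-plus-Python skillset concrete, here is a hedged sketch of a dbt-style staging transformation expressed in plain SQL via Python's built-in sqlite3. The table and column names are invented for illustration, not taken from the job description.

```python
import sqlite3

# A raw events table is reshaped into a tidy staging model with plain SQL,
# the same pattern a dbt model or a warehouse transformation would follow.

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE raw_events (user_id INT, event TEXT, amount REAL);
    INSERT INTO raw_events VALUES (1,'purchase',50),(1,'purchase',30),(2,'view',0);
""")
con.execute("""
    CREATE TABLE stg_purchases AS
    SELECT user_id, SUM(amount) AS total_spend
    FROM raw_events
    WHERE event = 'purchase'
    GROUP BY user_id
""")
print(con.execute("SELECT * FROM stg_purchases").fetchall())  # [(1, 80.0)]
```

In dbt the `CREATE TABLE ... AS SELECT` would live in a model file and the framework would manage materialization and dependencies.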


Epik Solutions
Sakshi Sarraf
Posted by Sakshi Sarraf
Bengaluru (Bangalore), Noida
4 - 13 yrs
₹7L - ₹18L / yr
Python
SQL
databricks
Scala
Spark
+2 more

Job Description:


As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:


Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.


Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.


Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.


Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.


Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.


Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.


Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.


Skills and Qualifications:


Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.

Proficiency in designing and developing data pipelines and ETL processes.

Solid understanding of data modeling concepts and database design principles.

Familiarity with data integration and orchestration using Azure Data Factory.

Knowledge of data quality management and data governance practices.

Experience with performance tuning and optimization of data pipelines.

Strong problem-solving and troubleshooting skills related to data engineering.

Excellent collaboration and communication skills to work effectively in cross-functional teams.

Understanding of cloud computing principles and experience with Azure services.
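As an illustration of the data quality checks and validation rules described above, here is a minimal rule-based validator. The rules and field names are assumptions for the sketch, not from the job description.

```python
# Split incoming rows into valid and invalid sets before loading downstream.

RULES = {
    "id":     lambda v: v is not None,
    "amount": lambda v: isinstance(v, (int, float)) and v >= 0,
}

def validate(rows):
    """Return (good, bad) partitions of `rows` according to RULES."""
    good, bad = [], []
    for row in rows:
        ok = all(check(row.get(field)) for field, check in RULES.items())
        (good if ok else bad).append(row)
    return good, bad

good, bad = validate([{"id": 1, "amount": 9.5}, {"id": None, "amount": -1}])
print(len(good), len(bad))  # 1 1
```

In an Azure pipeline the same checks would typically run as a Databricks/PySpark step, with the bad rows routed to a quarantine table for review.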


RandomTrees

at RandomTrees

1 recruiter
Amareswarreddt yaddula
Posted by Amareswarreddt yaddula
Hyderabad
5 - 16 yrs
₹1L - ₹30L / yr
ETL
Informatica
Data Warehouse (DWH)
Amazon Web Services (AWS)
SQL
+3 more

We are #hiring an AWS Data Engineer expert to join our team


Job Title: AWS Data Engineer

Experience: 5 to 10 years

Location: Remote

Notice: Immediate or Max 20 Days

Role: Permanent Role


Skillset: AWS, ETL, SQL, Python, PySpark, Postgres DB, Dremio.


Job Description:

 Able to develop ETL jobs.

Able to help with data curation/cleanup, data transformation, and building ETL pipelines.

Strong Postgres DB experience; knowledge of Dremio as a data virtualization/semantic layer between the DB and the application is a plus.

SQL, Python, and PySpark are a must.

Good communication skills.





Kloud9 Technologies
Bengaluru (Bangalore)
3 - 6 yrs
₹5L - ₹20L / yr
Amazon Web Services (AWS)
Amazon EMR
EMR
Spark
PySpark
+9 more

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.


What we are looking for:

● 3+ years’ experience developing Data & Analytic solutions

● Experience building data lake solutions leveraging one or more of the following: AWS EMR, S3, Hive, and Spark

● Experience with relational SQL

● Experience with scripting languages such as Shell, Python

● Experience with source control tools such as GitHub and related dev process

● Experience with workflow scheduling tools such as Airflow

● In-depth knowledge of scalable cloud

● Has a passion for data solutions

● Strong understanding of data structures and algorithms

● Strong understanding of solution and technical design

● Has a strong problem-solving and analytical mindset

● Experience working with Agile Teams.

● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders

● Able to quickly pick up new programming languages, technologies, and frameworks

● Bachelor’s Degree in computer science
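The Airflow point above rests on one core idea: a workflow scheduler runs tasks in dependency order. A minimal sketch of that idea with the standard library's topological sorter; the task names are invented.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on, like an Airflow DAG.
deps = {
    "transform": {"extract"},
    "load":      {"transform"},
    "report":    {"load"},
}

# static_order() yields tasks so every dependency runs before its dependents.
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['extract', 'transform', 'load', 'report']
```

Airflow adds scheduling, retries, and operators on top, but the DAG-ordering core is exactly this.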


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.

Kloud9 Technologies
Bengaluru (Bangalore)
4 - 7 yrs
₹10L - ₹30L / yr
Google Cloud Platform (GCP)
PySpark
Python
Scala



●    Overall 8+ Years of Experience in Web Application development.

●    5+ Years of development experience with Java 8, Spring Boot, Microservices and middleware

●    3+ Years of Designing Middleware using Node JS platform.

●    Good to have: 2+ years of experience using NodeJS along with the AWS Serverless platform.

●    Good Experience with Javascript / TypeScript, Event Loops, ExpressJS, GraphQL, SQL DB (MySQLDB), NoSQL DB(MongoDB) and YAML templates.

●    Good Experience with TDD Driven Development and Automated Unit Testing.

●    Good Experience with exposing and consuming Rest APIs in Java 8, Springboot platform and Swagger API contracts.

●    Good Experience in building NodeJS middleware performing Transformations, Routing, Aggregation, Orchestration and Authentication(JWT/OAUTH).

●    Experience supporting and working with cross-functional teams in a dynamic environment.

●    Experience working in Agile Scrum Methodology.

●    Very good Problem-Solving Skills.

●    Very good learner and passion for technology.

●     Excellent verbal and written communication skills in English

●     Ability to communicate effectively with team members and business stakeholders


Secondary Skill Requirements:

 

● Experience working with any of Loopback, NestJS, Hapi.JS, Sails.JS, Passport.JS



TensorGo Software Private Limited
Deepika Agarwal
Posted by Deepika Agarwal
Remote only
5 - 8 yrs
₹5L - ₹15L / yr
Python
PySpark
apache airflow
Spark
Hadoop
+4 more

Requirements:

● Understanding our data sets and how to bring them together.

● Working with our engineering team to support custom solutions offered to the product development.

● Filling the gap between development, engineering and data ops.

● Creating, maintaining and documenting scripts to support ongoing custom solutions.

● Excellent organizational skills, including attention to precise details

● Strong multitasking skills and ability to work in a fast-paced environment

● 5+ years experience with Python to develop scripts.

● Know your way around RESTful APIs (able to integrate; not necessary to publish).

● You are familiar with pulling and pushing files from SFTP and AWS S3.

● Experience with any Cloud solutions including GCP / AWS / OCI / Azure.

● Familiarity with SQL programming to query and transform data from relational Databases.

● Familiarity to work with Linux (and Linux work environment).

● Excellent written and verbal communication skills

● Extracting, transforming, and loading data into internal databases and Hadoop

● Optimizing our new and existing data pipelines for speed and reliability

● Deploying product build and product improvements

● Documenting and managing multiple repositories of code

● Experience with SQL and NoSQL databases (Cassandra, MySQL)

● Hands-on experience in data pipelining and ETL. (Any of these frameworks/tools: Hadoop, BigQuery,

RedShift, Athena)

● Hands-on experience in Airflow

● Understanding of best practices and common coding patterns around storing, partitioning, warehousing, and indexing of data

● Experience in reading the data from Kafka topic (both live stream and offline)

● Experience in PySpark and DataFrames
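As one concrete example of the storing/partitioning best practices listed above, date-partitioned layouts let a reader prune to only the partitions it needs instead of scanning everything. A hedged sketch; the paths are illustrative.

```python
from datetime import date, timedelta

# Data lands under dt=YYYY-MM-DD prefixes; a reader lists only the
# prefixes covering its date range (partition pruning).

def partitions_between(prefix, start, end):
    """List the partition prefixes covering [start, end] inclusive."""
    out, d = [], start
    while d <= end:
        out.append(f"{prefix}/dt={d.isoformat()}")
        d += timedelta(days=1)
    return out

paths = partitions_between("s3://lake/events", date(2024, 1, 1), date(2024, 1, 3))
print(paths[-1])  # s3://lake/events/dt=2024-01-03
```

Spark, Hive, and Athena all exploit the same layout automatically when the partition column appears in a query's filter.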

Responsibilities:


● Collaborating across an agile team to continuously design, iterate, and develop big data systems.

● Extracting, transforming, and loading data into internal databases.

● Optimizing our new and existing data pipelines for speed and reliability.

● Deploying new products and product improvements.

● Documenting and managing multiple repositories of code.

Mactores Cognition Private Limited
Remote only
5 - 15 yrs
₹5L - ₹21L / yr
ETL
Informatica
Data Warehouse (DWH)
Amazon Web Services (AWS)
Amazon S3
+3 more

Mactores is a trusted leader among businesses in providing modern data platform solutions. Since 2008, Mactores have been enabling businesses to accelerate their value through automation by providing End-to-End Data Solutions that are automated, agile, and secure. We collaborate with customers to strategize, navigate, and accelerate an ideal path forward with a digital transformation via assessments, migration, or modernization.


We are looking for a DataOps Engineer with expertise in operating a data lake. The data lake is built on Amazon S3 and Amazon EMR, with Apache Airflow used for workflow management.


You have experience building and running data lake platforms on AWS, exposure to operating PySpark-based ETL jobs in Apache Airflow and Amazon EMR, and expertise in monitoring services like Amazon CloudWatch.


If you love to solve problems using your skills, then come join the Team Mactores. We have a casual and fun office environment that actively steers clear of rigid "corporate" culture, focuses on productivity and creativity, and allows you to be part of a world-class team while still being yourself.


What you will do?


  • Operate the current data lake deployed on AWS with Amazon S3, Amazon EMR, and Apache Airflow
  • Debug and fix production issues in PySpark.
  • Determine the RCA (Root cause analysis) for production issues.
  • Collaborate with product teams for L3/L4 production issues in PySpark.
  • Contribute to enhancing the ETL efficiency
  • Build CloudWatch dashboards for optimizing the operational efficiencies
  • Handle escalation tickets from L1 Monitoring engineers
  • Assign the tickets to L1 engineers based on their expertise


What are we looking for?


  • AWS DataOps engineer.
  • Overall 5+ years of experience in the software industry, with experience developing data applications using Python or Scala, Airflow, and Kafka on the AWS data platform.
  • Must have set up or led a project to enable DataOps on AWS or any other cloud data platform.
  • Strong data engineering experience on a cloud platform, preferably AWS.
  • Experience with data pipelines designed for reuse and parameterization.
  • Experience with pipelines designed to solve common ETL problems.
  • Understanding of, or experience with, how various AWS services (such as Amazon EMR and Apache Airflow) can be codified to enable DataOps.
  • Experience in building data pipelines using CI/CD infrastructure.
  • Understanding of infrastructure as code for DataOps enablement.
  • Ability to work with ambiguity and create quick PoCs.


You will be preferred if


  • Expertise in Amazon EMR, Apache Airflow, Terraform, CloudWatch
  • Exposure to MLOps using Amazon Sagemaker is a plus.
  • AWS Solutions Architect Professional or Associate Level Certificate
  • AWS DevOps Professional Certificate


Life at Mactores


We care about creating a culture that makes a real difference in the lives of every Mactorian. Our 10 Core Leadership Principles that honor Decision-making, Leadership, Collaboration, and Curiosity drive how we work.


1. Be one step ahead

2. Deliver the best

3. Be bold

4. Pay attention to the detail

5. Enjoy the challenge

6. Be curious and take action

7. Take leadership

8. Own it

9. Deliver value

10. Be collaborative


We would like you to read more details about the work culture on https://mactores.com/careers 


The Path to Joining the Mactores Team

At Mactores, our recruitment process is structured around three distinct stages:


Pre-Employment Assessment: 

You will be invited to participate in a series of pre-employment evaluations to assess your technical proficiency and suitability for the role.


Managerial Interview: The hiring manager will engage with you in multiple discussions, lasting anywhere from 30 minutes to an hour, to assess your technical skills, hands-on experience, leadership potential, and communication abilities.


HR Discussion: During this 30-minute session, you'll have the opportunity to discuss the offer and next steps with a member of the HR team.


At Mactores, we are committed to providing equal opportunities in all of our employment practices, and we do not discriminate based on race, religion, gender, national origin, age, disability, marital status, military status, genetic information, or any other category protected by federal, state, and local laws. This policy extends to all aspects of the employment relationship, including recruitment, compensation, promotions, transfers, disciplinary action, layoff, training, and social and recreational programs. All employment decisions will be made in compliance with these principles.


Mactores Cognition Private Limited
Remote only
2 - 15 yrs
₹6L - ₹40L / yr
Amazon Web Services (AWS)
PySpark
athena
Data engineering

As AWS Data Engineer, you are a full-stack data engineer that loves solving business problems. You work with business leads, analysts, and data scientists to understand the business domain and engage with fellow engineers to build data products that empower better decision-making. You are passionate about the data quality of our business metrics and the flexibility of your solution that scales to respond to broader business questions. 


If you love to solve problems using your skills, then come join the Team Mactores. We have a casual and fun office environment that actively steers clear of rigid "corporate" culture, focuses on productivity and creativity, and allows you to be part of a world-class team while still being yourself.

What you will do?

  • Write efficient code in PySpark and AWS Glue
  • Write SQL queries in Amazon Athena and Amazon Redshift
  • Explore new technologies and learn new techniques to solve business problems creatively
  • Collaborate with many teams - engineering and business, to build better data products and services 
  • Deliver the projects along with the team collaboratively and manage updates to customers on time


What are we looking for?

  • 1 to 3 years of experience in Apache Spark, PySpark, Amazon Glue
  • 2+ years of experience in writing ETL jobs using PySpark and Spark SQL
  • 2+ years of experience in SQL queries and stored procedures
  • Have a deep understanding of the DataFrame API and the transformation functions supported by Spark 2.7+
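For illustration, the DataFrame API point above boils down to chains of transformations such as filter and groupBy/aggregate. Here is a tiny pure-Python analogue of what a PySpark job would express; the column names are invented.

```python
from collections import defaultdict

# Equivalent in spirit to:
#   df.filter(col("sales") >= min_sale).groupBy("country").sum("sales")

rows = [
    {"country": "IN", "sales": 100},
    {"country": "IN", "sales": 250},
    {"country": "US", "sales": 80},
]

def total_sales_by_country(data, min_sale=0):
    """Filter rows below min_sale, then aggregate sales per country."""
    totals = defaultdict(int)
    for r in data:
        if r["sales"] >= min_sale:
            totals[r["country"]] += r["sales"]
    return dict(totals)

print(total_sales_by_country(rows))  # {'IN': 350, 'US': 80}
```

The PySpark version runs the same logic distributed across partitions; knowing which transformations shuffle data (groupBy does, filter does not) is the practical part of the skill.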


You will be preferred if you have

  • Prior experience in working on AWS EMR, Apache Airflow
  • Certifications AWS Certified Big Data – Specialty OR Cloudera Certified Big Data Engineer OR Hortonworks Certified Big Data Engineer
  • Understanding of DataOps Engineering



Ascendeum

at Ascendeum

3 recruiters
Swezelle Esteves
Posted by Swezelle Esteves
Remote only
1 - 5 yrs
₹8L - ₹10L / yr
Python
Data Analytics
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
+4 more

Job Responsibilities: 

 

  • Identify valuable data sources and automate collection processes 
  • Undertake preprocessing of structured and unstructured data. 
  • Analyze large amounts of information to discover trends and patterns 
  • Helping develop reports and analysis. 
  • Present information using data visualization techniques. 
  • Assessing tests and implementing new or upgraded software and assisting with strategic decisions on new systems. 
  • Evaluating changes and updates to source production systems. 
  • Develop, implement, and maintain leading-edge analytic systems, taking complicated problems and building simple frameworks 
  • Providing technical expertise in data storage structures, data mining, and data cleansing. 
  • Propose solutions and strategies to business challenges 

 

Desired Skills and Experience: 

 

  • At least 1 year of experience in Data Analysis 
  • Complete understanding of Operations Research, Data Modelling, ML, and AI concepts. 
  • Knowledge of Python is mandatory, familiarity with MySQL, SQL, Scala, Java or C++ is an asset 
  • Experience using visualization tools (e.g. Jupyter Notebook) and data frameworks (e.g. Hadoop) 
  • Analytical mind and business acumen 
  • Strong math skills (e.g. statistics, algebra) 
  • Problem-solving aptitude 
  • Excellent communication and presentation skills. 
  • Bachelor’s / Master's Degree in Computer Science, Engineering, Data Science or other quantitative or relevant field is preferred  
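As a small example of the "discover trends and patterns" responsibility, one of the simplest anomaly checks flags points far from the mean via a z-score. A sketch with the standard library; the threshold and data are illustrative choices.

```python
import statistics

def anomalies(values, z_thresh=2.0):
    """Return values whose z-score exceeds the threshold."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:                      # all values identical: nothing to flag
        return []
    return [v for v in values if abs(v - mean) / sd > z_thresh]

data = [10, 11, 9, 10, 12, 10, 45]
print(anomalies(data))  # [45]
```

Real analyses would use more robust methods (median/MAD, seasonal decomposition), but the z-score is the usual first pass.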
Pune
5 - 9 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more
This role is for a developer with strong core application or system programming skills in Scala and Java, and good exposure to concepts and/or technology across the broader spectrum. Enterprise Risk Technology covers a variety of existing systems and green-field projects.

Full-stack Hadoop development experience with Scala.

Full-stack Java development experience covering Core Java (including JDK 1.8) and a good understanding of design patterns.
Requirements:-
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies like Spark, Scala and experience on Avro.
• Participation in product feature design and documentation
• Requirement break-up, ownership and implementation.
• Product BAU deliveries and Level 3 production defects fixes.
Qualifications & Experience
• Degree holder in numerate subject
• Hands on Experience on Hadoop, Spark, Scala, Impala, Avro and messaging like Kafka
• Experience across a core compiled language – Java
• Proficiency in Java-related frameworks like Spring, Hibernate, JPA
• Hands-on experience in JDK 1.8 and a strong skillset covering Collections and Multithreading, with experience working on distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement
• Good exposure to computational concepts
• Good communication and interpersonal skills
• Working knowledge of risk and derivatives pricing (optional)
• Proficiency in SQL (PL/SQL), data modelling.
• Understanding of Hadoop architecture and the Scala programming language is good to have.
Chennai, Hyderabad
5 - 10 yrs
₹10L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Big data with cloud:

 

Experience : 5-10 years

 

Location : Hyderabad/Chennai

 

Notice period : 15-20 days Max

 

1.  Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight

2.  Experience in developing lambda functions with AWS Lambda

3.  Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark

4.  Should be able to code in Python and Scala.

5.  Snowflake experience will be a plus
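As an illustration of point 2, an AWS Lambda handler is just a function receiving an event dict. The sketch below mimics the shape of an S3 put notification; the bucket and key values are invented.

```python
import json

def handler(event, context=None):
    """Collect the S3 object URIs referenced in an S3-style event."""
    records = []
    for rec in event.get("Records", []):
        s3 = rec["s3"]
        records.append(f"s3://{s3['bucket']['name']}/{s3['object']['key']}")
    return {"statusCode": 200, "body": json.dumps(records)}

event = {"Records": [{"s3": {"bucket": {"name": "lake"},
                             "object": {"key": "raw/2024/01/03/events.json"}}}]}
print(handler(event)["statusCode"])  # 200
```

In a pipeline like the Glue -> Athena -> QuickSight one above, a handler of this shape typically triggers the next stage (e.g. starting a Glue job) when new data lands in S3.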

Hyderabad
5 - 15 yrs
₹4L - ₹14L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more
Big Data Engineer:-


-Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> Quick sight.

-Experience in developing lambda functions with AWS Lambda.

-Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark.

-Should be able to code in Python and Scala.

-Snowflake experience will be a plus.
Read more
Hyderabad
4 - 8 yrs
₹5L - ₹14L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more
Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight
Experience in developing lambda functions with AWS Lambda
Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
Should be able to code in Python and Scala.
Snowflake experience will be a plus
Read more
Hyderabad
4 - 8 yrs
₹6L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more
  1. Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight
  2. Experience in developing lambda functions with AWS Lambda
  3. Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
  4. Should be able to code in Python and Scala.
  5. Snowflake experience will be a plus

 

Read more
Hyderabad
3 - 7 yrs
₹1L - ₹15L / yr
Big Data
Spark
Hadoop
PySpark
Amazon Web Services (AWS)
+3 more

Big data Developer

Exp: 3yrs to 7 yrs.
Job Location: Hyderabad
Notice: Immediate / within 30 days

1. Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight
2. Experience in developing lambda functions with AWS Lambda
3. Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
4. Should be able to code in Python and Scala.
5. Snowflake experience will be a plus

Hadoop and Hive can be treated as good to have – a working understanding is enough rather than a hard requirement.

Read more
Tata Digital Pvt Ltd
Agency job
via Seven N Half by Priya Singh
Bengaluru (Bangalore)
8 - 13 yrs
₹10L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

 

              Data Engineer

 

-          High Skilled and proficient on Azure Data Engineering Tech stacks (ADF, Databricks)

-          Should be well experienced in the design and development of Big Data integration platforms (Kafka, Hadoop).

-          Highly skilled and experienced in building medium to complex data integration pipelines for Data at Rest and streaming data using Spark.

-          Strong knowledge in R/Python.

-          Advanced proficiency in solution design and implementation through Azure Data Lake, SQL and NoSQL Databases.

-          Strong in Data Warehousing concepts

-          Expertise in SQL, SQL tuning, Data Management (Data Security), schema design, Python and ETL processes

-          Highly Motivated, Self-Starter and quick learner

-          Must have good knowledge of Data modelling and understanding of Data analytics

-          Exposure to Statistical procedures, Experiments and Machine Learning techniques is an added advantage.

-          Experience in leading a small team of 6-7 Data Engineers.

-          Excellent written and verbal communication skills

 

Read more
codersbrain

at codersbrain

1 recruiter
Tanuj Uppal
Posted by Tanuj Uppal
Delhi
4 - 8 yrs
₹2L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more
  • Mandatory - Hands on experience in Python and PySpark.

 

  • Build PySpark applications using Spark DataFrames in Python, using Jupyter Notebook and PyCharm (IDE).

 

  • Experience optimizing Spark jobs that process huge volumes of data.

 

  • Hands on experience in version control tools like Git.

 

  • Experience with Amazon’s analytics services such as Amazon EMR, Lambda functions, etc.

 

  • Experience with Amazon’s compute services such as AWS Lambda and Amazon EC2, storage services such as S3, and a few other services such as SNS.

 

  • Experience/knowledge of bash/shell scripting will be a plus.

 

  • Experience in working with fixed-width, delimited and multi-record file formats, etc.

 

  • Hands on experience in tools like Jenkins to build, test and deploy the applications

 

  • Awareness of DevOps concepts and ability to work in an automated release pipeline environment.

 

  • Excellent debugging skills.
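The fixed-width file handling mentioned above can be sketched in a few lines; the field layout and sample record below are invented for illustration, and a real layout would come from the file's spec.

```python
# Field layout for one fixed-width record: (name, start, end) offsets.
LAYOUT = [("id", 0, 5), ("name", 5, 15), ("amount", 15, 23)]

def parse_fixed_width(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict, trimming pad spaces."""
    return {name: line[start:end].strip() for name, start, end in layout}

record = parse_fixed_width("00042Jane Doe  00019.99")
print(record)  # {'id': '00042', 'name': 'Jane Doe', 'amount': '00019.99'}
```

The same slicing logic carries over to PySpark, where each offset would typically become a `substring` expression over a raw text column.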
Read more
Bengaluru (Bangalore), Gurugram
2 - 8 yrs
₹10L - ₹35L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
Python
+11 more
Greetings!!

We are looking for a Machine Learning engineer for one of our premium clients.
Experience: 2-9 years
Location: Gurgaon/Bangalore
Tech Stack:

Python, PySpark, the Python Scientific Stack; MLFlow, Grafana, Prometheus for machine learning pipeline management and monitoring; SQL, Airflow, Databricks, our own open-source data pipelining framework called Kedro, Dask/RAPIDS; Django, GraphQL and ReactJS for horizontal product development; container technologies such as Docker and Kubernetes, CircleCI/Jenkins for CI/CD, cloud solutions such as AWS, GCP, and Azure as well as Terraform and Cloudformation for deployment
Read more
Aureus Tech Systems

at Aureus Tech Systems

3 recruiters
Naveen Yelleti
Posted by Naveen Yelleti
Kolkata, Hyderabad, Chennai, Bengaluru (Bangalore), Bhubaneswar, Visakhapatnam, Vijayawada, Trichur, Thiruvananthapuram, Mysore, Delhi, Noida, Gurugram, Nagpur
1 - 7 yrs
₹4L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Skills and requirements

  • Experience analyzing complex and varied data in a commercial or academic setting.
  • Desire to solve new and complex problems every day.
  • Excellent ability to communicate scientific results to both technical and non-technical team members.


Desirable

  • A degree in a numerically focused discipline such as Maths, Physics, Chemistry, Engineering or Biological Sciences.
  • Hands on experience on Python, Pyspark, SQL
  • Hands on experience on building End to End Data Pipelines.
  • Hands on experience on Azure Data Factory, Azure Databricks, Data Lake - added advantage
  • Hands on Experience in building data pipelines.
  • Experience with Bigdata Tools, Hadoop, Hive, Sqoop, Spark, SparkSQL
  • Experience with SQL or NoSQL databases for the purposes of data retrieval and management.
  • Experience in data warehousing and business intelligence tools, techniques and technology, as well as experience in diving deep on data analysis or technical issues to come up with effective solutions.
  • BS degree in math, statistics, computer science or equivalent technical field.
  • Experience in data mining structured and unstructured data (SQL, ETL, data warehouse, Machine Learning etc.) in a business environment with large-scale, complex data sets.
  • Proven ability to look at solutions in unconventional ways. Sees opportunities to innovate and can lead the way.
  • Willing to learn and work on Data Science, ML, AI.
Read more
Deltacubes

at Deltacubes

6 recruiters
Bavithra Kanniyappan
Posted by Bavithra Kanniyappan
Remote only
5 - 12 yrs
₹10L - ₹15L / yr
Python
Amazon Web Services (AWS)
PySpark
Scala
Spark
+3 more

Hiring - Python Developer Freelance Consultant (WFH-Remote)

Greetings from Deltacubes Technology!!

 

Skillset Required:

Python

Pyspark

AWS

Scala

 

Experience:

5+ years

 

Thanks

Bavithra

 

Read more
Ahmedabad, Hyderabad, Pune, Delhi
5 - 7 yrs
₹18L - ₹25L / yr
AWS Lambda
AWS Simple Notification Service (SNS)
AWS Simple Queuing Service (SQS)
Python
PySpark
+9 more
  1. Data Engineer

 Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON

Mandatory Requirements  

  • Experience in AWS Glue
  • Experience in Apache Parquet 
  • Proficient in AWS S3 and data lake 
  • Knowledge of Snowflake
  • Understanding of file-based ingestion best practices.
  • Scripting language - Python & pyspark 

CORE RESPONSIBILITIES 

  • Create and manage cloud resources in AWS 
  • Data ingestion from different data sources which expose data using different technologies, such as: RDBMS, REST HTTP APIs, flat files, streams, and time-series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies 
  • Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform 
  • Develop automated data quality checks to make sure the right data enters the platform, and verify the results of the calculations 
  • Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
  • Define process improvement opportunities to optimize data collection, insights and displays.
  • Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
  • Identify and interpret trends and patterns from complex data sets 
  • Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders. 
  • Key participant in regular Scrum ceremonies with the agile teams  
  • Proficient at developing queries, writing reports and presenting findings 
  • Mentor junior members and bring best industry practices 
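The automated data-quality checks in the responsibilities above can be sketched in plain Python over row dicts; the column names (`loan_id`, `balance`) are invented, and in a PySpark job the same predicates would be expressed as DataFrame filters.

```python
def quality_check(rows, required, non_negative):
    """Return a list of violation messages for a batch of row dicts.

    rows: iterable of dicts; required: columns that must be non-empty;
    non_negative: numeric columns that must be >= 0 when present.
    """
    problems = []
    for i, row in enumerate(rows):
        for col in required:
            if row.get(col) in (None, ""):
                problems.append(f"row {i}: missing {col}")
        for col in non_negative:
            value = row.get(col)
            if value is not None and value < 0:
                problems.append(f"row {i}: negative {col}={value}")
    return problems

batch = [{"loan_id": "L1", "balance": 120.0},
         {"loan_id": "", "balance": -5.0}]
print(quality_check(batch, required=["loan_id"], non_negative=["balance"]))
```

A pipeline would normally fail or quarantine the batch when the returned list is non-empty, rather than just printing it.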

QUALIFICATIONS 

  • 5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales) 
  • Strong background in math, statistics, computer science, data science or related discipline
  • Advanced knowledge of one of these languages: Java, Scala, Python, C# 
  • Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake  
  • Proficient with
  • Data mining/programming tools (e.g. SAS, SQL, R, Python)
  • Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
  • Data visualization (e.g. Tableau, Looker, MicroStrategy)
  • Comfortable learning about and deploying new technologies and tools. 
  • Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines. 
  • Good written and oral communication skills and ability to present results to non-technical audiences 
  • Knowledge of business intelligence and analytical tools, technologies and techniques.

  

Familiarity and experience in the following is a plus:  

  • AWS certification
  • Spark Streaming 
  • Kafka Streaming / Kafka Connect 
  • ELK Stack 
  • Cassandra / MongoDB 
  • CI/CD: Jenkins, GitLab, Jira, Confluence other related tools
Read more
Numantra Technologies

at Numantra Technologies

2 recruiters
Vandana Saxena
Posted by Vandana Saxena
Mumbai, Navi Mumbai
2 - 8 yrs
₹5L - ₹12L / yr
Microsoft Windows Azure
ADF
NumPy
PySpark
Databricks
+1 more
Experience and expertise in using Azure cloud services. Azure certification will be a plus.

- Experience and expertise in Python development and its different libraries like PySpark, pandas, NumPy

- Expertise in ADF, Databricks.

- Creating and maintaining data interfaces across a number of different protocols (file, API).

- Creating and maintaining internal business process solutions to keep our corporate system data in sync and reduce manual processes where appropriate.

- Creating and maintaining monitoring and alerting workflows to improve system transparency.

- Facilitate the development of our Azure cloud infrastructure relative to Data and Application systems.

- Design and lead development of our data infrastructure including data warehouses, data marts, and operational data stores.

- Experience in using Azure services such as ADLS Gen 2, Azure Functions, Azure messaging services, Azure SQL Server, Azure KeyVault, Azure Cognitive services etc.
Read more
GradMener Technology Pvt. Ltd.
Pune, Chennai
5 - 9 yrs
₹15L - ₹20L / yr
Scala
PySpark
Spark
SQL Azure
Hadoop
+4 more
  • 5+ years of experience in a Data Engineering role on cloud environment
  • Must have good experience in Scala/PySpark (preferably in a Databricks environment)
  • Extensive experience with Transact-SQL.
  • Experience in Databricks/Spark.
  • Strong experience in Data warehouse projects
  • Expertise in database development projects with ETL processes.
  • Manage and maintain data engineering pipelines
  • Develop batch processing, streaming and integration solutions
  • Experienced in building and operationalizing large-scale enterprise data solutions and applications
  • Using one or more of Azure data and analytics services in combination with custom solutions
  • Azure Data Lake, Azure SQL DW (Synapse), and SQL Database products or equivalent products from other cloud services providers
  • In-depth understanding of data management (e. g. permissions, security, and monitoring).
  • Cloud repositories for e.g. Azure GitHub, Git
  • Experience in an agile environment (Prefer Azure DevOps).

Good to have

  • Manage source data access security
  • Automate Azure Data Factory pipelines
  • Continuous Integration/Continuous deployment (CICD) pipelines, Source Repositories
  • Experience in implementing and maintaining CICD pipelines
  • Power BI understanding, Delta Lake house architecture
  • Knowledge of software development best practices.
  • Excellent analytical and organization skills.
  • Effective working in a team as well as working independently.
  • Strong written and verbal communication skills.
  • Expertise in database development projects and ETL processes.
Read more
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 7.5 yrs
₹10L - ₹25L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Spark
Software deployment
+1 more
Job ID: ZS0701

Hi,

We are hiring for Data Scientist for Bangalore.

Req Skills:

  • NLP 
  • ML programming
  • Spark
  • Model Deployment
  • Experience processing unstructured data and building NLP models
  • Experience with big data tools pyspark
  • Pipeline orchestration using Airflow and model deployment experience is preferred
Read more
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Pune
9 - 14 yrs
₹20L - ₹40L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+3 more
Job Id: SG0601

Hi,

Enterprise Minds is looking for Data Architect for Pune Location.

Req Skills:
Python, PySpark, Hadoop, Java, Scala
Read more
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Bengaluru (Bangalore)
3 - 6 yrs
Best in industry
Python
PySpark
Data Science
Job ID: ZS070

Hi,

Enterprise minds is looking for Data Scientist. 

Strong in Python, PySpark.

Prefer immediate joiners
Read more
RedSeer Consulting

at RedSeer Consulting

2 recruiters
Raunak Swarnkar
Posted by Raunak Swarnkar
Bengaluru (Bangalore)
0 - 2 yrs
₹10L - ₹15L / yr
Python
PySpark
SQL
pandas
Cloud Computing
+2 more

BRIEF DESCRIPTION:

At-least 1 year of Python, Spark, SQL, data engineering experience

Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, Redshift/Snowflake

Relevant Experience: Legacy ETL job Migration to AWS Glue / Python & Spark combination

 

ROLE SCOPE:

Reverse engineer the existing/legacy ETL jobs

Create the workflow diagrams and review the logic diagrams with Tech Leads

Write equivalent logic in Python & Spark

Unit test the Glue jobs and certify the data loads before passing to system testing

Follow the best practices, enable appropriate audit & control mechanism

Analytically skillful, identify the root causes quickly and efficiently debug issues

Take ownership of the deliverables and support the deployments

 

REQUIREMENTS:

Create data pipelines for data integration into Cloud stacks eg. Azure Synapse

Code data processing jobs in Azure Synapse Analytics, Python, and Spark

Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.

Should be able to process .json, .parquet and .avro files
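Of the formats listed above, .json processing can be sketched with the standard library as newline-delimited JSON; .parquet and .avro would typically go through Spark's built-in readers (or libraries such as pyarrow/fastavro), which are not shown here. The record contents are invented.

```python
import io
import json

def read_jsonl(stream):
    """Yield one dict per newline-delimited JSON record, skipping blanks."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# A StringIO stands in for an open file or an object fetched from storage.
raw = io.StringIO('{"id": 1, "ts": "2024-01-01"}\n\n{"id": 2, "ts": "2024-01-02"}\n')
records = list(read_jsonl(raw))
print(records)
```

Streaming record-by-record like this keeps memory flat on large batch files, which is the usual reason newline-delimited JSON is preferred over one big array.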

 

PREFERRED BACKGROUND:

Tier1/2 candidates from IIT/NIT/IIITs

However, relevant experience, learning attitude takes precedence

Read more
BDI Plus Lab

at BDI Plus Lab

2 recruiters
Puja Kumari
Posted by Puja Kumari
Remote only
2 - 6 yrs
₹6L - ₹20L / yr
Apache Hive
Spark
Scala
PySpark
Data engineering
+4 more
We are looking for big data engineers to join our transformational consulting team serving one of our top US clients in the financial sector. You'd get an opportunity to develop big data pipelines and convert business requirements into production-grade services and products. With less emphasis on prescribing how to do a particular task, we believe in giving people the opportunity to think out of the box and come up with their own innovative solutions to problem solving.
You will primarily be developing, managing and executing multiple prospect campaigns as part of the Prospect Marketing Journey to ensure the best conversion and retention rates. Below are the roles, responsibilities and skillsets we are looking for, and if these resonate with you, please get in touch with us by applying to this role.
Roles and Responsibilities:
• You'd be responsible for development and maintenance of applications with technologies involving Enterprise Java and Distributed technologies.
• You'd collaborate with developers, product manager, business analysts and business users in conceptualizing, estimating and developing new software applications and enhancements.
• You'd Assist in the definition, development, and documentation of software’s objectives, business requirements, deliverables, and specifications in collaboration with multiple cross-functional teams.
• Assist in the design and implementation process for new products, research and create POC for possible solutions.
Skillset:
• Bachelor's or Master's degree in a technology-related field preferred.
• Overall experience of 2-3 years on the Big Data Technologies.
• Hands on experience with Spark (Java/ Scala)
• Hands on experience with Hive, Shell Scripting
• Knowledge on Hbase, Elastic Search
• Development experience In Java/ Python is preferred
• Familiar with profiling, code coverage, logging, common IDEs and other development tools.
• Demonstrated verbal and written communication skills, and ability to interface with Business, Analytics and IT organizations.
• Ability to work effectively in short-cycle, team oriented environment, managing multiple priorities and tasks.
• Ability to identify non-obvious solutions to complex problems
Read more
SenecaGlobal

at SenecaGlobal

6 recruiters
Shiva V
Posted by Shiva V
Remote, Hyderabad
4 - 6 yrs
₹15L - ₹20L / yr
Python
PySpark
Spark
Scala
Microsoft Azure Data factory
• Should have good experience with Python or Scala/PySpark/Spark
• Experience with Advanced SQL
• Experience with Azure Data Factory, Databricks
• Experience with Azure IOT, Cosmos DB, BLOB Storage
• API management, FHIR API development,
• Proficient with Git and CI/CD best practices
• Experience working with Snowflake is a plus
Read more
Top 3 Fintech Startup
Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹16L - ₹24L / yr
SQL
Amazon Web Services (AWS)
Spark
PySpark
Apache Hive

We are looking for an exceptionally talented Lead Data Engineer who has exposure to implementing AWS services to build data pipelines, API integration and designing data warehouses. A candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor’s degree in Science, Engineering or Applied Mathematics; a Master’s degree is preferred.

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS, etc.

• Has extensive experience in the real-time Spark ecosystem and has worked on both real-time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step functions, Airflow, RDS, Aurora etc.

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team

Read more
Intuitive Technology Partners
shalu Jain
Posted by shalu Jain
Remote only
9 - 20 yrs
Best in industry
Architecture
Presales
Postsales
Amazon Web Services (AWS)
databricks
+13 more

Intuitive cloud (www.intuitive.cloud) is one of the fastest growing top-tier Cloud Solutions and SDx Engineering solution and service companies, supporting 80+ Global Enterprise Customers across the Americas, Europe and the Middle East.

Intuitive is a recognized professional and managed services partner with core strengths in cloud (public/hybrid), security, GRC, DevSecOps, SRE, application modernization/containers/K8s-as-a-service, and cloud application delivery.


Data Engineering:

  • 9+ years’ experience as data engineer.
  • Must have 4+ Years in implementing data engineering solutions with Databricks.
  • This is a hands-on role building data pipelines using Databricks, requiring hands-on technical experience with Apache Spark.
  • Must have deep expertise in one of the programming languages for data processes (Python, Scala). Experience with Python, PySpark, Hadoop, Hive and/or Spark to write data pipelines and data processing layers
  • Must have worked with relational databases like Snowflake. Good SQL experience for writing complex SQL transformation.
  • Performance Tuning of Spark SQL running on S3/Data Lake/Delta Lake/ storage and Strong Knowledge on Databricks and Cluster Configurations.
  • Hands on architectural experience
  • Nice to have Databricks administration including security and infrastructure features of Databricks.
Read more
Ganit Business Solutions

at Ganit Business Solutions

3 recruiters
Vijitha VS
Posted by Vijitha VS
Remote only
4 - 7 yrs
₹10L - ₹30L / yr
Scala
ETL
Informatica
Data Warehouse (DWH)
Big Data
+4 more

Job Description:

We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large volumes of data daily, built a data warehouse to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.

Responsibilities:

  • Develop, test, and implement data solutions based on functional / non-functional business requirements.
  • You would be required to code in Scala and PySpark daily on Cloud as well as on-prem infrastructure
  • Build Data Models to store the data in a most optimized manner
  • Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Implementing the ETL process and optimal data pipeline architecture
  • Monitoring performance and advising any necessary infrastructure changes.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Proactively identify potential production issues and recommend and implement solutions
  • Must be able to write quality code and build secure, highly available systems.
  • Create design documents that describe the functionality, capacity, architecture, and process.
  • Review peer-codes and pipelines before deploying to Production for optimization issues and code standards

Skill Sets:

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
  • Proficient understanding of distributed computing principles
  • Experience in working with batch processing/ real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
  • Implemented complex projects dealing with the considerable data size (PB).
  • Optimization techniques (performance, scalability, monitoring, etc.)
  • Experience with integration of data from multiple data sources
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.,
  • Knowledge of various ETL techniques and frameworks, such as Flume
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Creation of DAGs for data engineering
  • Expert at Python /Scala programming, especially for data engineering/ ETL purposes
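The creation of DAGs for data engineering, listed in the skill sets above, can be illustrated with the standard library's topological sorter. The task names are invented; in practice the same dependency shape would be declared in an orchestrator such as Airflow, which adds scheduling and retries on top.

```python
from graphlib import TopologicalSorter

# Task dependency graph: each key runs after the tasks in its value set --
# the same shape an Airflow DAG expresses with >> operators.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"validate"},
    "report": {"load"},
}

# static_order() yields tasks so every task comes after its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
print(order)  # ['extract', 'transform', 'validate', 'load', 'report']
```

For a strict chain like this there is exactly one valid order; with branching dependencies, any order the sorter emits is a valid execution schedule.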

 

 

 

Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters

Vamsikrishna G
Posted by Vamsikrishna G
Bengaluru (Bangalore)
2 - 10 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+1 more
Job Description:

Must Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience in SQL DBs - able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
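"Queries of fair complexity" as asked for above usually means joins plus grouped aggregation; a sketch using an in-memory sqlite3 database as a stand-in (table and column names invented) — the same SQL pattern applies unchanged in Spark SQL:

```python
import sqlite3

# In-memory tables standing in for warehouse tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'South'), (2, 'North');
    INSERT INTO orders VALUES (1, 100.0), (1, 50.0), (2, 75.0);
""")

# Join + grouped aggregation: order counts and totals per region.
rows = con.execute("""
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    GROUP BY c.region
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('South', 2, 150.0), ('North', 1, 75.0)]
```

In PySpark the equivalent would be `spark.sql(...)` over registered temp views, or the `join`/`groupBy`/`agg` DataFrame core functions.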
Read more
MNC

at MNC

Agency job
via Eurka IT SOL by Srikanth a
Chennai
5 - 11 yrs
₹10L - ₹15L / yr
PySpark
SQL
Test Automation (QA)
Big Data
Data Science

Lead QA: more than 5 years of experience, having led a team of more than 5 people on a big data platform; should have experience with a test automation framework and with test process documentation.

Read more
Indium Software

at Indium Software

16 recruiters
Karunya P
Posted by Karunya P
Bengaluru (Bangalore), Hyderabad
1 - 9 yrs
₹1L - ₹15L / yr
SQL
Python
Hadoop
HiveQL
Spark
+1 more

Responsibilities:

 

* 3+ years of Data Engineering Experience - Design, develop, deliver and maintain data infrastructures.

* SQL Specialist – Strong knowledge and seasoned experience with SQL queries

* Languages: Python

* Good communicator, shows initiative, works well with stakeholders.

* Experience working closely with Data Analysts and provide the data they need and guide them on the issues.

* Solid ETL experience with Hadoop/Hive/PySpark/Presto/SparkSQL

* Solid communication and articulation skills

* Able to handle stakeholders independently with minimal intervention from the reporting manager.

* Develop strategies to solve problems in logical yet creative ways.

* Create custom reports and presentations accompanied by strong data visualization and storytelling

 

We would be excited if you have:

 

* Excellent communication and interpersonal skills

* Ability to meet deadlines and manage project delivery

* Excellent report-writing and presentation skills

* Critical thinking and problem-solving capabilities

Read more