Data Engineer (PySpark)

at Persistent Systems

Bengaluru (Bangalore), Pune, Mumbai, Nagpur, Goa, Indore, Hyderabad
5 - 10 yrs
₹20L - ₹30L / yr
Full time
Skills
PySpark
Spark
Hadoop
Big Data
Data engineering
Amazon Web Services (AWS)
Python

Responsibilities

 

  • Develop process workflows for data preparation, modeling, and mining.
  • Manage configurations to build reliable datasets for analysis.
  • Troubleshoot services, system bottlenecks, and application integration.
  • Design, integrate, and document technical components and dependencies of the big data platform.
  • Ensure best practices that can be adopted in the Big Data stack and shared across teams.
  • Design and develop data pipelines on the AWS cloud.
  • Develop data pipelines using PySpark, AWS, and Python.
  • Develop PySpark streaming applications.

Eligibility

 

  • Hands-on experience in Spark, Python, and cloud platforms
  • Highly analytical and data-oriented
  • Good to have: Databricks

About Persistent Systems

Founded
1991
Size
100-1000
Stage
Profitable
About
Software Driven Business | Persistent Systems

Similar jobs

Posted by Tanuj Uppal
Delhi
4 - 8 yrs
₹2L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
  • Mandatory: hands-on experience in Python and PySpark.
  • Build PySpark applications using Spark DataFrames in Python, using Jupyter Notebook and PyCharm (IDE).
  • Experience optimizing Spark jobs that process huge volumes of data.
  • Hands-on experience with version-control tools such as Git.
  • Experience with Amazon analytics services such as Amazon EMR.
  • Experience with Amazon compute services such as AWS Lambda and Amazon EC2, storage services such as S3, and other services such as SNS.
  • Experience/knowledge of Bash/shell scripting is a plus.
  • Experience working with fixed-width, delimited, and multi-record file formats.
  • Hands-on experience with tools like Jenkins to build, test, and deploy applications.
  • Awareness of DevOps concepts and the ability to work in an automated release pipeline environment.
  • Excellent debugging skills.
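The file-format bullet above is concrete enough to sketch: fixed-width records can be parsed by slicing, and delimited records with the standard `csv` module. This is a minimal illustration with an invented column layout, not the formats used on the actual project.

```python
# Illustrative parsers for fixed-width and delimited records.
# The column layout below is hypothetical, not from the job's actual data.
import csv
from io import StringIO

FIELDS = [("id", 4), ("name", 10), ("amount", 8)]  # (column name, width)

def parse_fixed_width(line, fields=FIELDS):
    """Slice one fixed-width record into a dict, trimming pad characters."""
    record, pos = {}, 0
    for name, width in fields:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record

def parse_delimited(text, delimiter="|"):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(StringIO(text), delimiter=delimiter))

line = "0001" + "Ravi".ljust(10) + "00123.45"
print(parse_fixed_width(line))        # id, name, amount with padding stripped
print(parse_delimited("id|name\n1|Ravi"))
```

In a PySpark job the same slicing logic would typically run inside a `map` or be expressed with `substring` column expressions, but the record layout idea is identical.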
Chennai
4 - 8 yrs
₹4L - ₹15L / yr
ETL
Informatica
Data Warehouse (DWH)
Spark
SQL
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.

The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
Responsibilities for Data Engineer
• Create and maintain optimal data pipeline architecture.
• Assemble large, complex data sets that meet functional / non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes,
optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data
from a wide variety of data sources using SQL and AWS big data technologies.
• Build analytics tools that utilize the data pipeline to provide actionable insights into customer
acquisition, operational efficiency and other key business performance metrics.
• Work with stakeholders including the Executive, Product, Data and Design teams to assist with
data-related technical issues and support their data infrastructure needs.
• Create data tools for analytics and data scientist team members that assist them in building and
optimizing our product into an innovative industry leader.
• Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
• Experience building and optimizing big data ETL pipelines, architectures and data sets.
• Advanced working SQL knowledge and experience with relational databases, including query authoring and working familiarity with a variety of database systems.
• Experience performing root cause analysis on internal and external data and processes to
answer specific business questions and identify opportunities for improvement.
• Strong analytic skills related to working with unstructured datasets.
• Experience building processes supporting data transformation, data structures, metadata, dependency, and workload management.
• A successful history of manipulating, processing and extracting value from large disconnected
datasets.
Brand Manufacturer for Bearded Men
Agency job
via Qrata by Prajakta Kulkarni
Ahmedabad
3 - 10 yrs
₹15L - ₹30L / yr
Analytics
Business Intelligence (BI)
Business Analysis
Python
SQL
Analytics Head

Technical must haves:

● Extensive exposure to at least one Business Intelligence platform (ideally QlikView/Qlik Sense); if not Qlik, then ETL tool knowledge, e.g., Informatica/Talend
● At least one data query language: SQL/Python
● Experience in creating breakthrough visualizations
● Understanding of RDBMS, data architecture/schemas, data integrations, data models, and data flows is a must
● A technical degree such as BE/B.Tech is a must

Technical Ideal to have:

● Exposure to our tech stack – PHP
● Microsoft workflows knowledge

Behavioural Pen Portrait:

● Must Have: Enthusiastic, aggressive, vigorous, high achievement orientation, strong command
over spoken and written English
● Ideal: Ability to Collaborate

The preferred location is Ahmedabad; however, if we find exemplary talent, we are open to a remote working model (can be discussed).
Posted by Kajal Jain
Remote only
1 - 4 yrs
₹5L - ₹15L / yr
Python
SQL
Spark
Hadoop
Big Data

Big Data Engineer/Data Engineer


What we are solving
Welcome to today’s business data world where:
• Unification of all customer data into one platform is a challenge

• Extraction is expensive
• Business users do not have the time/skill to write queries
• High dependency on tech team for written queries

These facts may look scary, but there are solutions with real-time self-serve analytics:
• Fully automated data integration from any kind of data source into a universal schema
• An analytics database that streamlines data indexing, querying, and analysis on a single platform
• Start generating value from Day 1 through deep dives, root-cause analysis, and micro-segmentation

At Propellor.ai, this is what we do.
• We help our clients reduce effort and increase effectiveness quickly
• By clearly defining the scope of projects
• Using dependable, scalable, future-proof technology solutions like Big Data solutions and cloud platforms
• Engaging with Data Scientists and Data Engineers to provide end-to-end solutions, leading to industrialisation of Data Science model development and deployment

What we have achieved so far
Since we started in 2016,
• We have worked across 9 countries with 25+ global brands and 75+ projects
• We have 50+ clients, 100+ Data Sources and 20TB+ data processed daily

Work culture at Propellor.ai
We are a small, remote team that believes in
• Working with a few, but only the highest-quality team members who want to become the very best in their fields
• With each member's belief and faith in what we are solving, we collectively see the Big Picture
• No hierarchy, which leads us to believe in reaching the decision maker without any hesitation, so that our actions can have fruitful and aligned outcomes
• Each one is a CEO of their domain. So, the criterion while making a choice is that our employees and clients can succeed together!

To read more about us click here:
https://bit.ly/3idXzs0

About the role
We are building an exceptional team of Data Engineers who are passionate developers and want to push the boundaries of solving complex business problems using the latest tech stack. As a Big Data Engineer, you will work with various Technology and Business teams to deliver our Data Engineering offerings to our clients across the globe.

Role Description

• The role would involve big data pre-processing & reporting workflows including collecting, parsing, managing, analysing, and visualizing large sets of data to turn information into business insights
• Develop the software and systems needed for end-to-end execution on large projects
• Work across all phases of SDLC, and use Software Engineering principles to build scalable solutions
• Build the knowledge base required to deliver increasingly complex technology projects
• The role would also involve testing various machine learning models on Big Data and deploying learned models for ongoing scoring and prediction.

Education & Experience
• B.Tech. or equivalent degree in CS/CE/IT/ECE/EEE
• 3+ years of experience designing technological solutions to complex data problems, and developing & testing modular, reusable, efficient, and scalable code to implement those solutions

Must have (hands-on) experience
• Python and SQL expertise
• Distributed computing frameworks (Hadoop ecosystem & Spark components)
• Proficiency in at least one cloud computing platform (AWS/Azure/GCP); GCP experience (BigQuery/Bigtable, Pub/Sub, Dataflow, App Engine) preferred
• Linux environment, SQL, and shell scripting

Desirable
• Statistical or machine learning DSL such as R
• Distributed and low-latency (streaming) application architecture
• Row-store distributed DBMSs such as Cassandra, CouchDB, MongoDB, etc.
• Familiarity with API design

Hiring Process:
1. One phone screening round to gauge your interest and knowledge of fundamentals
2. An assignment to test your skills and ability to come up with solutions in a certain time
3. Interview 1 with our Data Engineer lead
4. Final Interview with our Data Engineer Lead and the Business Teams

Immediate joiners preferred.

Posted by Enrich Braz
Remote only
1 - 3 yrs
₹6L - ₹12L / yr
Python
PL/SQL
API
Tableau
R Language
Numadic is hiring a Data Analyst
We are hiring a Data Analyst to join our team in undisrupting movement. Your role will include working with development teams and product managers to ideate software solutions. To succeed in this role, you must have an analytical mind with a keen interest in problem-solving and communication.

 

 

About the role:
  1. Apply advanced predictive modeling and statistical techniques to design, build, maintain, and improve upon multiple real-time decision systems.
  2. Visualize and show complex data-sets via multidimensional visualization tools.
  3. Perform data cleansing, transformation & feature engineering.
  4. Design scalable automated data mining, modelling and validation processes.
  5. Produce scalable, reusable, efficient feature code to be implemented on clusters and standalone data servers.
  6. Contribute to the development/ deployment of machine learning algorithms, operational research, semantic analysis, and statistical methods for finding structure in large data sets.

 

 

Role requirements:
  1. An absolute minimum of 1-3 years of relevant data science experience.
  2. Have an Engineering or comparable math / physics degree.
  3. Knowledge & working proficiency in Excel, Python, R
  4. Expert proficiency in at least 2 structured programming languages.
  5. Deep understanding about statistical and analytical models.
  6. Bias for action - Ability to move quickly while taking time out to review the details.
  7. Clear communicator - Ability to synthesise and clearly articulate complex information, highlighting key takeaways and actionable insights.
  8. Team player - Working mostly autonomously, yet being a team player keeping your crews looped-in.
  9. Mindset - Ability to take responsibility for your life and that of your people and projects.
  10. Mindfulness - Ability to maintain practices that keep you grounded.

 

 

Numadic operates at the intersection of finance and logistics.
From number one rated mobile apps to industry leading approaches to API finance and transactions, our team does what it takes to simplify movement of goods and people. But what matters even more, is that our customers love the way we do what we do and our end users have uncluttered and uncomplicated experiences.

 

 

Join Numadic
From the founders to our investors and advisors, what we share is a common respect for the value of human life and of meaningful relationships. We are full-stack humans, who work with full-stack humans and seek to do business with full-stack humans. We have turned down projects, when we found misalignment of values at the other end of the table. We do not believe that the customer is always right. We believe that all humans are equal and that the direction of the flow of money should not define the way people are treated. This is life at Numadic.
Posted by Vignesh M
Remote, Delhi, Gurugram, Noida, Ghaziabad, Faridabad
2 - 4 yrs
₹12L - ₹15L / yr
Data Warehouse (DWH)
Informatica
ETL
Python
SQL

Novelship is seeking a Data Engineer to be based in India or Remote in South East Asia to join our Tech Team.

 

Brief Description of the Role:

As a Data Engineer, you will be responsible for building and maintaining our analytics infrastructure, data taxonomy, data ingestion, and aggregation to provide business intelligence to different teams and to support data-dependent tools like ERP and CRM.

 

In this role you will:

  • Analyze and design ETL solutions to store/fetch data from multiple systems like Postgres, Airtable, Google Analytics and Mixpanel.
  • Drive the implementation of new data management projects such as Finance ERP and re-structure of the current data architecture.
  • Participate in the building of single-source Data Systems and Data Taxonomy projects.
  • Engage in problem definition and resolution and collaborate with a diverse group of engineers and business owners from across the company.
  • Work with stakeholders including the Strategy, Product and Marketing teams to assist with data-related technical issues, support their data analytics needs and work on data collection and aggregation solutions.
  • Act as a technical resource for the Data team and be involved in creating and implementing current and future Analytics projects like data lake design and data warehouse design.
  • Ensure quality and consistency of the data in the Data warehouse and follow best data governance practices.
  • Analyze large amounts of information to discover trends and patterns to provide Business Intelligence.
  • Mine and analyse data from databases to drive optimization and improvement of product development, marketing techniques and business strategies.
  • Design and build reusable components, frameworks and libraries at scale to support analytics data products
  • Build and maintain optimal data pipeline architecture and data systems. Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.
  • Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems

 

 

Requirements:

  • 2 to 4 years of professional experience as a Data Engineer.
  • Proficiency in either Python, Scala or R.
  • Proficiency in SQL, Relational & Non-Relational Databases.
  • Excellent analytical and problem-solving skills.
  • Experience with Business Intelligence tools like Data Studio, Power BI and Tableau.
  • Experience in Data Cleaning, Creating Data Pipelines, Data Modelling, Storytelling and Dashboarding.
  • Bachelor's or Master's degree in Computer Science
Posted by Swaathipriya P
Bengaluru (Bangalore), Hyderabad
2 - 5 yrs
₹1L - ₹15L / yr
Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
2+ years of analytics experience, with predominant experience in SQL, SAS, statistics, R, Python, and visualization
Experienced in writing complex SQL SELECT queries (window functions & CTEs), with advanced SQL experience
Should work as an individual contributor for the initial few months; based on project movement, a team will be aligned
Strong in querying logic and data interpretation
Solid communication and articulation skills
Able to handle stakeholders independently, with minimal intervention from the reporting manager
Develop strategies to solve problems in logical yet creative ways
Create custom reports and presentations accompanied by strong data visualization and storytelling
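Since complex SELECT queries with window functions and CTEs are called out above, here is a minimal sketch combining both in one query. SQLite (3.25+) is used purely for portability; the `sales` table and its columns are invented for illustration.

```python
# A CTE feeding a RANK() window function, run against an in-memory SQLite DB.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('North', 100), ('North', 300), ('South', 200), ('South', 50);
""")

query = """
WITH regional AS (                 -- CTE: a named, reusable subquery
    SELECT region, amount FROM sales
)
SELECT region,
       amount,
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
FROM regional
ORDER BY region, rnk;
"""
rows = conn.execute(query).fetchall()
for r in rows:
    print(r)
```

`RANK() OVER (PARTITION BY ...)` ranks rows within each region without collapsing them, which is the key difference from `GROUP BY` aggregation.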
Pune, Chennai
5 - 9 yrs
₹15L - ₹20L / yr
Scala
PySpark
Spark
SQL Azure
Hadoop
  • 5+ years of experience in a Data Engineering role in a cloud environment
  • Must have good experience in Scala/PySpark (preferably in a Databricks environment)
  • Extensive experience with Transact-SQL
  • Experience in Databricks/Spark
  • Strong experience in data warehouse projects
  • Expertise in database development projects with ETL processes
  • Manage and maintain data engineering pipelines
  • Develop batch processing, streaming, and integration solutions
  • Experienced in building and operationalizing large-scale enterprise data solutions and applications
  • Using one or more Azure data and analytics services in combination with custom solutions
  • Azure Data Lake, Azure SQL DW (Synapse), and SQL Database products, or equivalent products from other cloud service providers
  • In-depth understanding of data management (e.g., permissions, security, and monitoring)
  • Cloud repositories (e.g., Azure Repos, GitHub, Git)
  • Experience in an agile environment (Azure DevOps preferred)

Good to have

  • Manage source data access security
  • Automate Azure Data Factory pipelines
  • Continuous Integration/Continuous Deployment (CI/CD) pipelines, source repositories
  • Experience in implementing and maintaining CI/CD pipelines
  • Power BI understanding; Delta Lakehouse architecture
  • Knowledge of software development best practices.
  • Excellent analytical and organization skills.
  • Effective working in a team as well as working independently.
  • Strong written and verbal communication skills.
  • Expertise in database development projects and ETL processes.
Posted by Deepika Toppo
NCR (Delhi | Gurgaon | Noida)
5 - 10 yrs
₹9L - ₹20L / yr
Data Science
R Programming
Python
What you will do:
As a Data Science Lead, you will be working on creating industry-first analytical and propensity models to help discover the information hidden in vast amounts of data, and to make smarter decisions that deliver an even better customer experience. Your primary focus will be on applying data mining techniques, doing statistical analysis, and building high-quality prediction systems integrated with our products.

➢ Working with business and leadership teams to gather and analyse structured and unstructured data
➢ Data mining using state-of-the-art methods
➢ Enhancing data collection procedures to include information that is relevant for building analytic
systems
➢ Processing, cleansing, and verifying the integrity of data used for analysis
➢ Doing ad-hoc analysis and presenting results in a clear manner
➢ Creating automated anomaly detection systems and constant tracking of its performance
➢ Creation and evolution of an efficient BI pipeline into a multi-faceted pipeline to support various
modelling needs.

What we are looking for:

➢ 5-8 years of relevant experience, preferably in the financial services industry.
➢ A bachelor's/master's degree in Statistics, Mathematics, Computer Science or Management from a Tier 1 institute.
➢ Data warehousing experience will be a plus.
➢ Good conceptual understanding of statistics and probability.
➢ Experience in developing dashboards and reports using BI tools.
Market-leading fintech company dedicated to providing credit
Agency job
via Talent Socio Bizcon LLP by Hema Latha N
Noida, NCR (Delhi | Gurgaon | Noida)
1 - 4 yrs
₹8L - ₹18L / yr
Analytics
Predictive analytics
Linear regression
Logistic regression
Python
Job Description
Role: Analytics Scientist - Risk Analytics
Experience Range: 1 to 4 years
Job Location: Noida

Key responsibilities include:
• Building models to predict risk and other key metrics
• Coming up with data-driven solutions to control risk
• Finding opportunities to acquire more customers by modifying/optimizing existing rules
• Doing periodic upgrades of the underwriting strategy based on business requirements
• Evaluating 3rd-party solutions for predicting/controlling risk of the portfolio
• Running periodic controlled tests to optimize underwriting
• Monitoring key portfolio metrics and taking data-driven actions based on performance

Business Knowledge: Develop an understanding of the domain/function. Manage business process(es) in the work area. The individual is expected to develop domain expertise in his/her work area.

Teamwork: Develop cross-site relationships to enhance leverage of ideas. Set and manage partner expectations. Drive implementation of projects with the Engineering team while partnering seamlessly with cross-site team members.

Communication: Responsibly perform end-to-end project communication across the various levels in the organization.

Candidate Specification:
Skills:
• Knowledge of an analytical tool: R or Python
• Established competency in predictive analytics (logistic & linear regression)
• Experience in handling complex data sources
• Dexterity with MySQL and MS Excel is good to have
• Strong analytical aptitude and logical reasoning ability
• Strong presentation and communication skills

Preferred:
• 1-3 years of experience in the Financial Services/Analytics industry
• Understanding of the financial services business
• Experience working on advanced machine learning techniques

If interested, please send your updated profile in Word format with the below details for further discussion at the earliest:
1. Current Company
2. Current Designation
3. Total Experience
4. Current CTC (Fixed & Variable)
5. Expected CTC
6. Notice Period
7. Current Location
8. Reason for Change
9. Availability for a face-to-face interview on weekdays
10. Education Degree

Thanks & Regards,
Hema, Talent Socio
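The skills listed for this role call out logistic regression for risk prediction. As a sketch of the scoring step only (the coefficients below are invented, not from any real underwriting model), a fitted logistic model turns a linear score into a probability via the sigmoid function:

```python
# Minimal logistic-regression scoring sketch; coefficients are illustrative only.
import math

# Hypothetical fitted coefficients: intercept plus weights for two risk features
INTERCEPT = -2.0
WEIGHTS = {"utilization": 3.0, "late_payments": 0.8}

def predict_default_probability(features):
    """Apply the logistic (sigmoid) function to a linear score."""
    z = INTERCEPT + sum(WEIGHTS[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

low_risk = predict_default_probability({"utilization": 0.1, "late_payments": 0})
high_risk = predict_default_probability({"utilization": 0.9, "late_payments": 3})
print(round(low_risk, 3), round(high_risk, 3))
```

In practice, the coefficients would come from fitting on historical portfolio data (e.g., with scikit-learn or statsmodels), and the resulting probability would feed underwriting rules and cut-offs.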