Data Engineer (PySpark+SQL)
Fragma Data Systems
Posted by Evelyn Charles
3.5 - 8 yrs
₹5L - ₹18L / yr
Remote, Bengaluru (Bangalore)
Skills
PySpark
Data engineering
Data Warehouse (DWH)
SQL
Spark
PowerBI
Must-Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL (see the sketch after this list).
• Good experience with SQL databases - able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations.
• Good grasp of ELT architecture: business-rules processing and data extraction from a data lake into data streams for business consumption.
• Good customer communication skills.
• Good analytical skills.
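For illustration only (not part of the posting), here is a minimal sketch of DataFrame core functions alongside the equivalent Spark SQL; the column and table names are invented:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("demo").getOrCreate()

    # Hypothetical transactions data
    df = spark.createDataFrame(
        [("2024-01-01", "A", 120.0), ("2024-01-01", "B", 75.5), ("2024-01-02", "A", 200.0)],
        ["txn_date", "account", "amount"],
    )

    # DataFrame core functions: filter, groupBy, aggregate
    daily = (df.filter(F.col("amount") > 0)
               .groupBy("txn_date")
               .agg(F.sum("amount").alias("total"), F.count("*").alias("txns")))

    # The same aggregation expressed in Spark SQL
    df.createOrReplaceTempView("transactions")
    daily_sql = spark.sql(
        "SELECT txn_date, SUM(amount) AS total, COUNT(*) AS txns "
        "FROM transactions WHERE amount > 0 GROUP BY txn_date"
    )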
 
 
Technology Skills (Good to Have):
  • Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hub/IoT Hub.
  • Experience in migrating on-premises data warehouses to data platforms on the Azure cloud.
  • Designing and implementing data engineering, ingestion, and transformation functions.
  • Azure Synapse or Azure SQL Data Warehouse.
  • Spark on Azure, as available in HDInsight and Databricks (see the sketch below).
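As a rough sketch of what Spark on Azure looks like in practice (the storage account, container, paths, and table below are hypothetical, authentication is environment-specific, and an active SparkSession is assumed):

    # ADLS Gen2 is addressed via an abfss:// URI on Databricks/HDInsight.
    # One common option is account-key auth set on the Spark conf:
    spark.conf.set(
        "fs.azure.account.key.mystorageacct.dfs.core.windows.net", "<storage-key>")

    sales = (spark.read.format("parquet")
             .load("abfss://raw@mystorageacct.dfs.core.windows.net/sales/2024/"))
    sales.write.mode("overwrite").saveAsTable("analytics.sales_curated")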
 
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/StreamSets, Informatica.
  • Experience with pre-sales activities (responding to RFPs, executing quick POCs).
  • Capacity planning and performance tuning on the Azure stack and Spark.

About Fragma Data Systems

Founded: 2015
Stage: Profitable
About

Fragma is a leading big data, AI, and advanced analytics company providing services to global clients.

Connect with the team: Mallikarjun Degul, Sandhya JD, Varun Reddy, Priyanka U, Simpy Kumari, Minakshi Kumari, Latha Yuvaraj, Vamsikrishna G

Similar jobs

Quinnox
Posted by MidhunKumar T
Bengaluru (Bangalore), Mumbai
10 - 15 yrs
₹30L - ₹35L / yr
ADF
Azure Data Lake services
Azure SQL
Azure Synapse
Spark
+4 more

Mandatory Skills: Azure Data Lake Storage, Azure SQL databases, Azure Synapse, Databricks (PySpark/Spark), Python, SQL, Azure Data Factory.


Good to have: Power BI, Azure IaaS services, Azure DevOps, Microsoft Fabric.


  • Very strong understanding of ETL and ELT (see the sketch after this list).
  • Very strong understanding of Lakehouse architecture.
  • Very strong knowledge of PySpark and Spark architecture.
  • Good knowledge of Azure Data Lake architecture and access controls.
  • Good knowledge of Microsoft Fabric architecture.
  • Good knowledge of Azure SQL databases.
  • Good knowledge of T-SQL.
  • Good knowledge of CI/CD processes using Azure DevOps.
  • Power BI.
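To make the ELT/Lakehouse items concrete, a hypothetical sketch assuming Databricks with Delta Lake and an active SparkSession; all table and path names are invented:

    from pyspark.sql import functions as F

    # Extract + Load: land source data as-is in a bronze Delta table
    raw = spark.read.format("json").load(
        "abfss://landing@acct.dfs.core.windows.net/orders/")
    raw.write.format("delta").mode("append").saveAsTable("bronze.orders")

    # Transform after loading (the "T" in ELT): cleanse into a silver table
    silver = (spark.table("bronze.orders")
              .dropDuplicates(["order_id"])
              .withColumn("order_ts", F.to_timestamp("order_ts")))
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")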

TekClan
Posted by Tanu Shree
Chennai
4 - 7 yrs
Best in industry
MS SQL Server
SQL Programming
SQL
ETL
ETL management
+5 more

Company - Tekclan Software Solutions

Position – SQL Developer

Experience – Minimum 4 years of experience in MS SQL Server, SQL programming, and ETL development.

Location - Chennai


We are seeking a highly skilled SQL Developer with expertise in MS SQL Server, SSRS, SQL programming, writing stored procedures, and proficiency in ETL using SSIS. The ideal candidate will have a strong understanding of database concepts, query optimization, and data modeling.


Responsibilities:

1. Develop, optimize, and maintain SQL queries, stored procedures, and functions for efficient data retrieval and manipulation.

2. Design and implement ETL processes using SSIS for data extraction, transformation, and loading from various sources.

3. Collaborate with cross-functional teams to gather business requirements and translate them into technical specifications.

4. Create and maintain data models, ensuring data integrity, normalization, and performance.

5. Generate insightful reports and dashboards using SSRS to facilitate data-driven decision making.

6. Troubleshoot and resolve database performance issues, bottlenecks, and data inconsistencies.

7. Conduct thorough testing and debugging of SQL code to ensure accuracy and reliability.

8. Stay up-to-date with emerging trends and advancements in SQL technologies and provide recommendations for improvement.

9. Should be an independent, individual contributor.


Requirements:

1. Minimum of 4 years of experience in MS SQL Server, SQL programming, and ETL development.

2. Proven experience as a SQL Developer with a strong focus on MS SQL Server.

3. Proficiency in SQL programming, including writing complex queries, stored procedures, and functions (see the sketch after this list).

4. In-depth knowledge of ETL processes and hands-on experience with SSIS.

5. Strong expertise in creating reports and dashboards using SSRS.

6. Familiarity with database design principles, query optimization, and data modeling.

7. Experience with performance tuning and troubleshooting SQL-related issues.

8. Excellent problem-solving skills and attention to detail.

9. Strong communication and collaboration abilities.

10. Ability to work independently and handle multiple tasks simultaneously.
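Purely as an illustration of the stored-procedure work above (not from the posting), a sketch of invoking a parameterized SQL Server procedure from Python via pyodbc; the server, database, procedure, and columns are hypothetical:

    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=myserver;DATABASE=SalesDB;Trusted_Connection=yes;"
    )
    cur = conn.cursor()

    # ODBC call syntax for a hypothetical stored procedure with two parameters
    cur.execute("{CALL dbo.usp_TopCustomers (?, ?)}", ("2024-01-01", 10))
    for row in cur.fetchall():
        print(row.CustomerID, row.TotalAmount)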


Preferred Skills:

1. Certification in MS SQL Server or related technologies.

2. Knowledge of other database systems such as Oracle or MySQL.

3. Familiarity with data warehousing concepts and tools.

4. Experience with version control systems.

Indium Software
Posted by Swaathipriya P
Bengaluru (Bangalore), Hyderabad
2 - 5 yrs
₹1L - ₹15L / yr
Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
+6 more
• 2+ years in analytics, with predominant experience in SQL, SAS, statistics, R, Python, and visualization.
• Experienced in writing complex SQL SELECT queries (window functions and CTEs), with advanced SQL experience (see the example below).
• Should work as an individual contributor for the initial few months; a team will be aligned based on project movement.
• Strong in querying logic and data interpretation.
• Solid communication and articulation skills.
• Able to handle stakeholders independently, with minimal intervention from the reporting manager.
• Develop strategies to solve problems in logical yet creative ways.
• Create custom reports and presentations with strong data visualization and storytelling.
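An illustrative query of the kind described - a CTE feeding a window function - with hypothetical table and column names (standard SQL; shown here run from PySpark, but it works on most engines):

    latest_orders = spark.sql("""
        WITH ranked AS (
            SELECT customer_id, order_date, amount,
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id ORDER BY order_date DESC) AS rn
            FROM orders
        )
        SELECT customer_id, order_date, amount
        FROM ranked
        WHERE rn = 1  -- most recent order per customer
    """)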
IDfy
Posted by Stuti Srivastava
Mumbai
3 - 10 yrs
₹15L - ₹45L / yr
Data Warehouse (DWH)
Informatica
ETL
ETL architecture
Responsive Design
+4 more

Who is IDfy?

 

IDfy is the Fintech ScaleUp of the Year 2021. We build technology products that identify people accurately. This helps businesses prevent fraud and engage with the genuine with the least amount of friction. If you have opened an account with HDFC Bank, ordered from Amazon and Zomato, transacted through Paytm and BharatPe, or played on Dream11 and MPL, you might have already experienced IDfy - without even knowing it. Well…that's just how we roll. Global credit rating giant TransUnion is an investor in IDfy. So are international venture capitalists like MegaDelta Capital, BEENEXT, and Dream Incubator. Blume Ventures is an early investor and continues to place its faith in us. We have kept our 500 clients safe from fraud while helping the honest get the opportunities they deserve. Our 350-person strong family works and plays out of our offices in suburban Mumbai. IDfy has run verifications on 100 million people. In the next 2 years, we want to touch a billion users. If you wish to be part of this journey filled with lots of action and learning, we welcome you to join the team!

 

What are we looking for?

 

As a senior software engineer in the Data Fabric POD, you would be responsible for producing and implementing functional software solutions. You will work with upper management to define software requirements and take the lead on operational and technical projects. You would be working with a data management and science platform which provides Data as a Service (DaaS) and Insight as a Service (IaaS) to internal employees and external stakeholders.

 

You are an eager-to-learn, technology-agnostic person who loves working with data and drawing insights from it. You have excellent organization and problem-solving skills and are looking to build the tools of the future. You have exceptional communication and leadership skills and the ability to make quick decisions.

 

YOE: 3 - 10 yrs

Position: Sr. Software Engineer/Module Lead/Technical Lead

 

Responsibilities:

  • Breaking down work and orchestrating the development of components for each sprint.
  • Identifying risks and forming contingency plans to mitigate them.
  • Liaising with team members, management, and clients to ensure projects are completed to standard.
  • Inventing new approaches to detecting existing fraud. You will also stay ahead of the game by predicting future fraud techniques and building solutions to prevent them.
  • Developing Zero Defect Software that is secured, instrumented, and resilient.
  • Creating design artifacts before implementation.
  • Developing Test Cases before or in parallel with implementation.
  • Ensuring software developed passes static code analysis, performance, and load test.
  • Developing various kinds of components (such as UI Components, APIs, Business Components, image processing, etc.) that define the IDfy Platforms which drive cutting-edge Fraud Detection and Analytics.
  • Developing software using Agile Methodology and tools that support the same.

 

Requirements:

  • Our stack: Apache Beam, ClickHouse, Grafana, InfluxDB, Elixir, BigQuery, Logstash (see the Beam sketch after this list).
  • An understanding of Product Development Methodologies.
  • Strong understanding of relational databases, especially SQL, and hands-on experience with OLAP.
  • Experience in the creation of data ingestion pipelines and ETL pipelines (Apache Beam or Apache Airflow experience is good to have).
  • Strong design skills in defining API Data Contracts / OOAD / Microservices / Data Models.
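Since Apache Beam appears in the stack, here is a minimal Beam (Python SDK) pipeline sketch; the input file and field names are invented:

    import json

    import apache_beam as beam

    with beam.Pipeline() as p:
        (
            p
            | "Read" >> beam.io.ReadFromText("events.jsonl")  # hypothetical input
            | "Parse" >> beam.Map(json.loads)
            | "KeyByUser" >> beam.Map(lambda e: (e["user_id"], 1))
            | "CountPerUser" >> beam.CombinePerKey(sum)
            | "Format" >> beam.MapTuple(lambda user, n: f"{user},{n}")
            | "Write" >> beam.io.WriteToText("user_counts")
        )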

 

Good to have:

  • Experience with TimeSeries DBs (we use InfluxDB) and Alerting / Anomaly Detection Frameworks.
  • Visualization Layers: Metabase, PowerBI, Tableau.
  • Experience in developing software in the Cloud such as GCP / AWS.
  • A passion to explore new technologies and express yourself through technical blogs.
Blue Sky Analytics
Posted by Balahun Khonglanoh
Remote only
1 - 5 yrs
Best in industry
NumPy
SciPy
Data Science
Python
pandas
+8 more

About the Company

Blue Sky Analytics is a Climate Tech startup that combines the power of AI and satellite data to aid in the creation of a global environmental data stack. Our funders include Beenext and Rainmatter. Over the next 12 months, we aim to expand to 10 environmental datasets spanning water, land, heat, and more!


We are looking for a data scientist to join our growing team. This position will require you to think and act on the geospatial architecture and data needs (specifically geospatial data) of the company. The role is strategic and will require you to collaborate closely with data engineers, data scientists, software developers, and even colleagues from other business functions. Come save the planet with us!


Your Role

Manage: It goes without saying that you will be handling large amounts of image and location datasets. You will develop dataframes and automated pipelines of data from multiple sources. You are expected to know how to visualize them and use machine learning algorithms to be able to make predictions. You will be working across teams to get the job done.

Analyze: You will curate and analyze vast amounts of geospatial datasets like satellite imagery, elevation data, meteorological datasets, OpenStreetMap data, demographic data, socio-econometric data, and topography to extract useful insights about the events happening on our planet.

Develop: You will be required to develop processes and tools to monitor and analyze data and its accuracy. You will develop innovative algorithms which will be useful in tracking global environmental problems like depleting water levels, illegal tree logging, and even tracking of oil-spills.

Demonstrate: A familiarity with working in geospatial libraries such as GDAL/rasterio for reading/writing of data, and use of QGIS in making visualizations. This also extends to using advanced statistical techniques and applying concepts like regression, properties of distributions, and conducting other statistical tests.

Produce: With all the hard work being put into data creation and management, it has to be used! You will be able to produce maps showing (but not limited to) spatial distribution of various kinds of data, including emission statistics and pollution hotspots. In addition, you will produce reports that contain maps, visualizations and other resources developed over the course of managing these datasets.

Requirements

These are must have skill-sets that we are looking for:

  • Excellent coding skills in Python (including deep familiarity with NumPy, SciPy, pandas).
  • Significant experience with git, GitHub, SQL, AWS (S3 and EC2).
  • Have worked on GIS and are familiar with geospatial libraries such as GDAL and rasterio for reading/writing data, GIS software such as QGIS for visualisation and querying, and basic machine learning algorithms for making predictions (see the rasterio sketch after this list).
  • Demonstrable experience implementing efficient neural network models and deploying them in a production environment.
  • Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience with applications.
  • Capable of writing clear and lucid reports and demystifying data for the rest of us.
  • Be curious and care about the planet!
  • Minimum 2 years of demonstrable industry experience working with large and noisy datasets.
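A small sketch of the raster side of this work using rasterio; the GeoTIFF below is hypothetical:

    import rasterio

    # Open a single-band GeoTIFF and compute simple statistics,
    # honouring the dataset's nodata mask
    with rasterio.open("scene.tif") as src:
        band = src.read(1, masked=True)
        print("CRS:", src.crs, "resolution:", src.res)
        print("mean:", float(band.mean()), "max:", float(band.max()))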

Benefits

  • Work from anywhere: Work by the beach or from the mountains.
  • Open source at heart: We are building a community whose work you can use, contribute to, and collaborate on.
  • Own a slice of the pie: Possibility of becoming an owner by investing in ESOPs.
  • Flexible timings: Fit your work around your lifestyle.
  • Comprehensive health cover: Health cover for you and your dependents to keep you tension-free.
  • Work machine of choice: Buy a device and own it after completing a year at BSA.
  • Quarterly retreats: Yes, there's work - but then there's all the non-work fun too, aka the retreat!
  • Yearly vacations: Take time off to rest and get ready for the next big assignment by availing your paid leave.
Scry AI
Posted by Siddarth Thakur
Remote only
3 - 8 yrs
₹15L - ₹20L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Title: Data Engineer (Azure) (Location: Gurgaon/Hyderabad)

Salary: Competitive, as per industry standard

We are expanding our Data Engineering Team and hiring passionate professionals with extensive knowledge and experience in building and managing large enterprise data and analytics platforms. We are looking for creative individuals with strong programming skills, who can understand complex business and architectural problems and develop solutions. The individual will work closely with the rest of our data engineering and data science team in implementing and managing Scalable Smart Data Lakes, Data Ingestion Platforms, Machine Learning and NLP based Analytics Platforms, Hyper-Scale Processing Clusters, Data Mining and Search Engines.

What You’ll Need:

  • 3+ years of industry experience in creating and managing end-to-end data solutions and optimal data processing pipelines and architecture, dealing with large-volume, big data sets of varied data types.
  • Proficiency in Python, Linux, and shell scripting.
  • Strong knowledge of working with PySpark DataFrames and Pandas DataFrames for writing efficient pre-processing and other data manipulation tasks.
  • Strong experience in developing the infrastructure required for data ingestion and for optimal extraction, transformation, and loading of data from a wide variety of data sources, using tools like Azure Data Factory and Azure Databricks (or Jupyter notebooks/Google Colab, or other similar tools).
  • Working knowledge of GitHub or other version control tools.
  • Experience with creating RESTful web services and API platforms.
  • Ability to work with data science and infrastructure team members to implement practical machine learning solutions and pipelines in production.
  • Experience with cloud providers like Azure/AWS/GCP.
  • Experience with SQL and NoSQL databases: MySQL, Azure Cosmos DB, HBase, MongoDB, Elasticsearch, etc.
  • Experience with stream-processing systems such as Spark Streaming and Kafka, and working experience with event-driven architectures (see the streaming sketch after this list).
  • Strong analytic skills related to working with unstructured datasets.
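For the stream-processing item above, a minimal Structured Streaming sketch; the broker address and topic are hypothetical, an active SparkSession is assumed, and the spark-sql-kafka connector must be on the classpath:

    from pyspark.sql import functions as F

    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical
              .option("subscribe", "events")
              .load())

    # Kafka values arrive as bytes; cast to string before parsing downstream
    parsed = events.select(F.col("value").cast("string").alias("json"))

    query = parsed.writeStream.format("console").outputMode("append").start()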

 

Good to have (to filter or prioritize candidates):

  • Experience with testing libraries such as pytest for writing unit tests for the developed code.
  • Knowledge of machine learning algorithms and libraries; implementation experience would be an added advantage.
  • Knowledge and experience of data lakes, Docker, and Kubernetes.
  • Knowledge of Azure Functions, Elasticsearch, etc.
  • Experience with model versioning (MLflow) and data versioning.
  • Experience with microservices libraries, or with Python libraries such as Flask for hosting ML services and models (see the sketch below).
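A minimal sketch of the Flask-hosted model service mentioned in the last item; the endpoint is invented and the model load is stubbed out:

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    # model = joblib.load("model.pkl")  # a real service would load a trained model

    @app.route("/predict", methods=["POST"])
    def predict():
        features = request.get_json()["features"]
        prediction = sum(features)  # placeholder in lieu of model.predict(...)
        return jsonify({"prediction": prediction})

    if __name__ == "__main__":
        app.run(port=5000)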
Infonex Technologies
Posted by Vinay Ramesh
Bengaluru (Bangalore)
4 - 7 yrs
₹6L - ₹30L / yr
Informatica
ETL
SQL
Linux/Unix
Oracle
+1 more
  • Experience implementing large-scale ETL processes using Informatica PowerCenter.
  • Design high-level ETL process and data flow from the source system to target databases.
  • Strong experience with Oracle databases and strong SQL.
  • Develop & unit test Informatica ETL processes for optimal performance utilizing best practices.
  • Performance-tune Informatica ETL mappings and report queries (see the sketch after this list).
  • Develop database objects like Stored Procedures, Functions, Packages, and Triggers using SQL and PL/SQL.
  • Hands-on experience in Unix.
  • Experience in Informatica Cloud (IICS).
  • Work with appropriate leads and review high-level ETL design, source to target data mapping document, and be the point of contact for any ETL-related questions.
  • Good understanding of project life cycle, especially tasks within the ETL phase.
  • Ability to work independently and multi-task to meet critical deadlines in a rapidly changing environment.
  • Excellent communication and presentation skills.
  • Experience working effectively in onsite and offshore delivery models.
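As a hypothetical illustration of the report-query tuning above, capturing an Oracle optimizer plan from Python with python-oracledb; the connection details and table are invented:

    import oracledb

    conn = oracledb.connect(user="etl_user", password="***", dsn="dbhost/ORCLPDB1")
    cur = conn.cursor()

    # Capture and print the optimizer plan for a (hypothetical) report query
    cur.execute(
        "EXPLAIN PLAN FOR SELECT region, SUM(amount) FROM sales GROUP BY region")
    cur.execute("SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY())")
    for (line,) in cur:
        print(line)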
MNC
Agency job via Fragma Data Systems, posted by Priyanka U
Bengaluru (Bangalore)
3 - 7 yrs
₹8L - ₹16L / yr
PySpark
Python
Spark
Roles and Responsibilities:

• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance tuning with respect to executor sizing and other environment parameters, code optimization, partition tuning, etc. (see the sketch after this list).
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.
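By way of illustration, these are the kinds of knobs involved in executor sizing and partition tuning; the numbers below are placeholders to be derived from cluster capacity and data volume, not recommendations:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("etl-job")
             # Executor sizing (placeholder values)
             .config("spark.executor.memory", "8g")
             .config("spark.executor.cores", "4")
             .config("spark.executor.instances", "10")
             # Shuffle partition tuning (placeholder value)
             .config("spark.sql.shuffle.partitions", "400")
             .getOrCreate())

    # Partition tuning can also be applied per DataFrame, e.g. before a heavy join:
    # df = df.repartition(400, "customer_id")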

Must-Have Skills:

• Good experience in PySpark, including DataFrame core functions and Spark SQL.
• Good experience with SQL databases - able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations.
• Good grasp of ETL architecture: business-rules processing and data extraction from a data lake into data streams for business consumption.
• Good customer communication skills.
• Good analytical skills.
Bengaluru (Bangalore)
5 - 7 yrs
₹14.5L - ₹16.5L / yr
Data Science
Data scientist
Data Analytics
Machine Learning (ML)
Python
+2 more
  • Actively engage with internal business teams to understand their challenges and deliver robust, data-driven solutions.
  • Work alongside global counterparts to solve data-intensive problems using standard analytical frameworks and tools.
  • Be encouraged and expected to innovate and be creative in your data analysis, problem-solving, and presentation of solutions.
  • Network and collaborate with a broad range of internal business units to define and deliver joint solutions.
  • Work alongside customers to leverage cutting-edge technology (machine learning, streaming analytics, and ‘real’ big data) to creatively solve problems and disrupt existing business models.

In this role, we are looking for:

  • A problem-solving mindset with the ability to understand business challenges and how to apply your analytics expertise to solve them.
  • The unique person who can present complex mathematical solutions in a simple manner that most will understand, including customers.
  • An individual excited by innovation and new technology, and eager to find ways to employ these innovations in practice.
  • A team mentality, empowered by the ability to work with a diverse set of individuals.

Basic Qualifications

  • A Bachelor’s degree in Data Science, Math, Statistics, Computer Science or related field with an emphasis on analytics.
  • 5+ years of professional experience in a data scientist/analyst role or similar.
  • Proficiency in your statistics/analytics/visualization tool of choice, preferably in the Microsoft Azure suite (including Azure ML Studio and Power BI) as well as R, Python, and SQL (see the toy example after this list).
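A toy example of the Python side of that workflow, using scikit-learn on synthetic data (not from the posting):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Synthetic data standing in for a real business dataset
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("holdout accuracy:", model.score(X_test, y_test))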

Preferred Qualifications

  • Excellent communication, organizational transformation, and leadership skills
  • Demonstrated excellence in Data Science, Business Analytics and Engineering

 

 

 

 

 

DemandMatrix
Posted by Harwinder Singh
Remote only
9 - 12 yrs
₹25L - ₹30L / yr
Big Data
PySpark
Apache Hadoop
Spark
Python
+3 more

Only a solid grounding in computer engineering, Unix, data structures and algorithms would enable you to meet this challenge.

7+ years of experience architecting, developing, releasing, and maintaining large-scale big data platforms on AWS or GCP

Understanding of how Big Data tech and NoSQL stores like MongoDB, HBase/HDFS, and Elasticsearch synergize to power applications in analytics, AI, and knowledge graphs

Understanding of how data processing models, data locality patterns, disk I/O, network I/O, and shuffling affect large-scale text processing - feature extraction, searching, etc.

Expertise with a variety of data processing systems, including streaming, event, and batch (Spark,  Hadoop/MapReduce)

5+ years proficiency in configuring and deploying applications on Linux-based systems

5+ years of experience with Spark - especially PySpark - for transforming large unstructured text data and creating highly optimized pipelines (see the sketch below)

Experience with RDBMS, ETL techniques and frameworks (Sqoop, Flume) and big data querying tools (Pig, Hive)

A stickler for world-class best practices: uncompromising on the quality of engineering, conversant with standards and reference architectures, and deep in the Unix philosophy, with an appreciation of big data design patterns, orthogonal code design, and functional computation models
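For a concrete flavour of the text transformation described above, a compact PySpark sketch computing token frequencies over a large corpus; the input path is hypothetical and an active SparkSession is assumed:

    from pyspark.sql import functions as F

    lines = spark.read.text("s3a://corpus/raw/*.txt")  # hypothetical corpus
    tokens = (lines
              .select(F.explode(F.split(F.lower(F.col("value")), r"\W+")).alias("token"))
              .where(F.col("token") != "")
              .groupBy("token").count()
              .orderBy(F.desc("count")))
    tokens.show(20)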