Data Engineer (Scala)

at Ganit Business Solutions

Posted by Vijitha VS
Remote only
4 - 7 yrs
₹10L - ₹30L / yr
Full time
Skills
Data Warehouse (DWH)
Informatica
ETL
Big Data
Scala
Hadoop
Apache Hive
PySpark
Spark

Job Description:

We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in both batch and live-stream formats, transformed large volumes of data daily, built data warehouses to store the transformed data, and integrated different visualization dashboards and applications with those data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.

Responsibilities:

  • Develop, test, and implement data solutions based on functional and non-functional business requirements.
  • Code in Scala and PySpark daily, on cloud as well as on-prem infrastructure.
  • Build data models to store the data in the most optimized manner.
  • Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Implement the ETL process and an optimal data pipeline architecture (a minimal PySpark sketch of such a pipeline follows this list).
  • Monitor performance and advise on any necessary infrastructure changes.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Proactively identify potential production issues and recommend and implement solutions.
  • Write quality code and build secure, highly available systems.
  • Create design documents that describe the functionality, capacity, architecture, and process.
  • Review peers' code and pipelines before they are deployed to production, for optimization issues and adherence to code standards.
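As referenced in the list above, here is a minimal sketch of a batch ETL step in PySpark. It is illustrative only: the paths, schema, and column names are hypothetical and not part of this posting.

```python
# Minimal batch ETL sketch in PySpark (illustrative; paths and columns are hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-orders-etl").getOrCreate()

# Extract: read a day's raw CSV drop from a hypothetical landing zone.
raw = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("/data/raw/orders/2024-01-01/")
)

# Transform: deduplicate, filter bad rows, derive a partition column.
clean = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet into the warehouse zone.
clean.write.mode("overwrite").partitionBy("order_date").parquet("/data/warehouse/orders/")

spark.stop()
```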

Skill Sets:

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and big data technologies.
  • Proficient understanding of distributed computing principles.
  • Experience working with batch-processing and real-time systems using open-source technologies such as NoSQL stores, Spark, Pig, Hive, and Apache Airflow.
  • Has implemented complex projects dealing with considerable data sizes (petabyte scale).
  • Optimization techniques (performance, scalability, monitoring, etc.).
  • Experience with integration of data from multiple data sources.
  • Experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Knowledge of various ETL techniques and frameworks, such as Flume.
  • Experience with messaging systems such as Kafka or RabbitMQ.
  • Creation of DAGs for data engineering (see the Airflow sketch after this list).
  • Expert at Python/Scala programming, especially for data engineering and ETL purposes.
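For the DAG-creation skill above, a minimal Airflow DAG sketch is shown below. The DAG id, schedule, and task callables are hypothetical stand-ins for real extract/transform/load steps, not part of this posting.

```python
# Minimal Airflow DAG sketch (illustrative; dag_id, schedule, and callables are hypothetical).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull data from source systems")


def transform():
    print("run the Spark job / apply business rules")


def load():
    print("load curated data into the warehouse")


with DAG(
    dag_id="daily_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load
```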

 

 

 

About Ganit Business Solutions

Founded: 2017
Type: Products & Services
Size: 100-1000 employees
Stage: Bootstrapped

Similar jobs

Data Engineer

at Impact Guru

Founded 2014  •  Products & Services  •  100-1000 employees  •  Raised funding
ETL
Informatica
Data Warehouse (DWH)
SQL
Python
Big Data
Mumbai
1 - 4 yrs
₹4L - ₹8L / yr

Job Responsibilities: 

 

  • Develop highly reliable web crawlers and parsers across various websites
  • Extract structured/unstructured data and store it in SQL/NoSQL databases
  • Work closely with the Product/Research/Technology teams to provide data for analysis
  • Develop frameworks for automating and maintaining a constant flow of data from multiple sources
  • Develop and maintain data pipelines for batch/incremental as well as real-time requirements
  • Develop a deep understanding of the data sources on the web and know exactly how, when, and which data to parse and store
  • Create a monitoring framework to identify anomalies in web crawlers and resolve contingencies
  • Implement best practices in-house to detect/prevent crawlers on internal systems and websites
  • Write and run queries on large datasets to support analytics-team or data-sharing requirements
  • Deal well with ambiguity, prioritize needs, and deliver results in a dynamic environment

 

Must-Have:

 

  • Proficient in Python, with excellent knowledge of web crawling in Python using Scrapy, BeautifulSoup, urllib, Selenium, WebHarvest, etc. (a minimal crawler sketch follows this list)
  • Experience in data parsing and an understanding of document structure in HTML – CSS/DOM/XPath; knowledge of JS would be a plus
  • Strong experience in data parsing
  • Experience working with large datasets, querying terabytes of data on a regular basis – proficient in SQL
  • Able to develop reusable, code-based crawlers that are easy to modify/transform
  • Proficient in Git, with a good understanding of launching instances and setting up crawlers on AWS/Azure
  • Understands detailed requirements and demonstrates excellent problem-solving skills
  • Strong sense of ownership, drive, and ability to deliver results
  • A track record of digging into tough problems/challenges and bringing innovative approaches to solving them; must be highly capable of self-teaching new techniques
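As referenced in the list above, here is a minimal crawl-and-parse sketch using requests, BeautifulSoup, and SQLite. The URL, CSS selector, and table schema are hypothetical; a production crawler would add politeness (robots.txt, rate limiting), retries, and monitoring.

```python
# Minimal crawl-and-parse sketch (illustrative; URL, selector, and schema are hypothetical).
import sqlite3

import requests
from bs4 import BeautifulSoup

URL = "https://example.com/listings"  # hypothetical target page

resp = requests.get(URL, headers={"User-Agent": "demo-crawler/0.1"}, timeout=10)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
rows = [
    (a.get_text(strip=True), a.get("href"))
    for a in soup.select("a.listing-title")  # hypothetical selector
]

conn = sqlite3.connect("listings.db")
conn.execute("CREATE TABLE IF NOT EXISTS listings (title TEXT, url TEXT)")
conn.executemany("INSERT INTO listings VALUES (?, ?)", rows)
conn.commit()
conn.close()
```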

B.E/B.Tech in Computer Science / IT, BCA, B.Sc in Computer Science / IT

Job posted by
Fahad Kazi
Snowflake
Snowflake schema
ETL
Informatica
Data Warehouse (DWH)
Stored Procedures
Cloud Computing
Python
Linux/Unix
Snowpipe
Bengaluru (Bangalore), Hyderabad, Pune
3 - 7 yrs
₹10L - ₹25L / yr

Greetings of the day.

 

Hiring: Snowflake Developer at Accion Labs (permanent), Hyderabad, Pune, Bangalore

Role: Snowflake developer (project at an IBM office)

Experience: 4+ years

Locations: Hyderabad, Pune, Bangalore (work from office only)

Maximum salary: ₹25 LPA for the highest experience level

Note 1: Candidates should work from any of the IBM offices in Bengaluru, Hyderabad, or Pune, based on their preference (no WFH)

 

 

Job details:

4+ years of overall experience.

Should have 2+ years of experience in Snowflake, with good experience in stored procedures and in loading a variety of data using Snowpipe/integrations (a hedged loading sketch follows below).

Should have a data warehousing background.

Exposure to cloud, Python, and Unix.
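As a rough illustration of the kind of loading mentioned above, here is a minimal stage-based COPY using the Python Snowflake connector. The account, credentials, stage, and table names are hypothetical placeholders; Snowpipe itself is configured on the Snowflake side and is not shown here.

```python
# Minimal Snowflake bulk-load sketch (illustrative; account, stage, and table names are hypothetical).
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345",   # hypothetical account identifier
    user="ETL_USER",
    password="***",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # Load CSV files already landed in a stage into a staging table.
    cur.execute(
        """
        COPY INTO STAGING.ORDERS
        FROM @RAW_STAGE/orders/
        FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
        ON_ERROR = 'CONTINUE'
        """
    )
    print(cur.fetchall())  # per-file load results
finally:
    conn.close()
```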

 

Job posted by
Anuradha K
PySpark
Data engineering
Big Data
Hadoop
Spark
SQL
Python
Microsoft SQL Server DBA
ELT
Remote only
7 - 13 yrs
₹15L - ₹35L / yr
Experience Range: 2 - 10 years
Function: Information Technology
Desired Skills
Must-Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL (a short sketch follows this section)
• Good experience with SQL databases; able to write queries of fair complexity
• Excellent experience in Big Data programming for data transformations and aggregations
• Good at ELT architecture: business-rules processing and data extraction from the data lake into data streams for business consumption
• Good customer communication skills
• Good analytical skills
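As a short illustration of the DataFrame-plus-Spark-SQL combination asked for above (the path and columns are hypothetical):

```python
# DataFrame API and Spark SQL side by side (illustrative; path and columns are hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("df-and-sql-sketch").getOrCreate()

sales = spark.read.parquet("/data/lake/sales/")  # hypothetical dataset

# DataFrame core functions: filter, aggregate, sort.
top_by_df = (
    sales.filter(F.col("status") == "COMPLETE")
         .groupBy("region")
         .agg(F.sum("amount").alias("revenue"))
         .orderBy(F.desc("revenue"))
)

# The same query expressed in Spark SQL against a temp view.
sales.createOrReplaceTempView("sales")
top_by_sql = spark.sql("""
    SELECT region, SUM(amount) AS revenue
    FROM sales
    WHERE status = 'COMPLETE'
    GROUP BY region
    ORDER BY revenue DESC
""")

top_by_df.show()
top_by_sql.show()
spark.stop()
```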
Education
Education Type: Engineering
Degree/Diploma: Bachelor of Engineering, Bachelor of Computer Applications, Any Engineering
Specialization/Subject: Any Specialisation
Job Type: Full Time
Job ID: 000018
Department: Software Development
Job posted by
Minakshi Kumari
Teradata
Teradata DBA
Data Warehouse (DWH)
Teradata SQL Assistant
Teradata Warehouse Miner
Teradici
Amazon Web Services (AWS)
Amazon EC2
Amazon S3
Linux/Unix
Product Lifecycle Management (PLM)
Database Design
Oracle RMAN
OEM
Oracle CC&B
BDA
Remote, Mumbai, Bengaluru (Bangalore)
8 - 14 yrs
₹9L - ₹19L / yr

Role: Teradata Lead

Band: C2

Experience level: Minimum 10 years

Job Description:

This role leads DBA teams, with DBAs at multiple experience levels, across a mix of Teradata, Oracle, and SQL.

Skill Set:

Minimum 10 years of relevant database and data warehouse experience.

Hands-on experience administering Teradata.

Leading performance analysis and capacity planning, and supporting batch operations and users with their jobs.

Driving implementation of standards and best practices to optimize database utilization and availability.

Hands-on with AWS cloud infrastructure services such as EC2, S3, and networking services.

Proficient in Linux system administration relevant to Teradata management.

 

Teradata Specific (Mandatory)

Manage and operate 24x7 production as well as development databases to ensure maximum availability of system resources.

Responsible for the operational activities of a database administrator, such as system monitoring, user management, space management, troubleshooting, and batch/user support.

Perform DBA tasks in the key areas of performance management and reporting, and workload management using TASM.

Manage production/development databases in areas such as capacity planning, performance monitoring and tuning, defining backup/recovery strategies, and space/user/security management, along with problem determination and resolution.

Experience with Teradata workload management and monitoring, and query optimization.

Expertise in system monitoring using Viewpoint and logs.

Proficient in analysing performance and optimizing at different levels.

Ability to create advanced system-level capacity reports as well as root-cause analyses.

 

Oracle Specific (Optional)

Database administration: installation of Oracle software on Unix/Linux platforms.

Database lifecycle management: database creation, setup, and decommissioning.

Database event/alert monitoring, space management, and user management.

Database upgrades, migrations, and cloning.

Database backup, restore, and recovery using RMAN.

Setup and maintenance of high-availability and disaster-recovery solutions.

Proficient in Standby and Data Guard technology.

Hands-on with OEM CC.

 

Mandatory Certification:

  • Teradata Vantage Certified Administrator
  • ITIL Foundation
Job posted by
Sanjay Biswakarma

BigData Developer (Spark+Python)

at Simplifai Cognitive Solutions Pvt Ltd

Founded 2017  •  Product  •  100-500 employees  •  Bootstrapped
Spark
Big Data
Apache Spark
Python
PySpark
Hadoop
Pune
2 - 15 yrs
₹10L - ₹30L / yr

We are looking for a skilled Senior/Lead Big Data Engineer to join our team. The role is part of the research and development team, where, with enthusiasm and knowledge, you will be our technical evangelist for the development of our inspection technology and products.

 

At Elop we are developing product lines for sustainable infrastructure management using our own patented technology for ultrasound scanners, combining this with other sources to build a holistic overview of the concrete structure. At Elop we will provide you with world-class colleagues who are highly motivated to position the company as an international standard in structural health monitoring. With the right character, you will be professionally challenged and developed.

This position requires travel to Norway.

 

Elop is a sister company of Simplifai, and the two are co-located in all geographic locations.

https://elop.no/

https://www.simplifai.ai/en/


Roles and Responsibilities

  • Define technical scope and objectives through research and participation in requirements gathering and definition of processes
  • Ingest and process data from data sources (Elop Scanner) in raw format into the Big Data ecosystem
  • Real-time data-feed processing using the Big Data ecosystem (a minimal streaming sketch follows this list)
  • Design, review, implement, and optimize data transformation processes in the Big Data ecosystem
  • Test and prototype new data integration/processing tools, techniques, and methodologies
  • Conversion of MATLAB code into Python/C/C++
  • Participate in overall test planning for application integrations, functional areas, and projects
  • Work with cross-functional teams in an Agile/Scrum environment to ensure a quality product is delivered
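For the real-time feed item above, a minimal Spark Structured Streaming sketch that reads from Kafka is shown below. The broker, topic, schema, and sink paths are hypothetical, and the job assumes the spark-sql-kafka connector is on the classpath; the actual Elop pipeline will differ.

```python
# Minimal streaming sketch (illustrative; broker, topic, schema, and paths are hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("scanner-stream-sketch").getOrCreate()

schema = StructType([
    StructField("scanner_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "scanner-readings")
         .load()
)

parsed = (
    raw.selectExpr("CAST(value AS STRING) AS json")
       .select(F.from_json("json", schema).alias("r"))
       .select("r.*")
)

query = (
    parsed.writeStream
          .format("parquet")
          .option("path", "/data/stream/scanner/")
          .option("checkpointLocation", "/data/stream/_checkpoints/scanner/")
          .outputMode("append")
          .start()
)

query.awaitTermination()
```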

Desired Candidate Profile

  • Bachelor's degree in Statistics, Computer Science, or equivalent
  • 7+ years of experience in the Big Data ecosystem, especially Spark, Kafka, Hadoop, and HBase
  • 7+ years of hands-on experience in Python/Scala is a must
  • Experience in architecting big data applications is needed
  • Excellent analytical and problem-solving skills
  • Strong understanding of data analytics and data visualization; must be able to help the development team with visualization of data
  • Experience with signal processing is a plus
  • Experience working on client-server architectures is a plus
  • Knowledge of database technologies such as RDBMS, graph DBs, document DBs, Apache Cassandra, and OpenTSDB
  • Good communication skills, written and oral, in English

We can offer

  • An everyday life with exciting and challenging tasks, developing socially beneficial solutions
  • Being part of the company's research and development team to create unique and innovative products
  • Colleagues with world-class expertise, and an organization that has ambitions and is highly motivated to position the company as an international player in maintenance support and monitoring of critical infrastructure
  • A good working environment with skilled and committed colleagues, and an organization with short decision paths
  • Professional challenges and development
Job posted by
Priyanka Malani
Spark
Big Data
Data engineering
Hadoop
Apache Kafka
Amazon Web Services (AWS)
Amazon S3
SQL Server
Python
Java
Bengaluru (Bangalore)
3 - 5 yrs
₹6L - ₹12L / yr
Data Engineer

• Drive the data engineering implementation
• Strong experience in building data pipelines
• AWS stack experience is a must
• Deliver conceptual, logical, and physical data models for the implementation teams
• SQL stronghold is a must: advanced SQL working knowledge and experience working with a variety of relational databases, including SQL query authoring
• AWS cloud data pipeline experience is a must: data pipelines and data-centric applications using distributed storage platforms like S3 and distributed processing platforms like Spark, Airflow, and Kafka (a minimal S3-backed Spark sketch follows this list)
• Working knowledge of AWS technologies such as S3, EC2, EMR, RDS, Lambda, and Elasticsearch
• Ability to use a major programming language (e.g. Python/Java) to process data for modelling
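As referenced in the list above, here is a minimal sketch of an S3-backed Spark step. Bucket names, prefixes, and columns are hypothetical, and credentials are assumed to come from the EMR instance profile or environment rather than being hard-coded.

```python
# Minimal S3-backed Spark step (illustrative; buckets, prefixes, and columns are hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On EMR, S3 access typically comes from the instance profile; no keys in code.
spark = SparkSession.builder.appName("s3-pipeline-sketch").getOrCreate()

events = spark.read.json("s3a://example-raw-bucket/events/2024/01/01/")

daily = (
    events.withColumn("event_date", F.to_date("event_ts"))
          .groupBy("event_date", "event_type")
          .agg(F.count("*").alias("events"))
)

daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3a://example-curated-bucket/daily_event_counts/"
)

spark.stop()
```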
Job posted by
geeti gaurav mohanty
Spark
Python
SQL
Bengaluru (Bangalore)
2 - 5 yrs
₹7L - ₹12L / yr
Primary Responsibilities:
• Develop and maintain applications with PySpark
• Contribute to the overall design and architecture of the applications developed and deployed
• Performance tuning with respect to executor sizing and other environment parameters, code optimization, partition tuning, etc. (a short tuning sketch follows this list)
• Interact with business users to understand requirements and troubleshoot issues
• Implement projects based on functional specifications
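A few of the tuning levers mentioned above, sketched in PySpark. The values and column names are hypothetical and would normally be chosen from the actual data volumes and cluster size, not copied as-is.

```python
# Illustrative tuning levers (values and columns are hypothetical; tune against real workloads).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    # Executor sizing is usually set via spark-submit flags; shown here for illustration.
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "4")
    .getOrCreate()
)

# Shuffle parallelism: raise or lower depending on data volume.
spark.conf.set("spark.sql.shuffle.partitions", "200")

facts = spark.read.parquet("/data/lake/facts/")
dims = spark.read.parquet("/data/lake/dims/")

# Broadcast the small side of a join to avoid a shuffle.
joined = facts.join(F.broadcast(dims), "dim_id")

# Repartition by the write key so output files align with partitions, and cache for reuse.
result = joined.repartition("load_date").cache()
result.write.mode("overwrite").partitionBy("load_date").parquet("/data/curated/facts/")

spark.stop()
```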


Must-Have Skills:

• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good customer communication skills
• Good analytical skills
Job posted by
geeti gaurav mohanty

Data Engineer (Databricks)

at A Global IT Service company

Agency job
via Multi Recruit
Data engineering
Databricks
data engineer
PySpark
ETL
Azure Databricks
SSIS
Azure Data Factory
Bengaluru (Bangalore)
5 - 8 yrs
₹20L - ₹30L / yr
  • Insurance P&C and Specialty domain experience is a plus
  • Experience with cloud-based architectures preferred, such as Databricks, Azure Data Lake, Azure Data Factory, etc.
  • Strong understanding of ETL fundamentals and solutions; should be proficient in writing advanced/complex SQL, with expertise in performance tuning and optimization of SQL queries
  • Strong experience in Python/PySpark and Spark SQL
  • Experience in troubleshooting data issues, analyzing end-to-end data pipelines, and working with various teams to resolve issues and solve complex problems
  • Strong experience developing Spark applications using PySpark and SQL for data extraction, transformation, and aggregation across multiple formats, analyzing and transforming the data to uncover insights and actionable intelligence for internal and external use (a short multi-format sketch follows this list)
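A rough sketch of the multi-format extract/transform/aggregate pattern described above. Paths, formats, and columns are hypothetical; on Databricks the final write would often target Delta rather than plain Parquet.

```python
# Multi-format extract and aggregate sketch (illustrative; paths and columns are hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("multi-format-sketch").getOrCreate()

# Extract from two different source formats.
policies = spark.read.option("header", True).csv("/mnt/raw/policies/")
claims = spark.read.json("/mnt/raw/claims/")

# Transform and aggregate: claim totals per policy, joined back to policy attributes.
claim_totals = (
    claims.groupBy("policy_id")
          .agg(F.count("*").alias("claim_count"),
               F.sum("claim_amount").alias("claim_total"))
)

summary = (
    policies.join(claim_totals, "policy_id", "left")
            .fillna({"claim_count": 0, "claim_total": 0.0})
)

# Load: Parquet here; on Databricks this would typically be .format("delta").
summary.write.mode("overwrite").parquet("/mnt/curated/policy_claim_summary/")

spark.stop()
```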
Job posted by
Manjunath Multirecruit

Scala Spark Engineer

at Skanda Projects

Founded 2010  •  Services  •  100-1000 employees  •  Profitable
Scala
Apache Spark
Big Data
Bengaluru (Bangalore)
2 - 8 yrs
₹6L - ₹25L / yr
Preferred Skills:

  • Minimum 3 years of experience in software development
  • Strong experience in Spark/Scala development
  • Strong experience with AWS cloud platform services
  • Good knowledge of and exposure to Amazon EMR and EC2
  • Good with databases such as DynamoDB and Snowflake
Job posted by
Nagraj Kumar

Hadoop Developer

at Accion Labs

Founded 2009  •  Products & Services  •  100-1000 employees  •  Profitable
HDFS
HBase
Spark
Flume
Hive
Sqoop
Scala
Mumbai
5 - 14 yrs
₹8L - ₹18L / yr
US-based multinational company. Hands-on Hadoop.
Job posted by
Neha Mayekar