Big Data

at Virtusa

Agency job
Hyderabad, Pune, Chennai, Bengaluru (Bangalore), Mumbai
5 - 6 yrs
₹18L - ₹25L / yr
Full time
Skills
Big Data
Big Data Engineer
Spark
Apache Spark
Scala
Apache Hive
Apache HBase

 

Experience: 5-6+ years

 

 

Must Have

 

  • Apache Spark, Spark Streaming, Scala programming, Apache HBase
  • Unix scripting, SQL knowledge

Good to Have

  • Experience working with graph databases, preferably JanusGraph
  • Experience working with document databases and Apache Solr

 

Job Description

Data Engineer with experience in the following areas:

 

  • Designing and implementing high-performance data ingestion pipelines from multiple sources using Scala and Apache Spark (a minimal sketch follows this list).
  • Experience with event-based Spark Streaming technologies to ingest data.
  • Developing scalable and reusable frameworks for ingesting data sets.
  • Integrating end-to-end data pipelines to take data from source systems to target data repositories, ensuring data quality and consistency are maintained at all times.
  • Preference for Big Data certifications such as Cloudera Certified Professional (CCP) and Cloudera Certified Associate (CCA).
  • Working within an Agile delivery methodology to deliver product implementations in iterative sprints.
  • Strong knowledge of data management principles.
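
To make the first two bullets concrete, here is a minimal sketch of the kind of event-based ingestion job the role describes, written in Scala with Spark Structured Streaming. The Kafka brokers, topic, event schema, and output paths are assumptions made for illustration only, not details from the posting; in the stack this listing names, the sink would more likely be Hive or HBase than plain Parquet.

```scala
// Minimal sketch of a streaming ingestion pipeline (assumes the spark-sql-kafka
// connector is on the classpath). All names and paths below are placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object EventIngestion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-ingestion")
      .getOrCreate()

    // Assumed event schema; a real pipeline would derive this from the source contract.
    val eventSchema = new StructType()
      .add("eventId", StringType)
      .add("eventType", StringType)
      .add("payload", StringType)
      .add("eventTime", TimestampType)

    // Read the raw event stream from Kafka (placeholder brokers and topic).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()

    // Parse the Kafka value as JSON and keep only well-formed events.
    val events = raw
      .select(from_json(col("value").cast("string"), eventSchema).as("e"))
      .select("e.*")
      .filter(col("eventId").isNotNull)

    // Land the parsed events as partitioned Parquet with checkpointing;
    // a Hive or HBase sink would replace this in the stack the posting describes.
    events.writeStream
      .format("parquet")
      .option("path", "/data/landing/events")
      .option("checkpointLocation", "/data/checkpoints/events")
      .partitionBy("eventType")
      .start()
      .awaitTermination()
  }
}
```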

 

Location: PAN India


About Virtusa

Virtusa helps clients change, disrupt, and unlock new value that surpasses their wildest expectations: not just to reach our best, but to redefine yours.
Founded 1996  •  Services  •  100-1000 employees  •  Profitable

Similar jobs

Data Engineer

at a company building a cutting-edge data science department to serve the older adult community and marketplace

Agency job
via HyrHub
Big Data
Hadoop
Apache Hive
Data Warehouse (DWH)
PySpark
Cloud Computing
Chandigarh
5 - 8 yrs
₹8L - ₹15L / yr

We are currently seeking talented and highly motivated Data Engineers to lead the development of our discovery and support platform. The successful candidate will join a small, global team of data-focused associates that has successfully built and maintained a best-in-class traditional, Kimball-based data warehouse founded on SQL Server. The candidate will lead the conversion of the existing data structure into an AWS-focused big data framework and assist in identifying and pipelining existing and augmented data sets into this environment. They must be able to lead and assist in architecting and constructing the AWS foundation and initial data ports.

 

Specific responsibilities will be to:

  • Lead and assist in designing, deploying, and maintaining robust methods for data management and analysis, primarily using the AWS cloud
  • Develop computational methods for integrating multiple data sources to facilitate target and algorithmic
  • Provide computational tools to ensure trustworthy data sources and facilitate reproducible
  • Provide leadership around architecting, designing, and building the target AWS data environment (e.g., data lake and data warehouse)
  • Work with on-staff subject-matter experts to evaluate existing data sources, the DW, ETL ports, existing stovepipe data sources, and available augmentation data sets
  • Implement methods for execution of high-throughput assays and subsequent acquisition, management, and analysis of the
  • Assist in the communication of complex scientific, software, and data concepts and
  • Assist in the identification and hiring of additional data engineer associates

Job Requirements:

  • Master’s degree (or equivalent experience) in computer science, data science, or a scientific field relevant to healthcare in the United States
  • Extensive experience with a high-level programming language (e.g., Python or Scala) and relevant AWS services
  • Experience with AWS cloud services such as S3, Glue, Lake Formation, Athena, and others
  • Experience creating and managing data lakes and data warehouses
  • Experience with big data tools like Hadoop, Hive, Talend, Apache Spark, and Kafka
  • Advanced SQL scripting
  • Database management systems (for example, Oracle, MySQL, or MS SQL Server)
  • Hands-on experience with data transformation tools, data processing, and data modeling in a big data environment
  • Understanding of the basics of distributed systems
  • Experience working and communicating with subject matter experts
  • The ability to work independently as well as to collaborate on multidisciplinary, global teams in a startup fashion with traditional data-warehouse-skilled data associates and business teams unfamiliar with data science techniques
  • Strong communication, data presentation, and visualization skills
Job posted by
Shwetha Naik

Sr Software Engineer - Python

at Energy Exemplar

Founded 1999  •  Product  •  100-500 employees  •  Profitable
Spark
Hadoop
Big Data
Data engineering
PySpark
Apache Spark
Web Scraping
Pune
6 - 8 yrs
₹15L - ₹22L / yr
Greetings!!

The Energy Exemplar (EE) data team is looking for an experienced Python Developer (Data Engineer) to join our Pune office. As a dedicated Data Engineer on our Research team, you will apply data engineering expertise, work very closely with the core data team to identify different data sources for specific energy markets and create an automated data pipeline. The pipeline will then incrementally pull the data from its sources and maintain a dataset, which in turn provides tremendous value to hundreds of EE customers.

 

At EE, you’ll have access to vast amounts of energy-related data from our sources. Our data pipelines are curated and supported by engineering teams. We also offer many company-sponsored classes and conferences that focus on data engineering and data platforms. There’s a great growth opportunity for data engineering at EE.

Responsibilities

  • Develop, test, and maintain architectures such as databases and large-scale processing systems using high-performance data pipelines.
  • Recommend and implement ways to improve data reliability, efficiency, and quality.
  • Identify performant features and make them universally accessible to our teams across EE.
  • Work together with data analysts and data scientists to wrangle the data and provide quality datasets and insights for business-critical decisions.
  • Take end-to-end responsibility for the development, quality, testing, and production readiness of the services you build.
  • Define and evangelize data engineering best standards and practices to ensure engineering excellence at every stage of the development cycle.
  • Act as a resident expert for data engineering, feature engineering, and exploratory data analysis.
  • Experience with Agile methodologies; acting as Scrum Master would be an added plus.

Qualifications

  • 6+ years of professional experience developing data pipelines for large-scale, complex datasets from a variety of data sources.
  • Data engineering expertise with strong experience in Python, Beautiful Soup, Selenium, regular expressions, and web scraping.
  • Best practices in Python development: docstrings, type hints, unit testing, etc.
  • Experience with cloud-based data technologies such as Azure Data Lake, Azure Data Factory, and Azure Databricks is desirable.
  • Moderate coding skills. SQL or similar required. C# or other languages strongly preferred.
  • Outstanding communication and collaboration skills. You can learn from and teach others.
  • Strong drive for results. You have a proven record of shepherding experiments to create successful shipping products/services.
  • A Bachelor’s or Master’s degree in Computer Science or Engineering with coursework in Python, Big Data, or Data Engineering is highly desirable.
Job posted by
Pratibha Shukla

Data Engineer

at Slintel

Agency job
via Qrata
Big Data
ETL
Apache Spark
Spark
Data engineer
Data engineering
Linux/Unix
MySQL
Python
Amazon Web Services (AWS)
Bengaluru (Bangalore)
4 - 9 yrs
₹20L - ₹28L / yr
Responsibilities
  • Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse.
  • Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs.
  • Assemble large, complex data sets from third-party vendors to meet business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology.
  • Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems.

Requirements
  • 5+ years of experience in a Data Engineer role.
  • Proficiency in Linux.
  • Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena.
  • Must have experience with Python/Scala.
  • Must have experience with Big Data technologies like Apache Spark.
  • Must have experience with Apache Airflow.
  • Experience with data pipeline and ETL tools like AWS Glue.
  • Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
Job posted by
Prajakta Kulkarni

Data Engineer_Scala

at Ganit Business Solutions

Founded 2017  •  Products & Services  •  100-1000 employees  •  Bootstrapped
ETL
Informatica
Data Warehouse (DWH)
Big Data
Scala
Hadoop
Apache Hive
PySpark
Spark
Remote only
4 - 7 yrs
₹10L - ₹30L / yr

Job Description:

We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large volumes of data daily, built a data warehouse to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.

Responsibilities:

  • Develop, test, and implement data solutions based on functional and non-functional business requirements.
  • You would be required to code in Scala and PySpark daily, on cloud as well as on-prem infrastructure (a minimal Scala sketch follows this list).
  • Build data models to store the data in the most optimized manner.
  • Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Implement the ETL process and optimal data pipeline architecture.
  • Monitor performance and advise on any necessary infrastructure changes.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Proactively identify potential production issues and recommend and implement solutions.
  • Must be able to write quality code and build secure, highly available systems.
  • Create design documents that describe the functionality, capacity, architecture, and process.
  • Review peer code and pipelines before deploying to production, checking for optimization issues and code standards.
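
As a rough illustration of the Scala/Spark batch work mentioned above, here is a minimal ETL sketch. The source path, column names, and partitioning scheme are hypothetical placeholders chosen for the example, not requirements from this posting.

```scala
// Minimal batch ETL sketch in Scala/Spark. Paths, columns, and the
// partitioning column are illustrative assumptions only.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyBatchEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-batch-etl")
      .getOrCreate()

    // Extract: read the day's raw drop (placeholder path and format).
    val raw = spark.read
      .option("header", "true")
      .csv("/raw/orders/2024-01-01")

    // Transform: basic cleansing and typing before modelling the data.
    val cleaned = raw
      .filter(col("order_id").isNotNull)
      .withColumn("order_ts", to_timestamp(col("order_ts")))
      .withColumn("amount", col("amount").cast("double"))
      .withColumn("order_date", to_date(col("order_ts")))

    // Load: write an optimized, partitioned table the warehouse layer can query.
    cleaned.write
      .mode("overwrite")
      .partitionBy("order_date")
      .parquet("/warehouse/orders")

    spark.stop()
  }
}
```

Partitioning the output by a date column is one common way to keep downstream warehouse queries cheap; the actual data model would depend on the engagement's requirements.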

Skill Sets:

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
  • Proficient understanding of distributed computing principles.
  • Experience working with batch-processing and real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, and Apache Airflow.
  • Implemented complex projects dealing with considerable data sizes (PB-scale).
  • Optimization techniques (performance, scalability, monitoring, etc.).
  • Experience with integration of data from multiple data sources.
  • Experience with NoSQL databases such as HBase, Cassandra, MongoDB, etc.
  • Knowledge of various ETL techniques and frameworks, such as Flume.
  • Experience with various messaging systems, such as Kafka or RabbitMQ.
  • Creation of DAGs for data engineering.
  • Expert at Python/Scala programming, especially for data engineering/ETL purposes.

 

 

 

Job posted by
Vijitha VS

Data Scientist

at Energy Exemplar

Founded 1999  •  Product  •  100-500 employees  •  Profitable
Data Science
Machine Learning (ML)
Computer Vision
Forecasting
Python
Apache Hive
Spark
SQL
Bengaluru (Bangalore)
5 - 10 yrs
Best in industry

Qualifications

  • 5+ years of professional experience in experiment design and applied machine learning, predicting outcomes in large-scale, complex datasets.
  • Proficiency in Python, Azure ML, or other statistics/ML tools.
  • Proficiency in deep neural networks and Python-based frameworks.
  • Proficiency in Azure Databricks, Hive, and Spark.
  • Proficiency in deploying models into production (Azure stack).
  • Moderate coding skills. SQL or similar required. C# or other languages strongly preferred.
  • Outstanding communication and collaboration skills. You can learn from and teach others.
  • Strong drive for results. You have a proven record of shepherding experiments to create successful shipping products/services.
  • Experience with prediction in adversarial (energy) environments is highly desirable.
  • Understanding of the model development ecosystem across platforms, including development, distribution, and best practices, is highly desirable.


As a dedicated Data Scientist on our Research team, you will apply data science and your machine learning expertise to enhance our intelligent systems to predict and provide proactive advice. You’ll work with the team to identify and build features, create experiments, vet ML models, and ship successful models that provide value additions for hundreds of EE customers.

At EE, you’ll have access to vast amounts of energy-related data from our sources. Our data pipelines are curated and supported by engineering teams (so you won't have to do much data engineering; you get to do the fun stuff). We also offer many company-sponsored classes and conferences that focus on data science and ML. There’s a great growth opportunity for data science at EE.

Job posted by
Payal Joshi

Big Data Engineer

at BDIPlus

Founded 2014  •  Product  •  100-500 employees  •  Profitable
Apache Hive
Spark
Scala
PySpark
Data engineering
Big Data
Hadoop
Java
Python
Remote only
2 - 6 yrs
₹6L - ₹20L / yr
We are looking for big data engineers to join our transformational consulting team serving one of our top US clients in the financial sector. You'd get the opportunity to develop big data pipelines and convert business requirements into production-grade services and products. With less emphasis on prescribing how to do a particular task, we believe in giving people the opportunity to think out of the box and come up with their own innovative solutions to problems.
You will primarily be developing, managing, and executing multiple prospect campaigns as part of the Prospect Marketing Journey to ensure the best conversion and retention rates. Below are the roles, responsibilities, and skill sets we are looking for; if these resonate with you, please get in touch with us by applying to this role.
Roles and Responsibilities:
  • You'd be responsible for the development and maintenance of applications built with Enterprise Java and distributed technologies.
  • You'd collaborate with developers, product managers, business analysts, and business users in conceptualizing, estimating, and developing new software applications and enhancements.
  • You'd assist in the definition, development, and documentation of software objectives, business requirements, deliverables, and specifications in collaboration with multiple cross-functional teams.
  • Assist in the design and implementation process for new products; research and create POCs for possible solutions.
Skillset:
  • Bachelor's or Master's degree in a technology-related field preferred.
  • Overall experience of 2-3 years with Big Data technologies.
  • Hands-on experience with Spark (Java/Scala).
  • Hands-on experience with Hive and shell scripting.
  • Knowledge of HBase and Elasticsearch.
  • Development experience in Java/Python is preferred.
  • Familiarity with profiling, code coverage, logging, common IDEs, and other development tools.
  • Demonstrated verbal and written communication skills, and ability to interface with Business, Analytics, and IT organizations.
  • Ability to work effectively in a short-cycle, team-oriented environment, managing multiple priorities and tasks.
  • Ability to identify non-obvious solutions to complex problems.
Job posted by
Puja Kumari

Data Engineer

at Big revolution in the e-gaming industry. (GK1)

Agency job
via Multi Recruit
Python
Scala
Hadoop
Spark
Data Engineer
Kafka
Luigi
Airflow
Nosql
Bengaluru (Bangalore)
2 - 3 yrs
₹15L - ₹20L / yr
  • We are looking for a Data Engineer to build the next-generation mobile applications for our world-class fintech product.
  • The candidate will be responsible for expanding and optimising our data and data pipeline architecture, as well as optimising data flow and collection for cross-functional teams.
  • The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimising data systems and building them from the ground up.
  • We are looking for a person with a strong ability to analyse data and provide valuable insights to the product and business teams to solve daily business problems.
  • You should be able to work in a high-volume environment and have outstanding planning and organisational skills.

 

Qualifications for Data Engineer

 

  • Working SQL knowledge and experience working with relational databases and query authoring (SQL), as well as working familiarity with a variety of databases.
  • Experience building and optimising ‘big data’ data pipelines, architectures, and data sets.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets; ability to build processes supporting data transformation, data structures, metadata, dependency, and workload management.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • Looking for a candidate with 2-3 years of experience in a Data Engineer role who is a CS graduate or has equivalent experience.

 

What we're looking for

 

  • Experience with big data tools: Hadoop, Spark, Kafka, and other alternative tools.
  • Experience with relational SQL and NoSQL databases, including MySQL/Postgres and MongoDB.
  • Experience with data pipeline and workflow management tools: Luigi, Airflow.
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
  • Experience with stream-processing systems: Storm, Spark Streaming.
  • Experience with object-oriented/object function scripting languages: Python, Java, Scala.
Job posted by
Ayub Pasha

Data Engineer

at Mobile Programming LLC

Founded 1998  •  Services  •  100-1000 employees  •  Profitable
Big Data
Amazon Web Services (AWS)
Hadoop
SQL
Python
Scala
Linux/Unix
SQL server
Apache Hive
Spark
Remote, Chennai
3 - 7 yrs
₹12L - ₹18L / yr
Position: Data Engineer  
Location: Chennai - Guindy Industrial Estate
Duration: Full time role
Company: Mobile Programming (https://www.mobileprogramming.com/)
Client Name: Samsung 


We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.

Responsibilities for Data Engineer
  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
  • Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer
  • Experience building and optimizing big data ETL pipelines, architectures, and data sets.
  • Advanced working SQL knowledge and experience working with relational databases and query authoring (SQL), as well as working familiarity with a variety of databases.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Ability to build processes supporting data transformation, data structures, metadata, dependency, and workload management.
  • A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • Strong project management and organizational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.

We are looking for a candidate with 3-6 years of experience in a Data Engineer role who has attained a graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field. They should also have experience using the following software/tools:
  • Experience with big data tools: Spark, Kafka, HBase, Hive, etc.
  • Experience with relational SQL and NoSQL databases.
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
  • Experience with stream-processing systems: Storm, Spark Streaming, etc.
  • Experience with object-oriented/object function scripting languages: Python, Java, Scala, etc.

Skills: Big Data, AWS, Hive, Spark, Python, SQL
 
Job posted by
vandana chauhan

Hadoop Developer

at Pion Global Solutions LTD

Founded 2016  •  Products & Services  •  20-100 employees  •  Bootstrapped
Spark
Big Data
Hadoop
HDFS
Apache Sqoop
Apache Flume
Apache HBase
Mumbai
3 - 100 yrs
₹4L - ₹15L / yr
Looking for Big Data developers in Mumbai.
Job posted by
Sheela P

Big Data Developer

at MediaMelon Inc

Founded 2008  •  Product  •  20-100 employees  •  Raised funding
Scala
Spark Streaming
Aerospike
Cassandra
Apache Kafka
Big Data
Elastic Search
Bengaluru (Bangalore)
1 - 7 yrs
₹0L / yr
  • Develop analytic tools, working on Big Data and distributed systems.
  • Provide technical leadership on developing our core analytics platform.
  • Lead development efforts on product features using Scala/Java.
  • Demonstrable excellence in innovation, problem solving, analytical skills, data structures, and design patterns.
  • Expert in building applications using Spark and Spark Streaming.
  • Exposure to NoSQL (HBase/Cassandra), Hive, Pig Latin, and Mahout.
  • Extensive experience with Hadoop and machine learning algorithms.
Job posted by
Katreddi Kiran Kumar