Experience: 5-6+ years
Must Have
Job Description: Data Engineer with experience in the following areas.
Location: PAN India
About Virtusa
Data Engineer
We are building a cutting-edge data science department to serve the older adult community and marketplace.
We are currently seeking talented and highly motivated Data Engineers to lead the development of our discovery and support platform. The successful candidate will join a small, global team of data-focused associates that has successfully built and maintained a best-in-class traditional data warehouse, Kimball-based and founded on SQL Server. The successful candidate will lead the conversion of the existing data structure into an AWS-focused big data framework and assist in identifying and pipelining existing and augmented data sets into this environment. The successful candidate must be able to lead and assist in architecting and constructing the AWS foundation and initial data ports.
Specific responsibilities will be to:
- Lead and assist in designing, deploying, and maintaining robust methods for data management and analysis, primarily using the AWS cloud.
- Develop computational methods for integrating multiple data sources.
- Provide computational tools to ensure trustworthy data sources and facilitate reproducible analysis.
- Provide leadership around architecting, designing, and building the target AWS data environment (e.g., data lake and data warehouse).
- Work with on-staff subject-matter experts to evaluate existing data sources, the data warehouse, ETL ports, existing stovepipe data sources, and available augmentation data sets.
- Implement methods for execution of high-throughput assays and the subsequent acquisition, management, and analysis of the resulting data.
- Assist in the communication of complex scientific, software, and data concepts and results.
- Assist in the identification and hiring of additional data engineer associates.
Job Requirements:
- Master’s Degree (or equivalent experience) in computer science, data science, or a scientific field relevant to healthcare in the United States.
- Extensive experience in the use of a high-level programming language (e.g., Python or Scala) and relevant AWS services.
- Experience in AWS cloud services like S3, Glue, Lake Formation, Athena, and others.
- Experience in creating and managing Data Lakes and Data Warehouses.
- Experience with big data tools like Hadoop, Hive, Talend, Apache Spark, Kafka.
- Advanced SQL scripting.
- Database Management Systems (for example, Oracle, MySQL or MS SQL Server)
- Hands-on experience with data transformation tools, data processing, and data modeling in a big data environment.
- Understanding of the basics of distributed systems.
- Experience working and communicating with subject-matter experts.
- The ability to work independently as well as to collaborate on multidisciplinary, global teams, in startup fashion, with traditional data-warehouse-skilled data associates and with business teams unfamiliar with data science techniques.
- Strong communication, data presentation, and visualization skills.
Sr Software Engineer - Python
The Energy Exemplar (EE) data team is looking for an experienced Python Developer (Data Engineer) to join our Pune office. As a dedicated Data Engineer on our Research team, you will apply data engineering expertise, work very closely with the core data team to identify different data sources for specific energy markets and create an automated data pipeline. The pipeline will then incrementally pull the data from its sources and maintain a dataset, which in turn provides tremendous value to hundreds of EE customers.
At EE, you’ll have access to vast amounts of energy-related data from our sources. Our data pipelines are curated and supported by engineering teams. We also offer many company-sponsored classes and conferences that focus on data engineering and data platforms. There’s great growth opportunity for data engineering at EE.
Responsibilities
- Develop, test and maintain architectures, such as databases and large-scale processing systems using high-performance data pipelines.
- Recommend and implement ways to improve data reliability, efficiency, and quality.
- Identify performant features and make them universally accessible to our teams across EE.
- Work together with data analysts and data scientists to wrangle the data and provide quality datasets and insights for business-critical decisions.
- Take end-to-end responsibility for the development, quality, testing, and production readiness of the services you build.
- Define and evangelize Data Engineering standards and best practices to ensure engineering excellence at every stage of the development cycle.
- Act as a resident expert for data engineering, feature engineering, and exploratory data analysis.
- Experience with Agile methodologies; acting as Scrum Master would be an added plus.
Qualifications
- 6+ years of professional experience in developing data pipelines for large-scale, complex datasets from varieties of data sources.
- Data Engineering expertise with strong experience working with Python, Beautiful Soup, Selenium, regular expressions, and web scraping.
- Best practices in Python development: docstrings, type hints, unit testing, etc.
- Experience working with cloud-based data technologies such as Azure Data Lake, Azure Data Factory, and Azure Databricks is desirable.
- Moderate coding skills. SQL or similar required. C# or other languages strongly preferred.
- Outstanding communication and collaboration skills. You can learn from and teach others.
- Strong drive for results. You have a proven record of shepherding experiments to create successful shipping products/services.
- A Bachelor's or Master's degree in Computer Science or Engineering with coursework in Python, Big Data, and Data Engineering is highly desirable.
- Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse.
- Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs.
- Assemble large, complex data sets from third-party vendors to meet business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology.
- Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems.
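The scraping stack named above (Beautiful Soup, Selenium, regular expressions) generally comes down to pulling structured values out of markup. Below is a minimal sketch of that idea using only the Python standard library's `html.parser` instead of Beautiful Soup, so it is self-contained; the `td class="price"` markup and the `PriceParser` name are invented for illustration:

```python
from html.parser import HTMLParser

class PriceParser(HTMLParser):
    """Collects the text inside <td class="price"> cells."""
    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "td" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(float(data))

html = '<table><tr><td class="price">42.5</td><td class="price">17.0</td></tr></table>'
parser = PriceParser()
parser.feed(html)
print(parser.prices)  # [42.5, 17.0]
```

In a production pipeline the same pattern would sit behind a fetch step (requests or Selenium for JavaScript-heavy pages), with the parsed rows landed into the warehouse tables described above.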
Requirements
- 5+ years of experience in a Data Engineer role.
- Proficiency in Linux.
- Must have SQL knowledge and experience working with relational databases, query authoring (SQL), as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena.
- Must have experience with Python/Scala.
- Must have experience with Big Data technologies like Apache Spark.
- Must have experience with Apache Airflow.
- Experience with data pipeline and ETL tools like AWS Glue.
- Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
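ETL pipelines of the kind listed above typically pull from a source incrementally using a watermark (a last-seen id or timestamp) rather than re-reading everything on each run. Below is a minimal sketch using an in-memory SQLite database standing in for the relational sources and warehouse targets named in the requirements; the table names and the toy transform are invented for illustration:

```python
import sqlite3

# Source and target share one in-memory DB purely for illustration;
# in practice these would be separate systems (e.g., MySQL -> Redshift).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events(id INTEGER PRIMARY KEY, amount REAL, ts TEXT);
    CREATE TABLE events_clean(id INTEGER PRIMARY KEY, amount_usd REAL);
    INSERT INTO events VALUES (1, 10.0, '2024-01-01'), (2, 20.0, '2024-01-02');
""")

def run_incremental_load(conn, last_seen_id):
    """Extract rows past the watermark, transform them, load them,
    and return the new watermark."""
    rows = conn.execute(
        "SELECT id, amount FROM events WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    # Toy transform: apply a fixed conversion rate.
    cleaned = [(rid, round(amount * 1.1, 2)) for rid, amount in rows]
    conn.executemany("INSERT INTO events_clean VALUES (?, ?)", cleaned)
    return rows[-1][0] if rows else last_seen_id

watermark = run_incremental_load(conn, last_seen_id=0)
print(watermark)  # 2
```

Re-running the load with the stored watermark moves no rows, which is what makes the pipeline safe to schedule repeatedly (e.g., from Airflow or AWS Glue).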
Data Engineer_Scala
Job Description:
We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large volumes of data daily, built a data warehouse to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.
Responsibilities:
- Develop, test, and implement data solutions based on functional / non-functional business requirements.
- You would be required to code in Scala and PySpark daily, on cloud as well as on-prem infrastructure.
- Build data models to store the data in the most optimized manner.
- Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Implement the ETL process and optimal data pipeline architecture.
- Monitor performance and advise on any necessary infrastructure changes.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions.
- Must be able to write quality code and build secure, highly available systems.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Review peers' code and pipelines before deploying to production, checking for optimization issues and code standards.
Skill Sets:
- Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience working with batch-processing/real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, and Apache Airflow.
- Experience implementing complex projects dealing with considerable data sizes (petabyte scale).
- Knowledge of optimization techniques (performance, scalability, monitoring, etc.).
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Creation of DAGs for data engineering
- Expert at Python/Scala programming, especially for data engineering/ETL purposes.
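Tools like Apache Airflow express a pipeline as a DAG of tasks and run each task only after its dependencies complete. Below is a minimal sketch of that ordering idea using the standard library's `graphlib` rather than Airflow itself, so it is self-contained; the task names are invented for illustration:

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on, mirroring how an
# Airflow DAG orders extract -> transform -> load.
dag = {
    "extract_orders": set(),
    "extract_users": set(),
    "transform": {"extract_orders", "extract_users"},
    "load_warehouse": {"transform"},
}

# static_order() yields the tasks in an execution order that respects
# every dependency edge; a scheduler would run them in this sequence
# (or in parallel where no edge connects them).
order = list(TopologicalSorter(dag).static_order())
print(order)
```

A real Airflow DAG adds scheduling, retries, and backfills on top, but the dependency-ordering core is the same topological sort shown here.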
Qualifications
- 5+ years of professional experience in experiment design and applied machine learning predicting outcomes in large-scale, complex datasets.
- Proficiency in Python, Azure ML, or other statistics/ML tools.
- Proficiency in deep neural networks and Python-based frameworks.
- Proficiency in Azure Databricks, Hive, and Spark.
- Proficiency in deploying models into production (Azure stack).
- Moderate coding skills. SQL or similar required. C# or other languages strongly preferred.
- Outstanding communication and collaboration skills. You can learn from and teach others.
- Strong drive for results. You have a proven record of shepherding experiments to create successful shipping products/services.
- Experience with prediction in adversarial (energy) environments highly desirable.
- Understanding of the model development ecosystem across platforms, including development, distribution, and best practices, highly desirable.
As a dedicated Data Scientist on our Research team, you will apply data science and your machine learning expertise to enhance our intelligent systems to predict and provide proactive advice. You’ll work with the team to identify and build features, create experiments, vet ML models, and ship successful models that provide value additions for hundreds of EE customers.
At EE, you’ll have access to vast amounts of energy-related data from our sources. Our data pipelines are curated and supported by engineering teams (so you won't have to do much data engineering - you get to do the fun stuff.) We also offer many company-sponsored classes and conferences that focus on data science and ML. There’s great growth opportunity for data science at EE.
With less concentration on enforcing how to do a particular task, we believe in giving people the opportunity to think outside the box and come up with their own innovative solutions to problems.
You will primarily be developing, managing, and executing multiple prospect campaigns as part of the Prospect Marketing Journey to ensure the best conversion and retention rates. Below are the roles, responsibilities, and skillsets we are looking for; if you feel these resonate with you, please get in touch with us by applying to this role.
Roles and Responsibilities:
• You'd be responsible for the development and maintenance of applications built with Enterprise Java and distributed technologies.
• You'd collaborate with developers, product managers, business analysts, and business users in conceptualizing, estimating, and developing new software applications and enhancements.
• You'd assist in the definition, development, and documentation of software objectives, business requirements, deliverables, and specifications in collaboration with multiple cross-functional teams.
• Assist in the design and implementation process for new products, research and create POC for possible solutions.
Skillset:
• Bachelor's or Master's degree in a technology-related field preferred.
• Overall experience of 2-3 years with Big Data technologies.
• Hands on experience with Spark (Java/ Scala)
• Hands on experience with Hive, Shell Scripting
• Knowledge of HBase and Elasticsearch
• Development experience in Java/Python is preferred
• Familiar with profiling, code coverage, logging, common IDEs, and other development tools.
• Demonstrated verbal and written communication skills and the ability to interface with Business, Analytics, and IT organizations.
• Ability to work effectively in a short-cycle, team-oriented environment, managing multiple priorities and tasks.
• Ability to identify non-obvious solutions to complex problems
- We are looking for a Data Engineer to build the next-generation mobile applications for our world-class fintech product.
- The candidate will be responsible for expanding and optimising our data and data pipeline architecture, as well as optimising data flow and collection for cross-functional teams.
- The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimising data systems and building them from the ground up.
- Looking for a person with a strong ability to analyse data and provide valuable insights to the product and business teams to solve daily business problems.
- You should be able to work in a high-volume environment and have outstanding planning and organisational skills.
Qualifications for Data Engineer
- Working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
- Experience building and optimising ‘big data’ data pipelines, architectures, and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets. Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Looking for a candidate with 2-3 years of experience in a Data Engineer role, who is a CS graduate or has an equivalent experience.
What we're looking for?
- Experience with big data tools: Hadoop, Spark, Kafka and other alternate tools.
- Experience with relational SQL and NoSQL databases, including MySQL/Postgres and MongoDB.
- Experience with data pipeline and workflow management tools: Luigi, Airflow.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
- Experience with stream-processing systems: Storm, Spark-Streaming.
- Experience with object-oriented/object function scripting languages: Python, Java, Scala.
Location: Chennai- Guindy Industrial Estate
Duration: Full time role
Company: Mobile Programming (https://www.mobileprogramming.com/)
Client Name: Samsung
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.
Responsibilities for Data Engineer
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
- Experience building and optimizing big data ETL pipelines, architectures, and data sets.
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Build processes supporting data transformation, data structures, metadata, dependency, and workload management.
- A successful history of manipulating, processing, and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
We are looking for a candidate with 3-6 years of experience in a Data Engineer role who has attained a graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field. They should also have experience using the following software/tools:
- Experience with big data tools: Spark, Kafka, HBase, Hive, etc.
- Experience with relational SQL and NoSQL databases
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift
- Experience with stream-processing systems: Storm, Spark-Streaming, etc.
- Experience with object-oriented/object function scripting languages: Python, Java, Scala, etc.
Skills: Big Data, AWS, Hive, Spark, Python, SQL