- Must have 4 to 7 years of experience in ETL Design and Development using Informatica Components.
- Should have extensive knowledge in Unix shell scripting.
- Understanding of DW principles (Fact, Dimension tables, Dimensional Modelling and Data warehousing concepts).
- Research, develop, document, and modify ETL processes as per data architecture and modeling requirements.
- Ensure appropriate documentation for all new development and modifications of the ETL processes and jobs.
- Should be good at writing complex SQL queries.
- Selected candidates will be provided training opportunities on one or more of the following: Google Cloud, AWS, DevOps tools, and Big Data technologies like Hadoop, Pig, Hive, Spark, Sqoop, Flume, and Kafka.
- Would get a chance to be part of the enterprise-grade implementation of Cloud and Big Data systems.
- Will play an active role in setting up the Modern data platform based on Cloud and Big Data
- Would be part of teams with rich experience in various aspects of distributed systems and computing.
About DataMetica
- Bring in industry best practices around creating and maintaining robust data pipelines for complex data projects, with or without an AI component:
- programmatically ingesting data from several static and real-time sources (incl. web scraping)
- rendering results through dynamic interfaces incl. web / mobile / dashboard, with the ability to log usage and granular user feedback
- performance tuning and optimal implementation of complex Python scripts (using Spark), SQL (using stored procedures, Hive), and NoSQL queries in a production environment
- Industrialize ML / DL solutions and deploy and manage production services; proactively handle data issues arising on live apps
- Perform ETL on large and complex datasets for AI applications - work closely with data scientists on performance optimization of large-scale ML/DL model training
- Build data tools to facilitate fast data cleaning and statistical analysis
- Ensure data architecture is secure and compliant
- Resolve issues escalated from Business and Functional areas on data quality, accuracy, and availability
- Work closely with APAC CDO and coordinate with a fully decentralized team across different locations in APAC and global HQ (Paris).
You should be
- Expert in structured and unstructured data in traditional and Big data environments – Oracle / SQLserver, MongoDB, Hive / Pig, BigQuery, and Spark
- Have excellent knowledge of Python programming both in traditional and distributed models (PySpark)
- Expert in shell scripting and writing schedulers
- Hands-on experience with Cloud - deploying complex data solutions in hybrid cloud / on-premise environment both for data extraction/storage and computation
- Hands-on experience in deploying production apps using large volumes of data with state-of-the-art technologies like Docker, Kubernetes, and Kafka
- Strong knowledge of data security best practices
- 5+ years of experience in a data engineering role
- Science / Engineering graduate from a Tier-1 university in the country
- And most importantly, you must be a passionate coder who really cares about building apps that can help people do things better, smarter, and faster even when they sleep
companies uncover the 3% of active buyers in their target market. It evaluates
over 100 billion data points and analyzes factors such as buyer journeys, technology
adoption patterns, and other digital footprints to deliver market & sales intelligence.
Its customers have access to the buying patterns and contact information of
more than 17 million companies and 70 million decision makers across the world.
Role – Data Engineer
Responsibilities
Work in collaboration with the application team and integration team to
design, create, and maintain optimal data pipeline architecture and data
structures for Data Lake/Data Warehouse.
Work with stakeholders including the Sales, Product, and Customer Support
teams to assist with data-related technical issues and support their data
analytics needs.
Assemble large, complex data sets from third-party vendors to meet business
requirements.
Identify, design, and implement internal process improvements: automating
manual processes, optimizing data delivery, re-designing infrastructure for
greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and
loading of data from a wide variety of data sources using SQL, Elasticsearch,
MongoDB, and AWS technology.
Streamline existing and introduce enhanced reporting and analysis solutions
that leverage complex data sources derived from multiple internal systems.
Requirements
5+ years of experience in a Data Engineer role.
Proficiency in Linux.
Must have SQL knowledge and experience working with relational databases,
query authoring (SQL) as well as familiarity with databases including MySQL,
Mongo, Cassandra, and Athena.
Must have experience with Python/Scala.
Must have experience with Big Data technologies like Apache Spark.
Must have experience with Apache Airflow.
Experience with data pipeline and ETL tools like AWS Glue.
Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
Intuitive cloud (http://www.intuitive.cloud) is one of the fastest-growing top-tier Cloud Solutions and SDx Engineering solution and service companies, supporting 80+ global enterprise customers across the Americas, Europe, and the Middle East.
Intuitive is a recognized professional and managed services partner for core superpowers in cloud (public/hybrid), security, GRC, DevSecOps, SRE, application modernization/containers/K8s-as-a-service, and cloud application delivery.
Data Engineering:
- 9+ years’ experience as a data engineer.
- Must have 4+ Years in implementing data engineering solutions with Databricks.
- This is a hands-on role building data pipelines using Databricks. Hands-on technical experience with Apache Spark.
- Must have deep expertise in one of the programming languages for data processes (Python, Scala). Experience with Python, PySpark, Hadoop, Hive and/or Spark to write data pipelines and data processing layers
- Must have worked with relational databases like Snowflake. Good SQL experience for writing complex SQL transformations.
- Performance tuning of Spark SQL running on S3/Data Lake/Delta Lake storage, and strong knowledge of Databricks and cluster configurations.
- Hands-on architectural experience.
- Nice to have: Databricks administration, including security and infrastructure features of Databricks.
It's regarding a permanent opening with Data Semantics
Data Semantics
We are a product-based company and a Microsoft Gold Partner.
Data Semantics is an award-winning Data Science company with a vision to empower every organization to harness the full potential of its data assets. To achieve this, we provide Artificial Intelligence, Big Data, and Data Warehousing solutions to enterprises across the globe. Data Semantics was listed as one of the top 20 Analytics companies by Silicon India in 2018 and as one of the Top 20 BI companies by CIO Review India in 2014. We are headquartered in Bangalore, India, with offices in 6 global locations including the USA, United Kingdom, Canada, United Arab Emirates (Dubai, Abu Dhabi), and Mumbai. Our mission is to enable our people to learn the art of data management and visualization to help our customers make quick and smart decisions.
Our Services include:
Business Intelligence & Visualization
App and Data Modernization
Low Code Application Development
Artificial Intelligence
Internet of Things
Data Warehouse Modernization
Robotic Process Automation
Advanced Analytics
Our Products:
Sirius – World’s most agile conversational AI platform
Serina
Conversational Analytics
Contactless Attendance Management System
Company URL: https://datasemantics.co
JD:
MSBI
SSAS
SSRS
SSIS
Datawarehousing
SQL
Must Have Skills:
- Solid knowledge of DWH, ETL, and Big Data concepts
- Excellent SQL Skills (With knowledge of SQL Analytics Functions)
- Working experience with any ETL tool, e.g. SSIS / Informatica
- Working Experience on any Azure or AWS Big Data Tools.
- Experience on Implementing Data Jobs (Batch / Real time Streaming)
- Excellent written and verbal communication skills in English, Self-motivated with strong sense of ownership and Ready to learn new tools and technologies
Preferred Skills:
- Experience with PySpark / Spark SQL
- AWS Data Tools (AWS Glue, AWS Athena)
- Azure Data Tools (Azure Databricks, Azure Data Factory)
Other Skills:
- Knowledge about Azure Blob, Azure File Storage, AWS S3, Elastic Search / Redis Search
- Knowledge on domain/function (across pricing, promotions and assortment).
- Implementation experience with a schema and data validator framework (Python / Java / SQL).
- Knowledge on DQS and MDM.
Key Responsibilities:
- Independently work on ETL / DWH / Big data Projects
- Gather and process raw data at scale.
- Design and develop data applications using selected tools and frameworks as required and requested.
- Read, extract, transform, stage and load data to selected tools and frameworks as required and requested.
- Perform tasks such as writing scripts, web scraping, calling APIs, writing SQL queries, etc.
- Work closely with the engineering team to integrate your work into our production systems.
- Process unstructured data into a form suitable for analysis.
- Analyse processed data.
- Support business decisions with ad hoc analysis as needed.
- Monitor data performance and modify infrastructure as needed.
Responsibility: Smart Resource, having excellent communication skills
* Formulates and recommends standards for achieving maximum performance
and efficiency of the DW ecosystem.
* Participates in the Pre-sales activities for solutions of various customer
problem-statement/situations.
* Develop business cases and ROI for the customer/clients.
* Interview stakeholders and develop BI roadmap for success given project
prioritization
* Evangelize self-service BI and visual discovery while helping to automate any
manual process at the client site.
* Work closely with the Engineering Manager to ensure prioritization of
customer deliverables.
* Champion data quality, integrity, and reliability throughout the organization by
designing and promoting best practices.
* Implementation 20%
* Help DW/DE team members with issues needing technical expertise or
complex systems and/or programming knowledge.
* Provide on-the-job training for new or less experienced team members.
* Develop a technical excellence team
Requirements
- experience designing business intelligence solutions
- experience with ETL Process, Data warehouse architecture
- experience with Azure Data services, e.g., ADF, ADLS Gen 2, Azure SQL DB,
Synapse, Azure Databricks, and Power BI
- Good analytical and problem-solving skills
- Fluent in relational database concepts and flat file processing concepts
- Must be knowledgeable in software development lifecycles/methodologies
What you’ll do
- Deliver plugins for our Python-based ETL pipelines.
- Deliver Python microservices for provisioning and managing cloud infrastructure.
- Implement algorithms to analyse large data sets.
- Draft design documents that translate requirements into code.
- Deal with challenges associated with handling large volumes of data.
- Assume responsibilities from technical design through technical client support.
- Manage expectations with internal stakeholders and context-switch in a fast-paced environment.
- Thrive in an environment that uses AWS and Elasticsearch extensively.
- Keep abreast of technology and contribute to the engineering strategy.
- Champion best development practices and provide mentorship.
What we’re looking for
- Experience in Python 3.
- Python libraries used for data (such as pandas, numpy).
- AWS.
- Elasticsearch.
- Performance tuning.
- Object Oriented Design and Modelling.
- Delivering complex software, ideally in a FinTech setting.
- CI/CD tools.
- Knowledge of design patterns.
- Sharp analytical and problem-solving skills.
- Strong sense of ownership.
- Demonstrable desire to learn and grow.
- Excellent written and oral communication skills.
- Mature collaboration and mentoring abilities.
About SteelEye Culture
- Work from home until you are vaccinated against COVID-19
- Top of the line health insurance
- Order discounted meals every day from a dedicated portal
- Fair and simple salary structure
- 30+ holidays in a year
- Fresh fruits every day
- Centrally located. 5 mins to the nearest metro station (MG Road)
- Measured on output and not input
Datametica is Hiring for Datastage Developer
- Must have 3 to 8 years of experience in ETL Design and Development using IBM Datastage Components.
- Should have extensive knowledge in Unix shell scripting.
- Understanding of DW principles (Fact, Dimension tables, Dimensional Modelling and Data warehousing concepts).
- Research, develop, document, and modify ETL processes as per data architecture and modeling requirements.
- Ensure appropriate documentation for all new development and modifications of the ETL processes and jobs.
- Should be good at writing complex SQL queries.
About Us!
A global Leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.
We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.
Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.
We have our own products!
Eagle – Data warehouse Assessment & Migration Planning Product
Raven – Automated Workload Conversion Product
Pelican - Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.
Why join us!
Datametica is a place to innovate, bring new ideas to life, and learn new things. We believe in building a culture of innovation, growth, and belonging. Our people and their dedication over these years are the key factors in achieving our success.
Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy
Check out more about us on our website below!
www.datametica.com
2. Assemble large, complex data sets that meet business requirements
3. Identify, design, and implement internal process improvements
4. Optimize data delivery and re-design infrastructure for greater scalability
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies
6. Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
7. Work with internal and external stakeholders to assist with data-related technical issues and support data infrastructure needs
8. Create data tools for analytics and data scientist team members
Skills Required:
1. Working knowledge of ETL on any cloud (Azure / AWS / GCP)
2. Proficient in Python (Programming / Scripting)
3. Good understanding of data warehousing concepts on any platform (Snowflake / AWS Redshift / Azure Synapse Analytics / Google BigQuery / Hive)
4. In-depth understanding of principles of database structure
5. Good understanding of any of the ETL technologies (Informatica PowerCenter / AWS Glue / Data Factory / SSIS / Spark / Matillion / Talend / Azure)
6. Proficient in SQL (query solving)
7. Knowledge of change management / version control (VSS / DevOps / TFS / GitHub / Bitbucket / CI/CD with Jenkins)