Publicis Sapient Overview:
Job Summary:
As a Senior Associate L2 in Data Engineering, you will translate client requirements into technical designs and implement components for data engineering solutions, applying a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions. You will independently drive design discussions to ensure the overall health of the solution.
The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python; experience in data ingestion, integration, wrangling, computation, and analytics pipelines; and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is also required.
Role & Responsibilities:
Your role focuses on the design, development, and delivery of solutions involving:
• Data Integration, Processing & Governance
• Data Storage and Computation Frameworks, Performance Optimizations
• Analytics & Visualizations
• Infrastructure & Cloud Computing
• Data Management Platforms
• Implement scalable architectural models for data processing and storage
• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode
• Build functionality for data analytics, search and aggregation
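As an illustration of the batch/real-time ingestion responsibility above, here is a minimal, purely pedagogical Python sketch of micro-batching a record stream; the event shape and batch size are invented for the example, and a real pipeline would of course use Spark Streaming, Flink, or Kafka rather than plain generators:

```python
from typing import Iterable, Iterator, List

def micro_batches(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group a (possibly unbounded) record stream into fixed-size batches,
    the same micro-batch pattern a streaming engine applies at scale."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

def ingest(batch: List[dict]) -> int:
    """Placeholder sink: a real pipeline would write to HDFS, Kafka,
    or an object store. Here it just counts records."""
    return len(batch)

# Simulate a heterogeneous source: 7 events from two systems, batch size 3.
events = [{"id": i, "source": "orders" if i % 2 else "clicks"} for i in range(7)]
sizes = [ingest(b) for b in micro_batches(events, 3)]
print(sizes)  # [3, 3, 1]
```

The same grouping logic serves batch mode (a finite file) and real-time mode (an endless stream), which is why micro-batching is a common unifying abstraction.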
Experience Guidelines:
Mandatory Experience and Competencies:
1. Overall 5+ years of IT experience, with 3+ years in data-related technologies
2. Minimum 2.5 years of experience in Big Data technologies, with working exposure to related data services on at least one cloud platform (AWS / Azure / GCP)
3. Hands-on experience with the Hadoop stack: HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow, and other components required to build end-to-end data pipelines
4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred
5. Hands-on working knowledge of NoSQL and MPP data platforms such as HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, and GCP BigQuery
6. Well-versed, working knowledge of data platform services on at least one cloud platform, including IAM and data security
Preferred Experience and Knowledge (Good to Have):
1. Good knowledge of and hands-on experience with traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres)
2. Knowledge of data governance processes (security, lineage, catalog) and tools such as Collibra and Alation
3. Knowledge of distributed messaging frameworks such as ActiveMQ, RabbitMQ, or Solace; search and indexing; and microservices architectures
4. Performance tuning and optimization of data pipelines
5. CI/CD: infrastructure provisioning on the cloud, automated build and deployment pipelines, code quality
6. Cloud data specialty and other related Big Data technology certifications
Personal Attributes:
• Strong written and verbal communication skills
• Articulation skills
• Good team player
• Self-starter who requires minimal oversight
• Ability to prioritize and manage multiple tasks
• Process orientation and the ability to define and set up processes
Work closely with different Front Office and Support Function stakeholders, including but not restricted to Business Management, Accounts, Regulatory Reporting, Operations, Risk, Compliance, and HR, on all data collection and reporting use cases.
Collaborate with Business and Technology teams to understand enterprise data and create an innovative narrative that explains, engages, and enlightens staff as well as executive leadership through data-driven storytelling
Solve data consumption and visualization needs through a data-as-a-service distribution model
Articulate findings clearly and concisely for different target use cases, including through presentations, design solutions, and visualizations
Perform ad hoc and automated report generation tasks using Power BI, Oracle BI, and Informatica
Perform data access/transfer and ETL automation tasks using Python, SQL, OLAP / OLTP, RESTful APIs, and IT tools (CFT, MQ-Series, Control-M, etc.)
Provide support and maintain the availability of BI applications irrespective of the hosting location
Resolve issues escalated from Business and Functional areas on data quality, accuracy, and availability, provide incident-related communications promptly
Work with strict deadlines on high priority regulatory reports
Serve as a liaison between business and technology to ensure that data-related business requirements for protecting sensitive data are clearly defined, communicated, well understood, and considered as part of operational prioritization and planning
Work for the APAC Chief Data Office and coordinate with a fully decentralized team across different locations in APAC and the global HQ (Paris).
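To make the ETL automation responsibility above concrete, here is a minimal, illustrative sketch in Python using only the standard library's sqlite3 module; the trade feed, table, and column names are hypothetical stand-ins for data that would really arrive via tools like CFT or MQ-Series and land in an enterprise database:

```python
import csv
import io
import sqlite3

# Extract: parse a CSV feed (an in-memory sample standing in for a
# file delivered by a transfer tool in the actual environment).
raw = io.StringIO("trade_id,desk,notional\n1,FX,100\n2,RATES,250\n3,FX,50\n")
rows = list(csv.DictReader(raw))

# Transform: cast types and keep only FX trades.
fx = [(int(r["trade_id"]), float(r["notional"])) for r in rows if r["desk"] == "FX"]

# Load: write into a relational store and aggregate for reporting.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fx_trades (trade_id INTEGER, notional REAL)")
con.executemany("INSERT INTO fx_trades VALUES (?, ?)", fx)
total = con.execute("SELECT SUM(notional) FROM fx_trades").fetchone()[0]
print(total)  # 150.0
```

In production the same extract-transform-load shape would be scheduled by an orchestrator such as Control-M rather than run as a one-off script.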
General Skills:
Excellent knowledge of RDBMS and hands-on experience with complex SQL is a must; some experience with NoSQL and Big Data technologies such as Hive and Spark is a plus
Experience with industrialized reporting on BI tools such as Power BI and Informatica
Knowledge of data-related industry best practices in the highly regulated CIB industry; experience with regulatory report generation for financial institutions
Knowledge of industry-leading data access, data security, Master Data and Reference Data Management, and establishing data lineage
5+ years of experience in Data Visualization / Business Intelligence / ETL developer roles
Ability to multi-task and manage various projects simultaneously
Attention to detail
Ability to present to Senior Management, ExCo; excellent written and verbal communication skills
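As a sketch of the kind of "complex SQL" the skills list above refers to, the query below picks the latest run per report with a window function, a common pattern when reconciling report outputs across runs. It uses Python's built-in sqlite3 as a stand-in engine, and the table and data are entirely hypothetical:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE report_runs (report TEXT, run_date TEXT, row_count INTEGER);
INSERT INTO report_runs VALUES
  ('LCR',  '2024-01-01', 100),
  ('LCR',  '2024-01-02', 120),
  ('NSFR', '2024-01-01', 80),
  ('NSFR', '2024-01-02', 75);
""")

# ROW_NUMBER() ranks each report's runs newest-first; rn = 1 keeps
# only the most recent run per report.
latest = con.execute("""
    SELECT report, run_date, row_count
    FROM (
        SELECT report, run_date, row_count,
               ROW_NUMBER() OVER (PARTITION BY report ORDER BY run_date DESC) AS rn
        FROM report_runs
    )
    WHERE rn = 1
    ORDER BY report
""").fetchall()
print(latest)  # [('LCR', '2024-01-02', 120), ('NSFR', '2024-01-02', 75)]
```

The same PARTITION BY / ORDER BY pattern carries over to Hive and Spark SQL mentioned above.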
About Slintel (a 6sense company) :
Slintel, a 6sense company and the leader in capturing technographics-powered buying intent, helps companies uncover the 3% of active buyers in their target market. Slintel evaluates over 100 billion data points and analyzes factors such as buyer journeys, technology adoption patterns, and other digital footprints to deliver market and sales intelligence.
Slintel's customers have access to the buying patterns and contact information of more than 17 million companies and 250 million decision makers across the world.
Slintel is a fast-growing B2B SaaS company in the sales and marketing tech space. We are funded by top-tier VCs and are going after a billion-dollar opportunity. At Slintel, we are building a sales development automation platform that can significantly improve outcomes for sales teams while reducing the number of hours spent on research and outreach.
We are a big data company that performs deep analysis on technology buying patterns and buyer pain points to understand where buyers are in their journey. Over 100 billion data points are analyzed every week to derive recommendations on where companies should focus their marketing and sales efforts. Third-party intent signals are then combined with first-party data from CRMs to derive meaningful recommendations on whom to target on any given day.
6sense is headquartered in San Francisco, CA and has 8 office locations across 4 countries.
6sense, an account engagement platform, secured $200 million in a Series E funding round, bringing its total valuation to $5.2 billion 10 months after its $125 million Series D round. The investment was co-led by Blue Owl and MSD Partners, among other new and existing investors.
Linkedin (Slintel) : https://www.linkedin.com/company/slintel/
Industry : Software Development
Company size : 51-200 employees (189 on LinkedIn)
Headquarters : Mountain View, California
Founded : 2016
Specialties : Technographics, lead intelligence, Sales Intelligence, Company Data, and Lead Data.
Website (Slintel) : https://www.slintel.com/slintel
Linkedin (6sense) : https://www.linkedin.com/company/6sense/
Industry : Software Development
Company size : 501-1,000 employees (937 on LinkedIn)
Headquarters : San Francisco, California
Founded : 2013
Specialties : Predictive intelligence, Predictive marketing, B2B marketing, and Predictive sales
Website (6sense) : https://6sense.com/
Acquisition News :
https://inc42.com/buzz/us-based-based-6sense-acquires-b2b-buyer-intelligence-startup-slintel/
Funding Details & News :
Slintel funding : https://www.crunchbase.com/organization/slintel
6sense funding : https://www.crunchbase.com/organization/6sense
https://www.nasdaq.com/articles/ai-software-firm-6sense-valued-at-%245.2-bln-after-softbank-joins-funding-round
https://www.bloomberg.com/news/articles/2022-01-20/6sense-reaches-5-2-billion-value-with-softbank-joining-round
https://xipometer.com/en/company/6sense
Slintel & 6sense Customers :
https://www.featuredcustomers.com/vendor/slintel/customers
https://www.featuredcustomers.com/vendor/6sense/customers
About the job
Responsibilities
- Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse
- Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs
- Assemble large, complex data sets from third-party vendors to meet business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology
- Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems
Requirements
- 3+ years of experience in a Data Engineer role
- Proficiency in Linux
- Must have strong SQL knowledge and experience working with relational databases and query authoring, as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena
- Must have experience with Python or Scala
- Must have experience with Big Data technologies like Apache Spark
- Must have experience with Apache Airflow
- Experience with data pipeline and ETL tools like AWS Glue
- Experience working with AWS cloud services (EC2, S3, RDS, Redshift) and other data solutions, e.g., Databricks, Snowflake
Desired Skills and Experience
Python, SQL, Scala, Spark, ETL
- At least 4 to 7 years of relevant experience as a Big Data Engineer
- Hands-on experience in Scala or Python
- Hands-on experience with major components of the Hadoop ecosystem such as HDFS, MapReduce, Hive, and Impala
- Strong programming experience building applications/platforms using Scala or Python
- Experience implementing Spark RDD transformations and actions to support business analysis
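The Spark RDD requirement above rests on one core idea: transformations (map, filter) are lazy and only build a plan, while an action (reduce, collect) forces evaluation. The hedged sketch below mimics that behavior with plain Python generators so it runs without a Spark cluster; the data is invented, and the comments map each step to its RDD counterpart:

```python
from functools import reduce

# Generator expressions are lazy like RDD transformations: nothing
# below runs until the reduce() action consumes the pipeline.
orders = [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 0}, {"sku": "A", "qty": 5}]

shipped = (o for o in orders if o["qty"] > 0)      # ~ rdd.filter(...)
quantities = (o["qty"] for o in shipped)           # ~ rdd.map(...)
total = reduce(lambda a, b: a + b, quantities, 0)  # ~ rdd.reduce(...), the action

print(total)  # 7
```

In actual PySpark the chain would read `rdd.filter(...).map(...).reduce(...)`, with the same lazy-until-action semantics but distributed across executors.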
We specialize in productizing solutions built on new technology.
Our vision is to build engineers with entrepreneurial and leadership mindsets who can create highly impactful products and solutions, using technology to deliver immense value to our clients.
We strive to bring innovation and passion to everything we do, whether services, products, or solutions.
- KSQL
- Data Engineering spectrum (Java/Spark)
- Spark Scala / Kafka Streaming
- Confluent Kafka components
- Basic understanding of Hadoop
The Data Engineering team is one of the core technology teams of Lumiq.ai and is responsible for creating all the Data related products and platforms which scale for any amount of data, users, and processing. The team also interacts with our customers to work out solutions, create technical architectures and deliver the products and solutions.
If you are someone who is always pondering how to make things better, how technologies can interact, how various tools, technologies, and concepts can help a customer or how a customer can use our products, then Lumiq is the place of opportunities.
Who are you?
- Enthusiast is your middle name. You know what’s new in Big Data technologies and how things are moving
- Apache is your toolbox and you have been a contributor to open source projects or have discussed the problems with the community on several occasions
- You use cloud for more than just provisioning a Virtual Machine
- Vim is friendly to you and you know how to exit Nano
- You check logs before screaming about an error
- You are a solid engineer who writes modular code and commits to Git
- You are a doer who doesn’t say “no” without first understanding
- You understand the value of documentation of your work
- You are familiar with Machine Learning Ecosystem and how you can help your fellow Data Scientists to explore data and create production-ready ML pipelines
Eligibility
Experience
- At least 2 years of Data Engineering Experience
- Have interacted with Customers
Must Have Skills
- Amazon Web Services (AWS) - EMR, Glue, S3, RDS, EC2, Lambda, SQS, SES
- Apache Spark
- Python
- Scala
- PostgreSQL
- Git
- Linux
Good to have Skills
- Apache NiFi
- Apache Kafka
- Apache Hive
- Docker
- Amazon Certification
Knowledge of Hadoop ecosystem installation, initial configuration, and performance tuning.
Expertise with Apache Ambari, Spark, Unix shell scripting, Kubernetes, and Docker.
Knowledge of Python is desirable.
Experience with HDP Manager/clients and various dashboards.
Understanding of Hadoop security (Kerberos, Ranger, and Knox), encryption, and data masking.
Experience with automation/configuration management using Chef, Ansible, or an equivalent.
Strong experience with any Linux distribution.
Basic understanding of network technologies, CPU, memory, and storage.
Database administration is a plus.
Qualifications and Education Requirements
2 to 4 years of experience with and detailed knowledge of core Hadoop components, solutions, and dashboards running on Big Data technologies such as Hadoop/Spark.
Bachelor's degree or equivalent in Computer Science, Information Technology, or related fields.
- Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse / Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hub / IoT Hub.
- Experience migrating on-premises data warehouses to data platforms on the Azure cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Azure Synapse or Azure SQL Data Warehouse
- Spark on Azure, available in HDInsight and Databricks
The candidate must have expertise in ADF (Azure Data Factory) and be well versed in Python.
Performance optimization of scripts and productionizing of code (SQL, Pandas, Python or PySpark, etc.)
Required skills:
Bachelor's degree in Computer Science, Data Science, Computer Engineering, IT, or equivalent
Fluency in Python (Pandas), PySpark, SQL, or similar
Azure Data Factory experience (minimum 12 months)
Able to write efficient code using traditional and OO concepts and modular programming, following the SDLC process
Experience in production optimization and end-to-end performance tracing (technical root-cause analysis)
Ability to work independently, with demonstrated experience in project or program management
Azure experience; the ability to translate data scientists' Python code and make it efficient (production-ready) for cloud deployment
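One small, hedged example of the "make data scientist code efficient for production" skill: replacing list membership inside a loop with a precomputed set, turning an O(n*m) scan into an O(n) filter. The customer and event data below are entirely invented for illustration:

```python
# Hypothetical inputs: 1,000 active customer IDs and 3,000 raw events.
active_customers = [f"C{i}" for i in range(1000)]
events = [{"customer": f"C{i % 1500}"} for i in range(3000)]

# Before: list membership re-scans active_customers for every event.
slow = [e for e in events if e["customer"] in active_customers]

# After: build the lookup structure once, then filter with O(1) lookups.
active_set = set(active_customers)
fast = [e for e in events if e["customer"] in active_set]

assert slow == fast  # identical results, far fewer comparisons
print(len(fast))  # 2000
```

The same shape of fix applies in Pandas or PySpark (e.g. preferring joins or `isin` over row-wise loops), which is typically where most of the production speedup comes from.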