Position: Big Data Engineer
What You'll Do
Punchh is seeking to hire a Big Data Engineer at either a senior or tech lead level. Reporting to the Director of Big Data, he/she will play a critical role in leading Punchh’s big data innovations. By leveraging prior industry experience in big data, he/she will help create cutting-edge data and analytics products for Punchh’s business partners.
This role requires close collaboration with the data, engineering, and product organizations. His/her job functions include:
- Work with large data sets and implement sophisticated data pipelines with both structured and unstructured data.
- Collaborate with stakeholders to design scalable solutions.
- Manage and optimize our internal data pipeline that supports marketing, customer success, and data science, to name a few.
- Serve as a technical leader of Punchh’s big data platform that supports AI and BI products.
- Work with the infra and operations teams to monitor and optimize existing infrastructure.
- Travel occasionally for business, as required.
What You'll Need
- 5+ years of experience as a Big Data engineering professional, developing scalable big data solutions.
- Advanced degree in computer science, engineering or other related fields.
- Demonstrated strength in data modeling, data warehousing and SQL.
- Extensive knowledge of cloud technologies, e.g. AWS and Azure.
- Excellent software engineering background. Strong familiarity with the software development life cycle. Familiarity with GitHub/Airflow.
- Advanced knowledge of big data technologies, such as programming languages (Python, Java), relational databases (Postgres, MySQL), NoSQL (MongoDB), Hadoop (EMR) and streaming (Kafka, Spark).
- Strong problem-solving skills with demonstrated rigor in building and maintaining complex data pipelines.
- Exceptional communication skills and ability to articulate a complex concept with thoughtful, actionable recommendations.
We are currently seeking a talented and highly motivated Data Analyst to lead the development of our discovery and support platform. The successful candidate will join a small, global team of data-focused associates who have successfully built and maintained a best-in-class, traditional, Kimball-based data warehouse on SQL Server, along with Qlik Sense based BI dashboards. The successful candidate will lead the management of our master data set and the development of reports and analytics dashboards.
To do well in this role you need a keen eye for detail, experience as a data analyst, and a deep understanding of the popular data analysis tools and databases.
Specific responsibilities include:
- Managing master data, including creation, updates, and deletion.
- Managing users and user roles.
- Providing quality assurance of imported data, working with quality assurance analysts if necessary.
- Commissioning and decommissioning of data sets.
- Processing confidential data and information according to applicable compliance requirements.
- Helping develop reports and analysis.
- Managing and designing the reporting environment, including data sources, security, and metadata.
- Supporting the data warehouse in identifying and revising reporting requirements.
- Supporting initiatives for data integrity and normalization.
- Assessing tests and implementing new or upgraded software and assisting with strategic decisions on new systems.
- Generating reports from single or multiple systems.
- Troubleshooting the reporting database environment and reports.
- Evaluating changes and updates to source production systems.
- Training end-users on new reports and dashboards.
- Providing technical expertise in data storage structures, data mining, and data cleansing.
- Master’s Degree (or equivalent experience) in computer science, data science or a scientific field that has relevance to healthcare in the United States.
- Work experience as a data analyst or in a related field for more than 5 years.
- Proficiency in statistics, data analysis, data visualization and research methods.
- Strong SQL and Excel skills with ability to learn other analytic tools.
- Experience with BI dashboard tools like Qlik Sense, Tableau, Power BI.
- Experience with AWS services like EC2, S3, Athena and QuickSight.
- Ability to work with stakeholders to assess potential risks.
- Ability to analyze existing tools and databases and provide software solution recommendations.
- Ability to translate business requirements into non-technical, lay terms.
- High-level experience in methodologies and processes for managing large-scale databases.
- Demonstrated experience in handling large data sets and relational databases.
- Understanding of addressing and metadata standards.
● Able to contribute to the gathering of functional requirements, developing technical specifications, and test case planning
● Demonstrating technical expertise, and solving challenging programming and design
● 60% hands-on coding with architecture ownership of one or more products
● Ability to articulate architectural and design options, and educate development teams and
● Resolve defects/bugs during QA testing, pre-production, production, and post-release
● Mentor and guide team members
● Work cross-functionally with various Bidgely teams including product management, QA/QE, various product lines, and/or business units to drive forward results
● BS/MS in computer science or equivalent work experience
● 8-12 years’ experience designing and developing applications in Data Engineering
● Hands-on experience with big data ecosystems.
● Past experience with Hadoop, HDFS, MapReduce, YARN, AWS Cloud, EMR, S3, Spark, Cassandra,
● Expertise with any of the following Object-Oriented Languages (OOD): Java/J2EE, Scala,
● Ability to lead and mentor technical team members
● Expertise with the entire Software Development Life Cycle (SDLC)
● Excellent communication skills: Demonstrated ability to explain complex technical issues to both technical and non-technical audiences
● Expertise in the Software design/architecture process
● Expertise with unit testing & Test-Driven Development (TDD)
● Business Acumen - strategic thinking & strategy development
● Experience on Cloud or AWS is preferable
● Have a good understanding of, and the ability to develop, software, prototypes, or proofs of concept (POCs) for various Data Engineering requirements.
● Experience with Agile Development, SCRUM, or Extreme Programming methodologies
|Job Title: Data Engineer|
|Tech Job Family: DACI|
|• Bachelor's Degree in Engineering, Computer Science, CIS, or related field (or equivalent work experience in a related field)|
|• 2 years of experience in Data, BI or Platform Engineering, Data Warehousing/ETL, or Software Engineering|
|• 1 year of experience working on project(s) involving the implementation of solutions applying development life cycles (SDLC)|
|• Master's Degree in Computer Science, CIS, or related field|
|• 2 years of IT experience developing and implementing business systems within an organization|
|• 4 years of experience working with defect or incident tracking software|
|• 4 years of experience with technical documentation in a software development environment|
|• 2 years of experience working with an IT Infrastructure Library (ITIL) framework|
|• 2 years of experience leading teams, with or without direct reports|
|• Experience with application and integration middleware|
|• Experience with database technologies|
|• 2 years of experience in Hadoop or any Cloud Bigdata components (specific to the Data Engineering role)|
|• Expertise in Java/Scala/Python, SQL, Scripting, Teradata, Hadoop (Sqoop, Hive, Pig, MapReduce), Spark (Spark Streaming, MLlib), Kafka or equivalent Cloud Bigdata components (specific to the Data Engineering role)|
|• Expertise in MicroStrategy/Power BI/SQL, Scripting, Teradata or equivalent RDBMS, Hadoop (OLAP on Hadoop), Dashboard development, Mobile development (specific to the BI Engineering role)|
|• 2 years of experience in Hadoop, NoSQL, RDBMS or any Cloud Bigdata components, Teradata, MicroStrategy (specific to the Platform Engineering role)|
|• Expertise in Python, SQL, Scripting, Teradata, Hadoop utilities like Sqoop, Hive, Pig, MapReduce, Spark, Ambari, Ranger, Kafka or equivalent Cloud Bigdata components (specific to the Platform Engineering role)|
|Lowe’s is an equal opportunity employer and administers all personnel practices without regard to race, color, religion, sex, age, national origin, disability, sexual orientation, gender identity or expression, marital status, veteran status, genetics or any other category protected under applicable law.|
- Design, implement and support an analytical data infrastructure, providing ad hoc access to large data sets and computing power.
- Contribute to development of standards and the design and implementation of proactive processes to collect and report data and statistics on assigned systems.
- Research opportunities for data acquisition and new uses for existing data.
- Provide technical development expertise for designing, coding, testing, debugging, documenting and supporting data solutions.
- Experience building data pipelines to connect analytics stacks, client data visualization tools and external data sources.
- Experience with cloud and distributed systems principles
- Experience with Azure/AWS/GCP cloud infrastructure
- Experience with Databricks Clusters and Configuration
- Experience with Python, R, sh/bash and JVM-based languages including Scala and Java.
- Experience with Hadoop family languages including Pig and Hive.
2. Assemble large, complex data sets that meet business requirements
3. Identify, design, and implement internal process improvements
4. Optimize data delivery and re-design infrastructure for greater scalability
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies
6. Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
7. Work with internal and external stakeholders to assist with data-related technical issues and support data infrastructure needs
8. Create data tools for analytics and data scientist team members
1. Working knowledge of ETL on any cloud (Azure / AWS / GCP)
2. Proficient in Python (Programming / Scripting)
3. Good understanding of any of the data warehousing concepts (Snowflake / AWS Redshift / Azure Synapse Analytics / Google Big Query / Hive)
4. In-depth understanding of principles of database structure
5. Good understanding of any of the ETL technologies (Informatica PowerCenter / AWS Glue / Data Factory / SSIS / Spark / Matillion / Talend / Azure)
6. Proficient in SQL (query solving)
7. Knowledge of change management / version control (VSS / DevOps / TFS / GitHub / Bitbucket, CI/CD with Jenkins)
at Home Credit
Minimum 2 years of work experience on Snowflake and Azure storage.
Minimum 3 years of development experience with ETL tools.
Strong SQL skills in other databases such as Oracle, SQL Server, DB2 and Teradata.
Good to have Hadoop and Spark experience.
Good conceptual knowledge of data warehousing and its various methodologies.
Working knowledge of a scripting language such as UNIX shell.
Good presentation and communication skills.
Should be flexible with overlapping working hours.
Should be able to work independently and be proactive.
Good understanding of Agile development cycle.
- Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world.
- Verifying data quality, and/or ensuring it via data cleaning.
- Able to adapt and work quickly to produce ML-driven output that improves stakeholders' decision making.
- To design and develop Machine Learning systems and schemes.
- To perform statistical analysis and fine-tune models using test results.
- To train and retrain ML systems and models as and when necessary.
- To deploy ML models in production and manage the cost of cloud infrastructure.
- To develop Machine Learning apps according to client and data scientist requirements.
- To analyze the problem-solving capabilities and use-cases of ML algorithms and rank them by how successful they are in meeting the objective.
- Has worked on real-time problems and solved them using ML and deep learning models deployed in real time, with strong projects to showcase.
- Proficiency in Python and experience working with Jupyter, Google Colab and cloud-hosted notebooks such as AWS SageMaker and Databricks.
- Proficiency in working with scikit-learn, TensorFlow, OpenCV, PySpark, pandas, NumPy and related libraries.
- Expert in visualising and manipulating complex datasets.
- Proficiency in working with visualisation libraries such as seaborn, plotly, matplotlib etc.
- Proficiency in Linear Algebra, statistics and probability required for Machine Learning.
- Proficiency in ML-based algorithms, for example gradient boosting, stacked machine learning, classification algorithms and deep learning algorithms. Experience in hyperparameter tuning of various models and comparing the resulting algorithm performance is required.
- Big data technologies such as the Hadoop stack and Spark.
- Basic use of cloud services (VMs, for example EC2).
- Brownie points for Kubernetes and Task Queues.
- Strong written and verbal communication skills.
- Experience working in an Agile environment.