AWS Data Engineer
Job Description
- 3+ years of experience in AWS data engineering.
- Design and build ETL pipelines and data lakes to automate ingestion of structured and unstructured data.
- Experience working with AWS big data technologies (Redshift, S3, AWS Glue, Kinesis, Athena, DMS, EMR, and Lambda for serverless ETL).
- Knowledge of SQL and NoSQL databases.
- Experience building both batch and real-time pipelines.
- Excellent programming and debugging skills in Scala or Python, plus Spark.
- Good experience in data lake formation, Apache Spark, and Python, with hands-on experience deploying models.
- Must have experience with production migration processes.
- Nice to have: experience with Power BI visualization tools and connectivity.
Roles & Responsibilities:
-
Design, build and operationalize large scale enterprise data solutions and applications
-
Analyze, re-architect and re-platform on premise data warehouses to data platforms on AWS cloud.
-
Design and build production data pipelines from ingestion to consumption within AWS big data architecture, using Python, or Scala.
-
Perform detail assessments of current state data platforms and create an appropriate transition path to AWS cloud.
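As a rough illustration of the kind of pipeline described above, here is a minimal PySpark sketch that ingests raw CSV files from S3 and writes them to a partitioned Parquet data lake. The bucket names, paths, and column names are hypothetical, not part of the role description.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical buckets/paths and columns, for illustration only.
RAW_PATH = "s3://example-raw-bucket/orders/"
LAKE_PATH = "s3://example-datalake-bucket/curated/orders/"

spark = SparkSession.builder.appName("orders-ingestion").getOrCreate()

# Ingest raw CSV files landed by an upstream source.
raw = spark.read.option("header", True).csv(RAW_PATH)

# Light cleansing and typing before the data lands in the curated zone.
curated = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
)

# Partitioned Parquet output that Athena or Redshift Spectrum can query.
curated.write.mode("overwrite").partitionBy("order_date").parquet(LAKE_PATH)

spark.stop()
```

In practice the same script could run as an AWS Glue or EMR job; the point is simply to show the ingestion-to-curation pattern the bullets above refer to.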
Similar jobs
● Proficiency in Linux.
● Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
● Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena.
● Must have experience with Python/Scala.
● Must have experience with Big Data technologies like Apache Spark.
● Must have experience with Apache Airflow.
● Experience with data pipelines and ETL tools like AWS Glue (see the sketch after this list).
● Able to contribute to gathering functional requirements, developing technical specifications, and project and test planning.
● Demonstrating technical expertise and solving challenging programming and design problems.
● Roughly 80% hands-on coding.
● Generate technical documentation and PowerPoint presentations to communicate architectural and design options, and educate development teams and business users.
● Resolve defects/bugs during QA testing, pre-production, production, and post-release patches.
● Work cross-functionally with various Bidgely teams, including product management, QA/QE, various product lines, and/or business units to drive results forward.
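To make the Airflow and AWS Glue expectations above concrete, here is a hedged, minimal sketch of a daily Airflow DAG that triggers a Glue job through boto3. The DAG id and Glue job name are placeholders, not real pipelines.

```python
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical Glue job name, for illustration only.
GLUE_JOB_NAME = "example-orders-etl"


def trigger_glue_job():
    """Kick off a Glue job run and return its run id."""
    client = boto3.client("glue")
    response = client.start_job_run(JobName=GLUE_JOB_NAME)
    return response["JobRunId"]


with DAG(
    dag_id="orders_etl_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_glue_job = PythonOperator(
        task_id="run_glue_job",
        python_callable=trigger_glue_job,
    )
```

A production DAG would typically add sensors, retries, and downstream data-quality checks; this only sketches the orchestration pattern the bullets above mention.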
Requirements
● BS/MS in Computer Science or equivalent work experience
● 2–4 years of experience designing and developing applications in Data Engineering
● Hands-on experience with Big Data ecosystems: Hadoop, HDFS, MapReduce, YARN, AWS Cloud, EMR, S3, Spark, Cassandra, Kafka, ZooKeeper
● Expertise with any of the following object-oriented languages (OOD): Java/J2EE, Scala, Python
● Strong leadership experience: leading meetings, presenting if required
● Excellent communication skills: demonstrated ability to explain complex technical issues to both technical and non-technical audiences
● Expertise in the software design/architecture process
● Expertise with unit testing and Test-Driven Development (TDD)
● Experience with cloud platforms, preferably AWS
● A good understanding of, and the ability to develop, software, prototypes, or proofs of concept (POCs) for various Data Engineering requirements
A candidate with 2 to 6 years of relevant experience in the Snowflake Data Cloud.
Mandatory Skills:
• Excellent knowledge of the Snowflake Data Cloud (see the sketch after this section)
• Excellent knowledge of SQL
• Good knowledge of ETL
• Must have working knowledge of Azure Data Factory
• Must have general awareness of Azure Cloud
• Must be aware of optimization techniques for data retrieval and data loading
• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project objectives, scope, schedule, and delivery milestones
o Lead and participate across all phases of software engineering, from requirements gathering to go-live
o Lead internal team meetings on solution architecture, effort estimation, manpower planning, and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct Scrum ceremonies such as Sprint Planning, Daily Standup, and Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and Scrum principles
o Hold the team accountable for their tasks and help them achieve them
o Identify requirements and put together a skill-development plan for all team members
• Communication
o Be the single point of contact for the client for day-to-day communication
o Periodically communicate project status to all stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team
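As one hedged illustration of the Snowflake and SQL skills listed above, the snippet below uses the snowflake-connector-python package to bulk-load a staged file and run a validation query. The account, credentials, warehouse, stage, and table names are all placeholders.

```python
import snowflake.connector

# All connection details and object names below are placeholders.
conn = snowflake.connector.connect(
    account="example_account",
    user="example_user",
    password="example_password",
    warehouse="ANALYTICS_WH",
    database="SALES_DB",
    schema="PUBLIC",
)

try:
    cur = conn.cursor()
    # Bulk-load staged files into a table (a typical ELT pattern in Snowflake).
    cur.execute("""
        COPY INTO orders
        FROM @orders_stage
        FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
    """)
    # Simple aggregate to validate the load.
    cur.execute("""
        SELECT order_date, SUM(amount) AS total_amount
        FROM orders
        GROUP BY order_date
        ORDER BY order_date
    """)
    for order_date, total_amount in cur.fetchall():
        print(order_date, total_amount)
finally:
    conn.close()
```

In an Azure Data Factory setup, the same COPY INTO step would usually be orchestrated from a pipeline activity rather than ad-hoc Python; the sketch only shows the underlying SQL pattern.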
Must have:
• Minimum 8 years of experience (hands-on as well as leadership) in software/data engineering across multiple job functions such as Business Analysis, Development, Solutioning, QA, DevOps, and Project Management
• Hands-on as well as leadership experience in Big Data engineering projects
• Experience developing or managing cloud solutions using Azure or another cloud provider
• Demonstrable knowledge of Hadoop, Hive, Spark, NoSQL databases, SQL, data warehousing, ETL/ELT, and DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems-level critical thinking skills
• Strong collaboration and influencing skills
Good to have:
• Knowledge of PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL Pool, Databricks, Power BI, Machine Learning, and cloud infrastructure
• Background in BFSI with a focus on core banking
• Willingness to travel
Work Environment
• Customer Office (Mumbai) / Remote Work
Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
As an AWS Data Engineer, you are a full-stack data engineer who loves solving business problems. You work with business leads, analysts, and data scientists to understand the business domain, and you engage with fellow engineers to build data products that empower better decision-making. You are passionate about the data quality of our business metrics and about building flexible solutions that scale to answer broader business questions.
If you love to solve problems using your skills, then come join the Team Mactores. We have a casual and fun office environment that actively steers clear of rigid "corporate" culture, focuses on productivity and creativity, and allows you to be part of a world-class team while still being yourself.
What will you do?
- Write efficient code in PySpark and AWS Glue
- Write SQL queries in Amazon Athena and Amazon Redshift (see the sketch after this list)
- Explore new technologies and learn new techniques to solve business problems creatively
- Collaborate with engineering and business teams to build better data products and services
- Deliver projects collaboratively with the team and provide timely updates to customers
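As one hedged example of the Athena SQL work mentioned above, the snippet below submits a query through boto3 and polls until it finishes. The database, table, and results bucket are made up for illustration.

```python
import time

import boto3

athena = boto3.client("athena")

# Hypothetical database/table and results bucket.
QUERY = "SELECT order_date, SUM(amount) AS total FROM analytics.orders GROUP BY order_date"
OUTPUT = "s3://example-athena-results/"

# Submit the query; Athena runs it asynchronously.
execution = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": OUTPUT},
)
query_id = execution["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

print(state, status["QueryExecution"]["ResultConfiguration"]["OutputLocation"])
```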
What are we looking for?
- 1 to 3 years of experience in Apache Spark, PySpark, and AWS Glue
- 2+ years of experience writing ETL jobs using PySpark and Spark SQL
- 2+ years of experience with SQL queries and stored procedures
- A deep understanding of the DataFrame API and the transformation functions supported by Spark 2.7+ (see the sketch below)
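For context on the DataFrame API expectations listed above, here is a short, illustrative PySpark transformation chain (filter, derived column, aggregation, window). The column names and sample rows are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("dataframe-api-demo").getOrCreate()

# Hypothetical input; in practice this would come from S3 or the Glue Catalog.
orders = spark.createDataFrame(
    [("o1", "c1", "2024-01-01", 120.0),
     ("o2", "c1", "2024-01-02", 80.0),
     ("o3", "c2", "2024-01-02", 200.0)],
    ["order_id", "customer_id", "order_date", "amount"],
)

# Common transformation functions: filter, withColumn, groupBy/agg.
daily = (
    orders.filter(F.col("amount") > 0)
          .withColumn("order_date", F.to_date("order_date"))
          .groupBy("customer_id", "order_date")
          .agg(F.sum("amount").alias("daily_amount"))
)

# Window function: running total per customer ordered by date.
running = daily.withColumn(
    "running_total",
    F.sum("daily_amount").over(
        Window.partitionBy("customer_id").orderBy("order_date")
    ),
)

running.show()
```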
You will be preferred if you have:
- Prior experience working with AWS EMR and Apache Airflow
- Certifications: AWS Certified Big Data – Specialty, Cloudera Certified Big Data Engineer, or Hortonworks Certified Big Data Engineer
- An understanding of DataOps engineering
Life at Mactores
We care about creating a culture that makes a real difference in the lives of every Mactorian. Our 10 Core Leadership Principles that honor Decision-making, Leadership, Collaboration, and Curiosity drive how we work.
1. Be one step ahead
2. Deliver the best
3. Be bold
4. Pay attention to the detail
5. Enjoy the challenge
6. Be curious and take action
7. Take leadership
8. Own it
9. Deliver value
10. Be collaborative
We would like you to read more about our work culture at https://mactores.com/careers
The Path to Joining the Mactores Team
At Mactores, our recruitment process is structured around three distinct stages:
Pre-Employment Assessment:
You will be invited to participate in a series of pre-employment evaluations to assess your technical proficiency and suitability for the role.
Managerial Interview: The hiring manager will engage with you in multiple discussions, lasting anywhere from 30 minutes to an hour, to assess your technical skills, hands-on experience, leadership potential, and communication abilities.
HR Discussion: During this 30-minute session, you'll have the opportunity to discuss the offer and next steps with a member of the HR team.
At Mactores, we are committed to providing equal opportunities in all of our employment practices, and we do not discriminate based on race, religion, gender, national origin, age, disability, marital status, military status, genetic information, or any other category protected by federal, state, and local laws. This policy extends to all aspects of the employment relationship, including recruitment, compensation, promotions, transfers, disciplinary action, layoff, training, and social and recreational programs. All employment decisions will be made in compliance with these principles.
WHAT YOU WILL DO:
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional/non-functional business requirements.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Spark, Hadoop, and AWS 'big data' technologies (EC2, EMR, S3, Athena).
● Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
● Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
● Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
● Work with data and analytics experts to strive for greater functionality in our data systems.
REQUIRED SKILLS & QUALIFICATIONS:
● 5+ years of experience in a Data Engineer role.
● Advanced working SQL knowledge and experience working with relational databases and query authoring (SQL), as well as working familiarity with a variety of databases.
● Experience building and optimizing 'big data' data pipelines, architectures, and data sets.
● Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
● Strong analytic skills related to working with unstructured datasets.
● Build processes supporting data transformation, data structures, metadata, dependency, and workload management.
● A successful history of manipulating, processing, and extracting value from large disconnected datasets.
● Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores (see the streaming sketch after this list).
● Strong project management and organizational skills.
● Experience supporting and working with cross-functional teams in a dynamic environment.
● Experience with big data tools: Hadoop, Spark, Pig, Vertica, etc.
● Experience with AWS cloud services: EC2, EMR, S3, Athena.
● Experience with Linux.
● Experience with object-oriented/object function scripting languages: Python, Java, Shell, Scala, etc.
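As a hedged sketch of the message-queuing and stream-processing knowledge listed above, the snippet below reads a hypothetical Kafka topic with Spark Structured Streaming and appends micro-batches to a data lake path. Broker addresses, the topic, the schema, and the paths are placeholders, and the spark-sql-kafka connector package must be on the classpath.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Placeholder brokers/topic and a hypothetical event schema.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read the Kafka topic as a streaming DataFrame.
raw = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "events")
         .load()
)

# Kafka values arrive as bytes; parse the JSON payload into columns.
events = raw.select(
    F.from_json(F.col("value").cast("string"), event_schema).alias("e")
).select("e.*")

# Append micro-batches to a partition-friendly Parquet location.
query = (
    events.writeStream.format("parquet")
          .option("path", "s3://example-datalake/events/")
          .option("checkpointLocation", "s3://example-datalake/checkpoints/events/")
          .outputMode("append")
          .start()
)

query.awaitTermination()
```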
PREFERRED SKILLS & QUALIFICATIONS:
● Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
- Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs/IoT Hub.
- Experience migrating on-premise data warehouses to data platforms on the Azure cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Azure Synapse or Azure SQL Data Warehouse
- Spark on Azure, available in HDInsight and Databricks (see the sketch after this list)
- Experience with Azure Analysis Services
- Experience in Power BI
- Experience with third-party solutions like Attunity/StreamSets and Informatica
- Experience with pre-sales activities (responding to RFPs, executing quick POCs)
- Capacity planning and performance tuning on Azure Stack and Spark.
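To illustrate the Azure-side stack named above, here is a minimal, assumption-laden PySpark sketch, as it might run on Azure Databricks or HDInsight, that reads raw JSON from an Azure Data Lake Storage Gen2 path and writes a curated, partitioned Delta output. The storage account, containers, and column names are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("azure-curation").getOrCreate()

# Placeholder ADLS Gen2 locations (abfss://<container>@<account>.dfs.core.windows.net/...).
RAW_PATH = "abfss://raw@examplestorage.dfs.core.windows.net/transactions/"
CURATED_PATH = "abfss://curated@examplestorage.dfs.core.windows.net/transactions/"

# Ingest raw JSON landed by Azure Data Factory or Event Hubs capture.
raw = spark.read.json(RAW_PATH)

# Basic typing and de-duplication before writing to the curated zone.
curated = (
    raw.withColumn("txn_ts", F.to_timestamp("txn_ts"))
       .withColumn("txn_date", F.to_date("txn_ts"))
       .dropDuplicates(["txn_id"])
)

# Delta is the default table format on Databricks; plain Parquet works elsewhere.
curated.write.format("delta").mode("overwrite").partitionBy("txn_date").save(CURATED_PATH)
```

Loading the curated data onward into Synapse or Power BI would typically be handled by a separate pipeline step; this only shows the ingestion-and-transformation portion mentioned in the list.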
Recko Inc. is looking for data engineers to join our kick-ass engineering team. We are looking for smart, dynamic individuals to connect all the pieces of the data ecosystem.
What are we looking for:
- 3+ years of development experience in at least one of MySQL, Oracle, PostgreSQL, or MSSQL, and experience working with Big Data frameworks, platforms, and data stores such as Hadoop, HDFS, Spark, Oozie, Hue, EMR, Scala, Hive, Glue, Kerberos, etc.
- Strong experience setting up data warehouses, data modeling, data wrangling, and dataflow architecture on the cloud
- 2+ years of experience with public cloud services such as AWS, Azure, or GCP, and languages like Java/Python, etc.
- 2+ years of development experience in Amazon Redshift, Google BigQuery, or Azure data warehouse platforms preferred
- Knowledge of statistical analysis tools like R, SAS, etc.
- Familiarity with any data visualization software
- A growth mindset and a passion for building things from the ground up; most importantly, you should be fun to work with
As a data engineer at Recko, you will:
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional/non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS 'big data' technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
About Recko:
Recko was founded in 2017 to organise the world's transactional information and provide intelligent applications to finance and product teams to make sense of the vast amount of data available. With the proliferation of digital transactions over the past two decades, enterprises, banks, and financial institutions are finding it difficult to keep track of the money flowing across their systems. With the Recko Platform, businesses can build, integrate, and adapt innovative and complex financial use cases within the organization and across external payment ecosystems with agility, confidence, and at scale. Today, customer-obsessed brands such as Deliveroo, Meesho, Grofers, Dunzo, Acommerce, etc. use Recko so their finance teams can optimize resources with automation and prioritize growth over repetitive and time-consuming day-to-day operational tasks.
Recko is a Series A funded startup, backed by marquee investors like Vertex Ventures, Prime Venture Partners, and Locus Ventures. Traditionally, enterprise software has been built around functionality. We believe software is an extension of one's capability, and it should be delightful and fun to use.
Working at Recko:
We believe that great companies are built by amazing people. At Recko, we are a group of young engineers, product managers, analysts, and business folks on a mission to bring consumer-tech DNA to enterprise fintech applications. The current team at Recko is 60+ members strong, with stellar experience across fintech, e-commerce, and digital domains at companies like Flipkart, PhonePe, Ola Money, Belong, Razorpay, Grofers, Jio, Oracle, etc. We are growing aggressively across verticals.
- Total experience of 7–10 years; should be interested in teaching and research
- 3+ years of experience in data engineering, including data ingestion, preparation, provisioning, automated testing, and quality checks
- 3+ years of hands-on experience with Big Data cloud platforms like AWS and GCP, data lakes, and data warehouses
- 3+ years with Big Data and analytics technologies; experience in SQL and in writing code for the Spark engine using Python, Scala, or Java; experience in Spark and Scala
- Experience in designing, building, and maintaining ETL systems
- Experience with data pipeline and workflow management tools like Airflow
- Application development background, along with knowledge of analytics libraries, open-source Natural Language Processing, and statistical and big data computing libraries
- Familiarity with visualization and reporting tools like Tableau and Kibana
- Should be good at storytelling around technology
Qualification: B.Tech / BE / M.Sc / MBA / B.Sc; certifications in Big Data technologies and cloud platforms like AWS, Azure, and GCP are preferred
Primary Skills: Big Data + Python + Spark + Hive + Cloud Computing
Secondary Skills: NoSQL + SQL + ETL + Scala + Tableau
Selection Process: 1 hackathon, 1 technical round, and 1 HR round
Benefit: Free training on Data Science from top-notch professors