Must Have Skills:
About UAE Client
Similar jobs
Job Sector: IT, Software
Job Type: Permanent
Location: Chennai
Experience: 10 - 20 Years
Salary: 12 – 40 LPA
Education: Any Graduate
Notice Period: Immediate
Key Skills: Python, Spark, AWS, SQL, PySpark
Contact at triple eight two zero nine four two double seven
Job Description:
Requirements
- Minimum 12 years experience
- In depth understanding and knowledge on distributed computing with spark.
- Deep understanding of Spark Architecture and internals
- Proven experience in data ingestion, data integration and data analytics with spark, preferably PySpark.
- Expertise in ETL processes, data warehousing and data lakes.
- Hands on with python for Big data and analytics.
- Hands on in agile scrum model is an added advantage.
- Knowledge on CI/CD and orchestration tools is desirable.
- AWS S3, Redshift, Lambda knowledge is preferred
Must Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.
Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high quality and agile decision-making in our supply chain and logistics
workflows.
You will define the way we collect and operationalize data (structured / unstructured), and
build production pipelines for our machine learning models, and (RT, NRT, Batch) reporting &
dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use
your experience with modern cloud and data frameworks to build products (with storage and serving
systems)
that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making,
insights, anomaly detection and prediction.
What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correction and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Come up with logical and physical database design across platforms (MPP,
MR, Hive/PIG) which are optimal physical designs for different use cases (structured/semi
structured). Envision & implement the optimal data modelling, physical design,
performance optimization technique/approach required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.
Qualifications & Experience relevant for the role
• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across all the SDLC process
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud ( AWS).
• Characteristics of a forward thinker and self-starter that flourishes with new challenges
and adapts quickly to learning new knowledge
• Ability to work with a cross functional teams of consulting professionals across multiple
projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.
Python + Data scientist : |
• Build data-driven models to understand the characteristics of engineering systems |
• Train, tune, validate, and monitor predictive models |
• Sound knowledge on Statistics |
• Experience in developing data processing tasks using PySpark such as reading, merging, enrichment, loading of data from external systems to target data destinations |
• Working knowledge on Big Data or/and Hadoop environments |
• Experience creating CI/CD Pipelines using Jenkins or like tools |
• Practiced in eXtreme Programming (XP) disciplines |
Technical/Core skills
- Minimum 3 yrs of exp in Informatica Big data Developer(BDM) in Hadoop environment.
- Have knowledge of informatica Power exchange (PWX).
- Minimum 3 yrs of exp in big data querying tool like Hive and Impala.
- Ability to designing/development of complex mappings using informatica Big data Developer.
- Create and manage Informatica power exchange and CDC real time implementation
- Strong Unix knowledge skills for writing shell scripts and troubleshoot of existing scripts.
- Good knowledge of big data platforms and its framework.
- Good to have an experience in cloudera data platform (CDP)
- Experience with building stream processing systems using Kafka and spark
- Excellent SQL knowledge
Soft skills :
- Ability to work independently
- Strong analytical and problem solving skills
- Attitude of learning new technology
- Regular interaction with vendors, partners and stakeholders
lesser concentration on enforcing how to do a particular task, we believe in giving people the opportunity to think out of the box and come up with their own innovative solution to problem solving.
You will primarily be developing, managing and executing handling multiple prospect campaigns as part of Prospect Marketing Journey to ensure best conversion rates and retention rates. Below are the roles, responsibilities and skillsets we are looking for and if you feel these resonate with you, please get in touch with us by applying to this role.
Roles and Responsibilities:
• You'd be responsible for development and maintenance of applications with technologies involving Enterprise Java and Distributed technologies.
• You'd collaborate with developers, product manager, business analysts and business users in conceptualizing, estimating and developing new software applications and enhancements.
• You'd Assist in the definition, development, and documentation of software’s objectives, business requirements, deliverables, and specifications in collaboration with multiple cross-functional teams.
• Assist in the design and implementation process for new products, research and create POC for possible solutions.
Skillset:
• Bachelors or Masters Degree in a technology related field preferred.
• Overall experience of 2-3 years on the Big Data Technologies.
• Hands on experience with Spark (Java/ Scala)
• Hands on experience with Hive, Shell Scripting
• Knowledge on Hbase, Elastic Search
• Development experience In Java/ Python is preferred
• Familiar with profiling, code coverage, logging, common IDE’s and other
development tools.
• Demonstrated verbal and written communication skills, and ability to interface with Business, Analytics and IT organizations.
• Ability to work effectively in short-cycle, team oriented environment, managing multiple priorities and tasks.
• Ability to identify non-obvious solutions to complex problems
Key Responsibilities : ( Data Developer Python, Spark)
Exp : 2 to 9 Yrs
Development of data platforms, integration frameworks, processes, and code.
Develop and deliver APIs in Python or Scala for Business Intelligence applications build using a range of web languages
Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.
Elaborate stories in a collaborative agile environment (SCRUM or Kanban)
Familiarity with cloud platforms like GCP, AWS or Azure.
Experience with large data volumes.
Familiarity with writing rest-based services.
Experience with distributed processing and systems
Experience with Hadoop / Spark toolsets
Experience with relational database management systems (RDBMS)
Experience with Data Flow development
Knowledge of Agile and associated development techniques including:
n
Must Have Skills: