![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fscala.png&w=32&q=75)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
Your mission is to help lead team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission critical problems, internally and externally. Your quest to embracing leading-edge technologies and methodologies inspires your team to follow suit.
Responsibilities and Duties :
- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data like structured, semi-structured &
unstructured. Having big data knowledge specially in Spark & Hive is highly preferred.
- Work in team and provide proactive technical oversight, advice development teams fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions
Education level :
- Bachelor's degree in Computer Science or equivalent
Experience :
- Minimum 5+ years relevant experience working on production grade projects experience in hands on, end to end software development
- Expertise in application, data and infrastructure architecture disciplines
- Expert designing data integrations using ETL and other data integration patterns
- Advanced knowledge of architecture, design and business processes
Proficiency in :
- Modern programming languages like Java, Python, Scala
- Big Data technologies Hadoop, Spark, HIVE, Kafka
- Writing decently optimized SQL queries
- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)
- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions
- Knowledge of system development lifecycle methodologies, such as waterfall and AGILE.
- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional
modeling, and Meta data modeling practices.
- Experience generating physical data models and the associated DDL from logical data models.
- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping,
and data rationalization artifacts.
- Experience enforcing data modeling standards and procedures.
- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.
- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals
Skills :
Must Know :
- Core big-data concepts
- Spark - PySpark/Scala
- Data integration tool like Pentaho, Nifi, SSIS, etc (at least 1)
- Handling of various file formats
- Cloud platform - AWS/Azure/GCP
- Orchestration tool - Airflow
![companies logos](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fhiring_companies_logos-v2.webp&w=3840&q=80)
Similar jobs
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fdata_science.png&w=32&q=75)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fmachine_learning.png&w=32&q=75)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
Title:- Data Scientist
Experience:-6 years
Work Mode:- Onsite
Primary Skills:- Data Science, SQL, Python, Data Modelling, Azure, AWS, Banking Domain (BFSI/NBFC)
Qualification:- Any
Roles & Responsibilities:-
1. Acquiring, cleaning, and preprocessing raw data for analysis.
2. Utilizing statistical methods and tools for analyzing and interpreting complex datasets.
3. Developing and implementing machine learning models for predictive analysis.
4. Creating visualizations to effectively communicate insights to both technical and non-technical stakeholders.
5. Collaborating with cross-functional teams, including data engineers, business analysts, and domain experts.
6. Evaluating and optimizing the performance of machine learning models for accuracy and efficiency.
7. Identifying patterns and trends within data to inform business decision-making.
8. Staying updated on the latest advancements in data science, machine learning, and relevant technologies.
Requirement:-
1. Experience with modeling techniques such as Linear Regression, clustering, and classification techniques.
2. Must have a passion for data, structured or unstructured. 0.6 – 5 years of hands-on experience with Python and SQL is a must.
3. Should have sound experience in data mining, data analysis and machine learning techniques.
4. Excellent critical thinking, verbal and written communications skills.
5. Ability and desire to work in a proactive, highly engaging, high-pressure, client service environment.
6. Good presentation skills.
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fmachine_learning.png&w=32&q=75)
The role is with a Fintech Credit Card company based in Pune within the Decision Science team. (OneCard )
About
Credit cards haven't changed much for over half a century so our team of seasoned bankers, technologists, and designers set out to redefine the credit card for you - the consumer. The result is OneCard - a credit card reimagined for the mobile generation. OneCard is India's best metal credit card built with full-stack tech. It is backed by the principles of simplicity, transparency, and giving back control to the user.
The Engineering Challenge
“Re-imaging credit and payments from First Principles”
Payments is an interesting engineering challenge in itself with requirements of low latency, transactional guarantees, security, and high scalability. When we add credit and engagement into the mix, the challenge becomes even more interesting with underwriting and recommendation algorithms working on large data sets. We have eliminated the current call center, sales agent, and SMS-based processes with a mobile app that puts the customers in complete control. To stay agile, the entire stack is built on the cloud with modern technologies.
Purpose of Role :
- Develop and implement the collection analytics and strategy function for the credit cards. Use analysis and customer insights to develop optimum strategy.
CANDIDATE PROFILE :
- Successful candidates will have in-depth knowledge of statistical modelling/data analysis tools (Python, R etc.), techniques. They will be an adept communicator with good interpersonal skills to work with senior stake holders in India to grow revenue primarily through identifying / delivering / creating new, profitable analytics solutions.
We are looking for someone who:
- Proven track record in collection and risk analytics preferably in Indian BFSI industry. This is a must.
- Identify & deliver appropriate analytics solutions
- Experienced in Analytics team management
Essential Duties and Responsibilities :
- Responsible for delivering high quality analytical and value added services
- Responsible for automating insights and proactive actions on them to mitigate collection Risk.
- Work closely with the internal team members to deliver the solution
- Engage Business/Technical Consultants and delivery teams appropriately so that there is a shared understanding and agreement as to deliver proposed solution
- Use analysis and customer insights to develop value propositions for customers
- Maintain and enhance the suite of suitable analytics products.
- Actively seek to share knowledge within the team
- Share findings with peers from other teams and management where required
- Actively contribute to setting best practice processes.
Knowledge, Experience and Qualifications :
Knowledge :
- Good understanding of collection analytics preferably in Retail lending industry.
- Knowledge of statistical modelling/data analysis tools (Python, R etc.), techniques and market trends
- Knowledge of different modelling frameworks like Linear Regression, Logistic Regression, Multiple Regression, LOGIT, PROBIT, time- series modelling, CHAID, CART etc.
- Knowledge of Machine learning & AI algorithms such as Gradient Boost, KNN, etc.
- Understanding of decisioning and portfolio management in banking and financial services would be added advantage
- Understanding of credit bureau would be an added advantage
Experience :
- 4 to 8 years of work experience in core analytics function of a large bank / consulting firm.
- Experience on working on Collection analytics is must
- Experience on handling large data volumes using data analysis tools and generating good data insights
- Demonstrated ability to communicate ideas and analysis results effectively both verbally and in writing to technical and non-technical audiences
- Excellent communication, presentation and writing skills Strong interpersonal skills
- Motivated to meet and exceed stretch targets
- Ability to make the right judgments in the face of complexity and uncertainty
- Excellent relationship and networking skills across our different business and geographies
Qualifications :
- Masters degree in Statistics, Mathematics, Economics, Business Management or Engineering from a reputed college
Responsibilities:
- Should act as a technical resource for the Data Science team and be involved in creating and implementing current and future Analytics projects like data lake design, data warehouse design, etc.
- Analysis and design of ETL solutions to store/fetch data from multiple systems like Google Analytics, CleverTap, CRM systems etc.
- Developing and maintaining data pipelines for real time analytics as well as batch analytics use cases.
- Collaborate with data scientists and actively work in the feature engineering and data preparation phase of model building
- Collaborate with product development and dev ops teams in implementing the data collection and aggregation solutions
- Ensure quality and consistency of the data in Data warehouse and follow best data governance practices
- Analyse large amounts of information to discover trends and patterns
- Mine and analyse data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.\
Requirements
- Bachelor’s or Masters in a highly numerate discipline such as Engineering, Science and Economics
- 2-6 years of proven experience working as a Data Engineer preferably in ecommerce/web based or consumer technologies company
- Hands on experience of working with different big data tools like Hadoop, Spark , Flink, Kafka and so on
- Good understanding of AWS ecosystem for big data analytics
- Hands on experience in creating data pipelines either using tools or by independently writing scripts
- Hands on experience in scripting languages like Python, Scala, Unix Shell scripting and so on
- Strong problem solving skills with an emphasis on product development.
- Experience using business intelligence tools e.g. Tableau, Power BI would be an added advantage (not mandatory)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
GCP Data Analyst profile must have below skills sets :
- Knowledge of programming languages like https://apc01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.simplilearn.com%2Ftutorials%2Fsql-tutorial%2Fhow-to-become-sql-developer&data=05%7C01%7Ca_anjali%40hcl.com%7C4ae720b3f3cc45c3e04608da3346b335%7C189de737c93a4f5a8b686f4ca9941912%7C0%7C0%7C637878675987971859%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=EImfaJAD1KHOyrBQ7FkbaPl1STtfnf4QdQlbjw72%2BmE%3D&reserved=0" target="_blank">SQL, Oracle, R, MATLAB, Java and https://apc01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.simplilearn.com%2Fwhy-learn-python-a-guide-to-unlock-your-python-career-article&data=05%7C01%7Ca_anjali%40hcl.com%7C4ae720b3f3cc45c3e04608da3346b335%7C189de737c93a4f5a8b686f4ca9941912%7C0%7C0%7C637878675987971859%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Z2n1Xy%2F3YN6nQqSweU5T7EfUTa1kPAAjbCMTWxDCh%2FY%3D&reserved=0" target="_blank">Python
- Data cleansing, data visualization, data wrangling
- Data modeling , data warehouse concepts
- Adapt to Big data platform like Hadoop, Spark for stream & batch processing
- GCP (Cloud Dataproc, Cloud Dataflow, Cloud Datalab, Cloud Dataprep, BigQuery, Cloud Datastore, Cloud Datafusion, Auto ML etc)
Company Overview:
Rakuten, Inc. (TSE's first section: 4755) is the largest ecommerce company in Japan, and third largest eCommerce marketplace company worldwide. Rakuten provides a variety of consumer and business-focused services including e-commerce, e-reading, travel, banking, securities, credit card, e-money, portal and media, online marketing and professional sports. The company is expanding globally and currently has operations throughout Asia, Western Europe, and the Americas. Founded in 1997, Rakuten is headquartered in Tokyo, with over 17,000 employees and partner staff worldwide. Rakuten's 2018 revenues were 1101.48 billions yen. -In Japanese, Rakuten stands for ‘optimism.’ -It means we believe in the future. -It’s an understanding that, with the right mind-set, -we can make the future better by what we do today. Today, our 70+ businesses span e-commerce, digital content, communications and FinTech, bringing the joy of discovery to more than 1.2 billion members across the world.
Website : https://www.rakuten.com/">https://www.rakuten.com/
Crunchbase : https://www.crunchbase.com/organization/rakuten">Rakuten has raised a total of https://www.crunchbase.com/search/funding_rounds/field/organizations/funding_total/rakuten">$42.4M in funding over https://www.crunchbase.com/search/funding_rounds/field/organizations/num_funding_rounds/rakuten">2 rounds
Companysize : 10,001 + Employees
Founded : 1997
Headquarters : Tokyo, Japan
Work location : Bangalore (M.G.Road)
Please find below Job Description.
Role Description – Data Engineer for AN group (Location - India)
Key responsibilities include:
We are looking for engineering candidate in our Autonomous Networking Team. The ideal candidate must have following abilities –
- Hands- on experience in big data computation technologies (at least one and potentially several of the following: Spark and Spark Streaming, Hadoop, Storm, Kafka Streaming, Flink, etc)
- Familiar with other related big data technologies, such as big data storage technologies (e.g., Phoenix/HBase, Redshift, Presto/Athena, Hive, Spark SQL, BigTable, BigQuery, Clickhouse, etc), messaging layer (Kafka, Kinesis, etc), Cloud and container- based deployments (Docker, Kubernetes etc), Scala, Akka, SocketIO, ElasticSearch, RabbitMQ, Redis, Couchbase, JAVA, Go lang.
- Partner with product management and delivery teams to align and prioritize current and future new product development initiatives in support of our business objectives
- Work with cross functional engineering teams including QA, Platform Delivery and DevOps
- Evaluate current state solutions to identify areas to improve standards, simplify, and enhance functionality and/or transition to effective solutions to improve supportability and time to market
- Not afraid of refactoring existing system and guiding the team about same.
- Experience with Event driven Architecture, Complex Event Processing
- Extensive experience building and owning large- scale distributed backend systems.
• 2+ years of experience in data engineering & strong understanding of data engineering principles using big data technologies
• Excellent programming skills in Python is mandatory
• Expertise in relational databases (MSSQL/MySQL/Postgres) and expertise in SQL. Exposure to NoSQL such as Cassandra. MongoDB will be a plus.
• Exposure to deploying ETL pipelines such as AirFlow, Docker containers & Lambda functions
• Experience in AWS loud services such as AWS CLI, Glue, Kinesis etc
• Experience using Tableau for data visualization is a plus
• Ability to demonstrate a portfolio of projects (GitHub, papers, etc.) is a plus
• Motivated, can-do attitude and desire to make a change is a must
• Excellent communication skills
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fdata_science.png&w=32&q=75)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fr.png&w=32&q=75)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fmachine_learning.png&w=32&q=75)
- Extract and present valuable information from data
- Understand business requirements and generate insights
- Build mathematical models, validate and work with them
- Explain complex topics tailored to the audience
- Validate and follow up on results
- Work with large and complex data sets
- Establish priorities with clear goals and responsibilities to achieve a high level of performance.
- Work in an agile and iterative manner on solving problems
- Evaluate different options proactively and the ability to solve problems in an innovative way. Develop new solutions or combine existing methods to create new approaches.
- Good understanding of Digital & analytics
- Strong communication skills, orally and in writing
Job Overview:
As a Data Scientist, you will work in collaboration with our business and engineering people, on creating value from data. Often the work requires solving complex problems by turning vast amounts of data into business insights through advanced analytics, modeling, and machine learning. You have a strong foundation in analytics, mathematical modeling, computer science, and math - coupled with a strong business sense. You proactively fetch information from various sources and analyze it for better understanding of how the business performs. Furthermore, you model and build AI tools that automate certain processes within the company. The solutions produced will be implemented to impact business results.
Primary Responsibilities:
- Develop an understanding of business obstacles, create solutions based on advanced analytics and draw implications for model development
- Combine, explore, and draw insights from data. Often large and complex data assets from different parts of the business.
- Design and build explorative, predictive- or prescriptive models, utilizing optimization, simulation, and machine learning techniques
- Prototype and pilot new solutions and be a part of the aim of ‘productizing’ those valuable solutions that can have an impact at a global scale
- Guides and coaches other chapter colleagues to help solve data/technical problems at an operational level, and in methodologies to help improve development processes
- Identifies and interprets trends and patterns in complex data sets to enable the business to make data-driven decisions
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
- Hands-on experience in Development
- 4-6 years of Hands on experience with Python scripts
- 2-3 years of Hands on experience in PySpark coding. Worked in spark cluster computing technology.
- 3-4 years of Hands on end to end data pipeline experience working on AWS environments
- 3-4 years of Hands on experience working on AWS services – Glue, Lambda, Step Functions, EC2, RDS, SES, SNS, DMS, CloudWatch etc.
- 2-3 years of Hands on experience working on AWS redshift
- 6+ years of Hands on experience with writing Unix Shell scripts
- Good communication skills
Understand various raw data input formats, build consumers on Kafka/ksqldb for them and ingest large amounts of raw data into Flink and Spark.
Conduct complex data analysis and report on results.
Build various aggregation streams for data and convert raw data into various logical processing streams.
Build algorithms to integrate multiple sources of data and create a unified data model from all the sources.
Build a unified data model on both SQL and NO-SQL databases to act as data sink.
Communicate the designs effectively with the fullstack engineering team for development.
Explore machine learning models that can be fitted on top of the data pipelines.
Mandatory Qualifications Skills:
Deep knowledge of Scala and Java programming languages is mandatory
Strong background in streaming data frameworks (Apache Flink, Apache Spark) is mandatory
Good understanding and hands on skills on streaming messaging platforms such as Kafka
Familiarity with R, C and Python is an asset
Analytical mind and business acumen with strong math skills (e.g. statistics, algebra)
Problem-solving aptitude
Excellent communication and presentation skills
![skill icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fskill_icons%2Fpython.png&w=32&q=75)
- Must have 5-8 years of experience in handling data
- Must have the ability to interpret large amounts of data and to multi-task
- Must have strong knowledge of and experience with programming (Python), Linux/Bash scripting, databases(SQL, etc)
- Must have strong analytical and critical thinking to resolve business problems using data and tech
- Must have domain familiarity and interest of – Cloud technologies (GCP/Azure Microsoft/ AWS Amazon), open-source technologies, Enterprise technologies
- Must have the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy.
- Must have good communication skills
- Working knowledge/exposure to ElasticSearch, PostgreSQL, Athena, PrestoDB, Jupyter Notebook
![icon](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fsearch.png&w=48&q=75)
![companies logos](/_next/image?url=https%3A%2F%2Fcdn.cutshort.io%2Fpublic%2Fimages%2Fhiring_companies_logos-v2.webp&w=3840&q=80)