Principal Accountabilities:
1. Strong communication skills and the ability to convert business requirements into functional requirements.
2. Develop data-driven insights and machine learning models to identify and extract facts from sales, supply chain, and operational data.
3. Sound knowledge of and experience in statistical and data mining techniques: regression, random forests, boosting trees, time series forecasting, etc. (a minimal sketch follows this list).
4. Experience with SOTA deep learning techniques for solving NLP problems.
5. End-to-end data collection, model development and testing, and integration into production environments.
6. Build and prototype analysis pipelines iteratively to provide insights at scale.
7. Experience in querying different data sources.
8. Partner with developers and business teams on business-oriented decisions.
9. We are looking for someone who dares to move forward even when the path is not clear and who can be creative in overcoming challenges in the data.
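As a rough illustration of the techniques named in point 3, here is a minimal scikit-learn sketch comparing the listed model families on synthetic data; the dataset and scoring choices are placeholders, not from this posting.

```python
# Minimal sketch: compare regression, random forest, and boosting trees
# on synthetic data. In practice, replace make_regression with real
# sales/supply-chain features (any column names would be hypothetical).
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)

models = {
    "regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "boosting_trees": GradientBoostingRegressor(random_state=0),
}
for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean R^2 = {score:.3f}")
```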
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Data ingestion from different data sources that expose data through different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time series data from various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies.
- Data processing/transformation using various technologies such as Spark and cloud services. You will need to understand your part of the business logic and implement it using a language supported by the underlying data platform.
- Develop automated data quality checks to ensure the right data enters the platform and to verify the results of calculations (a minimal sketch follows this list).
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
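As a rough sketch of the automated data quality checks mentioned above, assuming a PySpark environment (the table and column names are hypothetical):

```python
# Minimal sketch: rule-based data quality checks before data enters the platform.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.createDataFrame(
    [(1, "2024-01-01", 120.0), (2, None, -5.0)],
    ["order_id", "order_date", "amount"],  # hypothetical schema
)

checks = {
    "no_null_dates": df.filter(F.col("order_date").isNull()).count() == 0,
    "non_negative_amounts": df.filter(F.col("amount") < 0).count() == 0,
    "unique_order_ids": df.count() == df.select("order_id").distinct().count(),
}
failed = [name for name, ok in checks.items() if not ok]
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")
```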
QUALIFICATIONS
- 5-7+ years' experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional products, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
- Data mining/programming tools (e.g., SAS, SQL, R, Python)
- Database technologies (e.g., PostgreSQL, Redshift, Snowflake, and Greenplum)
- Data visualization tools (e.g., Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting languages: Python and PySpark, with proven ability to process large datasets and create new data models (a minimal sketch follows this list).
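A minimal sketch of the Parquet-on-S3 pattern these requirements describe, using PySpark; the bucket, paths, and column name are placeholders, and S3 credentials are assumed to be configured.

```python
# Minimal sketch: read raw Parquet from a data lake landing zone, apply a
# simple transformation, and write back to a curated zone. Paths are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parquet-ingest").getOrCreate()

df = spark.read.parquet("s3a://example-bucket/landing/orders/")
df.filter("amount > 0").write.mode("overwrite").parquet(
    "s3a://example-bucket/curated/orders/"
)
```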
This is an exciting opportunity to join the rapidly growing blockchain community and work constructively with our in-house industry experts.
We expect you to have hands-on working experience with:
- Machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, decision forests, etc.
- The Python data science toolkit, including Scikit-Learn and Keras/PyTorch
- Prowess with both SQL and NoSQL databases
- Good applied statistics skills, such as distributions, statistical testing, regression, etc.
- Excellent scripting and programming skills
- Fundamental principles of CCAR, CECL, the Basel II Accord, or IFRS
- Docker and other PaaS (Platform-as-a-Service) products
- Must specialize in Natural Language Processing
Key Roles & Responsibilities
- Develop predictive models, statistical analyses, optimization procedures, monitoring processes, data quality analyses, and score implementations supporting impairment and collections.
- Produce robust documentation to ensure replicability of results and fulfill Antier's governance requirements.
- Superlative communication skills to represent us before stakeholders
- Leadership skills to build, expand and drive a new process
Academic & other Qualifications
- Postgraduate university degree in a quantitative discipline required (e.g., Statistics, Operations Research, Economics, Computer Science).
- An ability to produce reports and interrogate systems to produce analysis and resolve discrepancies/queries.
- Proficiency with analytical software (Python, SAS, or R) and SQL tools.
- Prior experience with banking products such as mutual funds will add value.
Technical Knowledge (Must Have)
- Strong experience in SQL / HiveQL / AWS Athena
- Strong expertise in the development of data pipelines (SnapLogic preferred)
- Design, development, deployment, and administration of data processing applications
- Good exposure to AWS and Azure cloud computing environments
- Knowledge of Big Data, AWS cloud architecture, best practices, security, governance, metadata management, data quality, etc.
- Data extraction from various firm sources (RDBMS, unstructured data sources) and loading into the data lake following best practices
- Knowledge of Python
- Good knowledge of NoSQL technologies (Neo4j/MongoDB)
- Experience/knowledge of SnapLogic (ETL technologies)
- Working knowledge of Unix (AIX, Linux) and shell scripting
- Experience/knowledge of data modeling and database development
- Experience/knowledge of creating reports and dashboards in Tableau/Power BI
Deep-Rooted.Co is on a mission to get fresh, clean, community (local farmer) produce from harvest to your home with a promise of quality first! Our values are rooted in trust, convenience, and dependability, with a bunch of learning and fun thrown in.
Founded out of Bangalore by Arvind, Avinash, Guru, and Santosh, we have raised $7.5 million to date in seed, Series A, and debt funding from investors including Accel, Omnivore, and Mayfield. Our brand Deep-Rooted.Co, launched in August 2020, was the first of its kind in India's fruits and vegetables (F&V) space. It is present in Bangalore and Hyderabad and on a journey of expansion to newer cities, which will be managed seamlessly through a tech platform designed and built to transform the agri-tech sector.
Deep-Rooted.Co is committed to building a diverse and inclusive workplace and is an equal-opportunity employer.
How is this possible? It's because we work with smart people. We are looking for engineers in Bangalore to work with the Product Leader (Founder) and CTO. This is a meaningful project for us, and we are sure you will love it, as it touches everyday life and is fun. This will be a virtual consultation.
We want to start the conversation about the project we have for you, but before that, we want to connect with you to know what’s on your mind. Do drop a note sharing your mobile number and letting us know when we can catch up.
Purpose of the role:
* As a startup, we have data distributed across various sources like Excel, Google Sheets, databases, etc. As we grow, we need swift decision-making based on all the data that exists. You will help us bring this data together and put it into a data model that can be used in business decision-making (a minimal ingestion sketch follows).
* Handle the nuances of the Excel and Google Sheets APIs.
* Pull data in and manage its growth, freshness, and correctness.
* Transform data into a format that aids easy decision-making for Product, Marketing, and Business Heads.
* Understand the business problem, solve it using technology, and take it to production - no hand-offs; the full path to production is yours.
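As a rough sketch of pulling Google Sheets data into a queryable form, assuming a gspread service account is configured; the sheet, file, and column names are hypothetical.

```python
# Minimal sketch: ingest a Google Sheet into pandas and normalize it so
# freshness and correctness checks can rely on a stable schema.
import gspread
import pandas as pd

gc = gspread.service_account(filename="service_account.json")  # assumed credentials file
worksheet = gc.open("daily_orders").sheet1  # hypothetical spreadsheet

df = pd.DataFrame(worksheet.get_all_records())
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
df = df.dropna(subset=["order_date"]).drop_duplicates(subset=["order_id"])
print(df.dtypes)
```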
Technical expertise:
* Good knowledge of and experience with programming languages: Java, SQL, Python.
* Good knowledge of data warehousing and data architecture.
* Experience with data transformations and ETL.
* Experience with API tools and more closed systems like Excel, Google Sheets, etc.
* Experience with the AWS Cloud Platform and Lambda.
* Experience with distributed data processing tools.
* Experience with container-based deployments on the cloud.
Skills:
Java, SQL, Python, Data Build Tool (dbt), Lambda, HTTP, REST API, Extract Transform Load (ETL).
Job Description:
We are looking for an exceptional Data Scientist Lead/Manager who is passionate about data and motivated to build large-scale machine learning solutions that make our data products shine. This person will contribute to the analysis of data for insight discovery and to the development of machine learning pipelines that support modeling terabytes of daily data for various use cases.
Location: Pune (initially remote due to COVID-19)
Note: Looking for someone who can start immediately or within a month. Hands-on experience in Python programming (minimum 5 years) is a must.
About the Organisation:
- We provide a dynamic, fun workplace filled with passionate individuals. We are at the cutting edge of advertising technology, and there is never a dull moment at work.
- We have a truly global footprint, with our headquarters in Singapore and offices in Australia, the United States, Germany, the United Kingdom, and India.
- You will gain work experience in a global environment. We speak over 20 different languages, come from more than 16 different nationalities, and over 42% of our staff are multilingual.
Qualifications:
• 8+ years of relevant working experience
• Master's or Bachelor's degree in computer science or engineering
• Working knowledge of Python and SQL
• Experience in time series data, data manipulation, analytics, and visualization
• Experience working with large-scale data
• Proficiency in various ML algorithms for supervised and unsupervised learning
• Experience working in Agile/Lean model
• Experience with Java and Golang is a plus
• Experience with BI toolkits such as Tableau, Superset, QuickSight, etc., is a plus
• Exposure to building large-scale ML models using one or more modern tools and libraries such as AWS SageMaker, Spark MLlib, Dask, TensorFlow, PyTorch, Keras, or the GCP ML stack
• Exposure to modern Big Data tech such as Cassandra/Scylla, Kafka, Ceph, Hadoop, Spark
• Exposure to IaaS platforms such as AWS, GCP, Azure
Typical persona: Data Science Manager/Architect
Experience: 8+ years of programming/engineering experience (with at least the last 4 years in data science at a product development company)
Type: Hands-on candidates only
Must:
a. Hands-on Python: pandas, scikit-learn
b. Working knowledge of Kafka (a minimal consumer sketch follows this list)
c. Able to carry out your own tasks and help the team resolve problems, logical or technical (25% of the job)
d. Good analytical and debugging skills
e. Strong communication skills
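As a rough illustration of point (b), a minimal consumer sketch using kafka-python (an assumed client choice; the topic and broker address are placeholders):

```python
# Minimal sketch: consume JSON events from a Kafka topic.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "events",                           # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.value)
```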
Desired (in order of priority)
a. Go (strong advantage)
b. Airflow (strong advantage)
c. Familiarity and working experience with more than one type of database: relational, object, columnar, graph, and other unstructured databases
d. Data structures, algorithms
e. Experience with multi-threading and thread synchronization concepts
f. AWS SageMaker
g. Keras
- Adept at machine learning techniques and algorithms: feature selection, dimensionality reduction, and building and optimizing classifiers using machine learning techniques
- Data mining using state-of-the-art methods
- Doing ad-hoc analysis and presenting results
- Proficiency in query languages such as N1QL, SQL
- Experience with data visualization tools, such as D3.js, ggplot, Plotly, PyPlot, etc.
- Creating automated anomaly detection systems and constantly tracking their performance (a minimal sketch follows this list)
- Strong in Python is a must
- Strong in data analysis and mining is a must
- Deep Learning, Neural Networks, CNNs, Image Processing (must)
- Building analytics systems: data collection, cleansing, and integration
- Experience with NoSQL databases, such as Couchbase, MongoDB, Cassandra, HBase
- Data pre-processing, data transformation, data analysis, and feature engineering
- Performance optimization of scripts (code) and productionizing of code (SQL, Pandas, Python or PySpark, etc.)
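As a rough sketch of an automated anomaly detection system with performance tracking, using scikit-learn's IsolationForest on synthetic data (the data, thresholds, and alerting are placeholders):

```python
# Minimal sketch: flag anomalies in incoming data and track the anomaly rate.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=(1000, 4))           # historical "normal" data
live = np.vstack([rng.normal(0, 1, size=(95, 4)),
                  rng.normal(6, 1, size=(5, 4))])  # batch with injected outliers

detector = IsolationForest(contamination=0.05, random_state=0).fit(train)
flags = detector.predict(live)                     # -1 = anomaly, 1 = normal
anomaly_rate = float(np.mean(flags == -1))

# Constant tracking: alert when the anomaly rate drifts outside the expected band.
if abs(anomaly_rate - 0.05) > 0.03:
    print(f"anomaly rate {anomaly_rate:.2%} outside expected band")
```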
Required skills:
- Bachelor's degree in Computer Science, Data Science, Computer Engineering, IT, or equivalent
- Fluency in Python (Pandas), PySpark, SQL, or similar
- Azure Data Factory experience (minimum 12 months)
- Able to write efficient code using traditional and OO concepts and modular programming, following the SDLC process
- Experience in production optimization and end-to-end performance tracing (technical root cause analysis)
- Ability to work independently, with demonstrated experience in project or program management
- Azure experience; the ability to translate data scientists' Python code and make it efficient for production cloud deployment
------------------------
Solve problems in the speech and NLP domains using advanced deep learning and machine learning techniques. A few examples of these problems are:
* Limited-resource speaker diarization on mono-channel recordings in noisy environments.
* Speech enhancement to improve the accuracy of downstream speech analytics tasks.
* Automated speech recognition for accent-heavy audio with a noisy background.
* Speech analytics tasks, including emotion, empathy, and keyword extraction.
* Text analytics tasks, including topic modeling, entity and intent extraction, opinion mining, text classification, and sentiment detection on multilingual data.
A typical day at work
-----------------------------
You will work closely with the product team to own a business problem. You will then model the business problem as a machine learning problem. Next, you will do a literature review to identify approaches to solving the problem, test these approaches, identify the best one, add your own insights to improve its performance, and ship it to production!
What should you know?
---------------------------------
* Solid understanding of Classical Machine Learning and Deep Learning concepts and algorithms.
* Experience with literature reviews, either in academia or industry.
* Proficiency in at least one programming language such as Python, C, C++, Java, etc.
* Proficiency in Machine Learning tools such as TensorFlow, Keras, Caffe, Torch/PyTorch or Theano.
* Advanced degree in Computer Science, Electrical Engineering, Machine Learning, Mathematics, Statistics, Physics, or Computational Linguistics
Why DeepAffects?
--------------------------
* You’ll learn insanely fast here.
* ESOPs and competitive compensation.
* Opportunity and encouragement to publish research at top conferences, with paid trips to attend workshops and conferences where you have published.
* Independent work, flexible timings and sense of ownership of your work.
* Mentorship from distinguished researchers and professors.