Principal Accountabilities:
1. Strong communication skills and the ability to convert business requirements into functional requirements
2. Develop data-driven insights and machine learning models to identify and extract facts from sales, supply chain and operational data
3. Sound knowledge of and experience with statistical and data mining techniques: Regression, Random Forest, Boosted Trees, Time Series Forecasting, etc.
5. Experience in SOTA Deep Learning techniques to solve NLP problems.
6. End-to-end data collection, model development and testing, and integration into production environments.
7. Build and prototype analysis pipelines iteratively to provide insights at scale.
8. Experience in querying different data sources
9. Partner with developers and business teams on business-oriented decisions
10. We are looking for someone who dares to move forward even when the path is not clear and who is creative in overcoming challenges in the data.
About Matellio India Private Limited
Similar jobs
- Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse.
- Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs.
- Assemble large, complex data sets from third-party vendors to meet business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology.
- Streamline existing reporting and analysis solutions, and introduce enhanced ones that leverage complex data sources derived from multiple internal systems.
Requirements
- 5+ years of experience in a Data Engineer role.
- Proficiency in Linux.
- Must have SQL knowledge and experience working with relational databases and query authoring, as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena.
- Must have experience with Python/Scala.
- Must have experience with Big Data technologies like Apache Spark.
- Must have experience with Apache Airflow.
- Experience with data pipeline and ETL tools like AWS Glue.
- Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
Work closely with the Kinara management team to investigate strategically important business questions.
Lead a team through the entire analytical and machine learning model life cycle:
- Define the problem statement
- Build and clean datasets
- Exploratory data analysis
- Feature engineering
- Apply ML algorithms and assess the performance
- Code for deployment
- Code testing and troubleshooting
- Communicate analysis to stakeholders
Manage Data Analysts and Data Scientists
Experience with pricing models will be a definite plus
Senior Data Scientist: Job Description
The Senior Data Scientist is a creative problem solver who uses statistical and mathematical principles and modelling skills to uncover new insights that will significantly and meaningfully impact business decisions and actions. She/he applies data science expertise to identify, define, and execute state-of-the-art techniques for academic opportunities and business objectives in collaboration with other Analytics team members. The Senior Data Scientist will execute analyses and outputs spanning test design and measurement, predictive analytics, multivariate analysis, data/text mining, pattern recognition, artificial intelligence, and machine learning.
Key Responsibilities:
- Perform the full range of data science activities, including test design and measurement, predictive/advanced analytics, data mining, and analytic dashboards.
- Extract, manipulate, analyse, and interpret data from various corporate data sources to develop advanced analytic solutions, derive key observations, findings, and insights, and formulate actionable recommendations.
- Generate clearly understood and intuitive data science / advanced analytics outputs.
- Provide thought leadership and recommendations on business process improvement and analytic solutions to complex problems.
- Participate in best-practice sharing and communication platforms for the advancement of the data science discipline.
- Coach and collaborate with other data scientists and data analysts.
- Present impact, insights, outcomes & recommendations to key business partners and stakeholders.
- Comply with established Service Level Agreements to ensure timely, high quality deliverables with value-add recommendations, clearly articulated key findings and observations.
Qualification:
- Bachelor's Degree (B.A./B.S.) or Master’s Degree (M.A./M.S.) in Computer Science, Statistics, Mathematics, Machine Learning, Physics, or similar degree
- 5+ years of experience in data science in a digitally advanced industry focusing on strategic initiatives, marketing and/or operations.
- Advanced knowledge of best-in-class analytic software tools and languages: Python, SQL, R, SAS, Tableau, Excel, PowerPoint.
- Expertise in statistical methods, statistical analysis, data visualization, and data mining techniques.
- Experience in test design, Design of Experiments, A/B testing, and measurement science.
- Strong influencing skills to drive a robust testing agenda and data-driven decision making for process improvements.
- Strong critical-thinking skills to track down complex data and engineering issues, evaluate different algorithmic approaches, and analyse data to solve problems.
- Experience in partnering with IT, marketing operations & business operations to deploy predictive analytic solutions.
- Ability to translate and communicate complex analytical, statistical, and mathematical concepts to non-technical audiences.
- Strong written and verbal communication skills, as well as presentation skills.
- Minimum 1 year of relevant experience in PySpark (mandatory).
- Hands-on experience developing, testing, deploying, maintaining, and improving data integration pipelines in an AWS cloud environment is an added plus.
- Ability to play a lead role and independently manage a PySpark development team of 3-5 members.
- EMR, Python, and PySpark are mandatory.
- Knowledge of and experience working with AWS cloud technologies and related tools such as Apache Spark, Glue, Kafka, Kinesis, and Lambda, along with S3, Redshift, and RDS.
This position is not for freshers. We are looking for candidates with at least 4 years of AI/ML/CV experience in the industry.
We’re looking to hire someone to help scale Machine Learning and NLP efforts at Episource. You’ll work with the team that develops the models powering Episource’s product focused on NLP-driven medical coding. Some of the problems include improving our ICD code recommendations, clinical named entity recognition, and information extraction from clinical notes.
This is a role for highly technical machine learning and data engineers who combine outstanding oral and written communication skills with the ability to code up prototypes and productionize them using a large range of tools, algorithms, and languages. Most importantly, they need to be able to autonomously plan and organize their work assignments based on high-level team goals.
You will be responsible for setting an agenda to develop and ship machine learning models that positively impact the business, working with partners across the company including operations and engineering. You will use research results to shape strategy for the company, and help build a foundation of tools and practices used by quantitative staff across the company.
What you will achieve:
- Define the research vision for data science, and oversee planning, staffing, and prioritization to make sure the team is advancing that roadmap
- Invest in your team’s skills, tools, and processes to improve their velocity, including working with engineering counterparts to shape the roadmap for machine learning needs
- Hire, retain, and develop talented and diverse staff through ownership of our data science hiring processes, brand, and functional leadership of data scientists
- Evangelise machine learning and AI internally and externally, including attending conferences and being a thought leader in the space
- Partner with the executive team and other business leaders to deliver cross-functional research work and models
Required Skills:
- A strong background in classical machine learning and machine learning deployments is a must, preferably with 4-8 years of experience
- Knowledge of deep learning & NLP
- Hands-on experience in TensorFlow/PyTorch, Scikit-Learn, Python, Apache Spark & Big Data platforms to manipulate large-scale structured and unstructured datasets.
- Experience with GPU computing is a plus.
- Professional experience as a data science leader, setting the vision for how to most effectively use data in your organization. This could be through technical leadership with ownership over a research agenda, or developing a team as a personnel manager in a new area at a larger company.
- Expert-level experience with a wide range of quantitative methods that can be applied to business problems.
- Evidence you’ve successfully been able to scope, deliver and sell your own research in a way that shifts the agenda of a large organization.
- Excellent written and verbal communication skills on quantitative topics for a variety of audiences: product managers, designers, engineers, and business leaders.
- Fluent in data fundamentals: SQL, data manipulation using a procedural language, statistics, experimentation, and modeling
Qualifications
- Professional experience as a data science leader, setting the vision for how to most effectively use data in your organization
- Expert-level experience with machine learning that can be applied to business problems
- Evidence you’ve successfully been able to scope, deliver and sell your own work in a way that shifts the agenda of a large organization
- Fluent in data fundamentals: SQL, data manipulation using a procedural language, statistics, experimentation, and modeling
- Degree in a field that has very applicable use of data science / statistics techniques (e.g. statistics, applied math, computer science, OR a science field with direct statistics application)
- 5+ years of industry experience in data science and machine learning, preferably at a software product company
- 3+ years of experience managing data science teams, incl. managing/grooming managers beneath you
- 3+ years of experience partnering with executive staff on data topics
DataWeave provides Retailers and Brands with “Competitive Intelligence as a Service” that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.
Data Science@DataWeave
We, the Data Science team at DataWeave (called Semantics internally), build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are innovation, business awareness, long-term thinking, and pushing the envelope. We are a fast-paced lab within the organization, applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.
How do we work?
It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems that there are. We are in the business of making sense of messy public data on the web. At serious scale!
What do we offer?
- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- Ability to see the impact of your work and the value you're adding to our customers almost immediately.
- Opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. Fun work environment. A flat hierarchy. Organization wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors in the team.
- Last but not least, competitive salary packages and fast-paced growth opportunities.
Who are we looking for?
The ideal candidate is a strong software developer or researcher with experience building and shipping production-grade data science applications at scale. Such a candidate has a keen interest in liaising with the business and product teams to understand a business problem and translate it into a data science problem. The candidate is also expected to develop capabilities that open up new business productization opportunities.
We are looking for someone with 6+ years of relevant experience working on problems in NLP or Computer Vision with a Master's degree (PhD preferred).
Key problem areas
- Preprocessing and feature extraction from noisy and unstructured data, both text and images.
- Keyphrase extraction, sequence labeling, entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, sentiment analysis.
- Image based clustering and classification, segmentation, object detection, extracting text from images, generative models, recommender systems.
- Ensemble approaches for all the above problems using multiple text and image based techniques.
Relevant set of skills
- Have a strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills in multiple programming languages with experience building production-grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc). Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including NumPy, Matplotlib, scikit-learn, Keras, PyTorch, and TensorFlow.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Be a self-starter: someone who thrives in fast-paced environments with minimal ‘management’.
- It's a huge bonus if you have personal projects (including open source contributions) that you work on in your spare time. Show off some of the projects you have hosted on GitHub.
Role and responsibilities
- Understand the business problems we are solving. Build data science capabilities that align with our product strategy.
- Conduct research. Do experiments. Quickly build throwaway prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models in an iterative manner that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end-to-end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take the initiative to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrancy to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of the latest research in deep learning, NLP, Computer Vision, and other relevant areas.