- You'd have to set up your own shop, work with design customers to find generalizable use cases, and build them out.
- Ability to collaborate with cross-functional teams to build and ship new features
- 2-5 years of experience
- Predictive Analytics – Machine Learning Algorithms, Logistic & Linear Regression, Decision Trees, Clustering.
- Exploratory Data Analysis – Data Preparation, Data Exploration, and Data Visualization.
- Analytics Tools – R, Python, SQL, Power BI, MS Excel.
You will work alongside our highly motivated advanced Machine Learning team. Key responsibilities are to research, design, develop, and implement applications that will be integrated into our workflows.
Responsibilities and Accountabilities:
III Skill Set & Personality Traits required:
Data Scientist - Product Development
Employment Type: Full Time, Permanent
Experience: 3-5 Years as a Full Time Data Scientist
We are looking for an exceptional Data Scientist who is passionate about data and motivated to build large-scale machine learning solutions that make our data products shine. This person will contribute to data analytics for insight discovery and to the development of machine learning pipelines that support modeling of terabytes (TB) of daily data for various use cases.
Location: Pune (currently remote for the duration of the pandemic; relocation required afterwards)
About the Organization: A funded product development company, headquartered in Singapore, with offices in Australia, the United States, Germany, the United Kingdom, and India. You will gain work experience in a global environment.
Qualifications:
- 3+ years relevant working experience
- Master’s / Bachelor’s degree in computer science or engineering
- Working knowledge of Python, Spark / Pyspark, SQL
- Experience working with large-scale data
- Experience in data manipulation, analytics, visualization, model building, model deployment
- Proficiency in various ML algorithms for supervised and unsupervised learning
- Experience working in an Agile/Lean model
- Exposure to building large-scale ML models using one or more modern tools and libraries such as AWS SageMaker, Spark MLlib, TensorFlow, PyTorch, Keras, GCP ML Stack
- Exposure to MLOps tools such as MLflow, Airflow
- Exposure to modern Big Data tech such as Cassandra/Scylla, Snowflake, Kafka, Ceph, Hadoop
- Exposure to IaaS platforms such as AWS, GCP, Azure
- Experience with Java and Golang is a plus
- Experience with BI toolkits such as Superset, Tableau, QuickSight, etc. is a plus
Looking for someone who can join immediately or within a month, has experience with product development companies, and has dealt with streaming data. Experience working in a product development team is desirable. AWS experience is a must. Strong experience in Python and its related libraries is required.
This profile will include the following responsibilities:
- Develop Parsers for XML and JSON Data sources/feeds
- Write Automation Scripts for product development
- Build API Integrations for 3rd Party product integration
- Perform Data Analysis
- Research machine learning algorithms
- Understand AWS cloud architecture and work with 3rd party vendors for deployments
- Resolve issues in the AWS environment
We are looking for candidates with:
Programming Language: Python
Web Development: Basic understanding of Web Development. Working knowledge of Python Flask is desirable
Database & Platform: AWS/Docker/MySQL/MongoDB
Basic Understanding of Machine Learning Models & AWS Fundamentals is recommended.
About Us :
Docsumo is Document AI software that helps enterprises capture data and analyze customer documents. We convert documents such as invoices, ID cards, and bank statements into actionable data. We work with clients such as PayU, Arbor and Hitachi and are backed by Sequoia, Barclays, Techstars, and Better Capital.
As a Senior Machine Learning Engineer, you will be working directly with the CTO to develop end-to-end API products for the US market in the information extraction domain.
- You will be designing and building systems that help Docsumo process visual data, i.e. PDFs and images of documents.
- You'll work in our Machine Intelligence team, a close-knit group of scientists and engineers who incubate new capabilities from whiteboard sketches all the way to finished apps.
- You will get to learn the ins and outs of building core capabilities & API products that can scale globally.
- Should have hands-on experience applying advanced statistical learning techniques to different types of data.
- Should be able to design, build and work with RESTful Web Services in JSON and XML formats. (Flask preferred)
- Should follow Agile principles and processes including (but not limited to) standup meetings, sprints and retrospectives.
Skills / Requirements :
- 3+ years of experience working in machine learning, text processing, data science, information retrieval, deep learning, natural language processing, text mining, regression, classification, etc.
- Must have a full-time degree in Computer Science or similar (Statistics/Mathematics)
- Working with OpenCV, TensorFlow and Keras
- Working with Python: NumPy, Scikit-learn, Matplotlib, Pandas
- Familiarity with Version Control tools such as Git
- Theoretical and practical knowledge of SQL / NoSQL databases with hands-on experience in at least one database system.
- Must be self-motivated, flexible, collaborative, with an eagerness to learn
- Research and develop statistical learning models for data analysis
- Collaborate with product management and engineering departments to understand company needs and devise possible solutions
- Keep up-to-date with latest technology trends
- Communicate results and ideas to key decision makers
- Implement new statistical or other mathematical methodologies as needed for specific models or analysis
- Optimize joint development efforts through appropriate database use and project design
- Master’s or PhD in Computer Science, Electrical Engineering, Statistics, Applied Math or equivalent fields with a strong mathematical background
- Excellent understanding of machine learning techniques and algorithms, including clustering, anomaly detection, optimization, neural networks, etc.
- 3+ years of experience building data science-driven solutions, including data collection, feature selection, model training, and post-deployment validation
- Strong hands-on coding skills (preferably in Python) for processing large-scale data sets and developing machine learning models
- Familiar with one or more machine learning or statistical modeling tools such as NumPy, scikit-learn, MLlib, TensorFlow
- Good team player with excellent written, verbal, and presentation communication skills
- Experience with AWS, S3, Flink, Spark, Kafka, Elasticsearch
- Knowledge and experience with NLP technology
- Previous work in a start-up environment
- Key Responsibilities: Support end-to-end (E2E) use case analysis: define capabilities and understand the data and model. Machine Learning Operations (MLOps): Azure Machine Learning, Azure Cognitive Services, Azure DevOps, overall Azure Cloud experience, PowerShell, DSVM, AML Compute / Training Clusters, Azure infrastructure experience, Python, Big Data, Python scripting. Automate ML model deployments; manage, monitor, and troubleshoot machine learning infrastructure; and set up ML pipelines.
- Technical Experience: Proven experience in Azure AI/ML solution design and architecture using Azure Cloud capabilities (AML / AKS). Proven record of embedding advanced analytical models into business processes. Collaborate in multi-functional teams to evaluate business activities, then develop innovative and effective approaches to tackle the team's analytics problems and communicate results. Tools: Bitbucket, Node.js, Power BI, SQL, Python.
- Experience in setting up an MLOps framework for the AI/ML team
- Develop REST/JSON APIs. Design code for high scale, availability, and resiliency.
- Develop responsive web apps and integrate APIs using NodeJS.
- Presenting chat efficiency reports to senior management
- Develop system flow diagrams to automate a business function and identify impacted systems; metrics to depict the cost benefit analysis of the solutions developed.
- Work closely with business operations to convert requirements into system solutions and collaborate with development teams to ensure delivery of highly scalable and available systems.
- Using tools to classify/categorize chats based on intents and computing an F1 score for chat analysis
- Experience in analyzing real agent chat conversations to train the chatbot.
- Developing Conversational Flows in the chatbot
- Calculating Chat efficiency reports.
Good to Have:
- Monitors performance and quality control plans to identify performance.
- Works on problems of moderate and varied complexity where analysis of data may require adaptation of standardized practices.
- Works with management to prioritize business and information needs.
- Experience in analyzing real agent chat conversations to train the chatbot.
- Identifies, analyzes, and interprets trends or patterns in complex data sets.
- Ability to manage multiple assignments.
- Understanding of ChatBot Architecture.
- Experience with chatbot training
We are looking for a data scientist who will help us discover the information hidden in vast amounts of data and make smarter decisions to deliver even better products. Your primary focus will be on applying data mining techniques, doing statistical analysis, and building high-quality prediction systems integrated with our products.
- Selecting features, building and optimizing classifiers using machine learning techniques
- Data mining using state-of-the-art methods
- Extending the company’s data with third-party sources of information when needed
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Processing, cleansing, and verifying the integrity of data used for analysis
- Doing ad-hoc analysis and presenting results in a clear manner
- Creating automated anomaly detection systems and constantly tracking their performance
Skills and Qualifications
- Excellent understanding of machine learning techniques and algorithms, such as linear regression, SVM, Decision Forests, LSTM, CNN, etc.
- Experience with Deep Learning preferred.
- Experience with common data science toolkits, such as R, NumPy, MATLAB, etc. Excellence in at least one of these is highly desirable
- Great communication skills
- Proficiency in using query languages such as SQL, Hive, Pig
- Good applied statistics skills, such as statistical testing, regression, etc.
- Good scripting and programming skills
- Data-oriented personality