About Us :
Docsumo is Document AI software that helps enterprises capture data and analyze customer documents. We convert documents such as invoices, ID cards, and bank statements into actionable data. We work with clients such as PayU, Arbor and Hitachi and are backed by Sequoia, Barclays, Techstars, and Better Capital.
As a Senior Machine Learning Engineer, you will be working directly with the CTO to develop end-to-end API products for the US market in the information extraction domain.
Responsibilities :
- You will be designing and building systems that help Docsumo process visual data, i.e. PDFs and images of documents.
- You'll work in our Machine Intelligence team, a close-knit group of scientists and engineers who incubate new capabilities from whiteboard sketches all the way to finished apps.
- You will get to learn the ins and outs of building core capabilities & API products that can scale globally.
- Should have hands-on experience applying advanced statistical learning techniques to different types of data.
- Should be able to design, build and work with RESTful Web Services in JSON and XML formats (Flask preferred).
- Should follow Agile principles and processes including (but not limited to) standup meetings, sprints and retrospectives.
Skills / Requirements :
- 3+ years of experience working in machine learning, text processing, data science, information retrieval, deep learning, natural language processing, text mining, regression, classification, etc.
- Must have a full-time degree in Computer Science or similar (Statistics/Mathematics)
- Working with OpenCV, TensorFlow and Keras
- Working with Python: NumPy, Scikit-learn, Matplotlib, Pandas
- Familiarity with Version Control tools such as Git
- Theoretical and practical knowledge of SQL / NoSQL databases with hands-on experience in at least one database system.
- Must be self-motivated, flexible, collaborative, with an eagerness to learn
Similar jobs
- Lead the data science, ML, product analytics, and insights functions by translating sparse and decentralized datasets to develop metrics, standardize processes, and lead the path from data to insights.
- Building visualizations, models, pipelines, alerts/insights systems, and recommendations in Python/Java to support business decisions and operational experiences.
- Advising executives on calibration strategy, DEI, and workforce planning.
Purpose of Job:
Responsible for leading a team of analysts to build and deploy predictive models that infuse core
business functions with deep analytical insights. The Senior Data Scientist will also work
closely with the Kinara management team to investigate strategically important business questions.
Job Responsibilities:
Lead a team through the entire analytical and machine learning model life cycle:
Define the problem statement
Build and clean datasets
Exploratory data analysis
Feature engineering
Apply ML algorithms and assess the performance
Code for deployment
Code testing and troubleshooting
Communicate Analysis to Stakeholders
Manage Data Analysts and Data Scientists
Qualifications:
Education: MS/MTech/BTech graduates or equivalent with a focus on data science and
quantitative fields (CS, Engineering, Mathematics, Economics)
Work Experience: 5+ years in a professional role with 3+ years in ML/AI
Other Requirements: ⮚ Domain knowledge in Financial Services is a big plus
Skills & Competencies
Technical Skills
⮚ Aptitude in Math and Stats
⮚ Proven experience in the use of Python, SQL, DevOps
⮚ Excellent in programming (Python), stats tools, and SQL
⮚ Working knowledge of tools and utilities - AWS, Git, Selenium, Postman, Prefect, Airflow, PySpark
Soft Skills
⮚ Deep Curiosity and Humility
⮚ Strong verbal and written communication
Senior Data Scientist
Your goal: To improve the education process and improve the student experience through data.
The organization: Data Science for Learning Services. Data Science and Machine Learning are core to Chegg. As a Student Hub, we want to ensure that students discover the full breadth of learning solutions we have to offer to get full value on their learning time with us. To create the most relevant and engaging interactions, we are solving a multitude of machine learning problems so that we can better model student behavior, link various types of content, optimize workflows, and provide a personalized experience.
The Role: Senior Data Scientist
As a Senior Data Scientist, you will focus on conducting research and development in NLP and ML. You will be responsible for writing production-quality code for data product solutions at Chegg. You will lead the identification and implementation of key projects for data processing and knowledge discovery.
Responsibilities:
• Translate product requirements into AI/ML/NLP solutions
• Be able to think out of the box and be able to design novel solutions for the problem at hand
• Write production-quality code
• Be able to design data and annotation collection strategies
• Identify key evaluation metrics and release requirements for data products
• Integrate new data and design workflows
• Innovate, share, and educate team members and community
Requirements:
• Working experience in machine learning, NLP, recommendation systems, experimentation, or related fields, with a specialization in NLP
• Working experience with large language models that cater to multiple tasks such as text generation, Q&A, summarization, translation, etc. is highly preferred
• Knowledge of MLOps and deployment pipelines is a must
• Expertise in supervised, unsupervised and reinforcement ML algorithms.
• Strong programming skills in Python
• Top data wrangling skills using SQL or NoSQL queries
• Experience using containers to deploy real-time prediction services
• Passion for using technology to help students
• Excellent communication skills
• Good team player and a self-starter
• Outstanding analytical and problem-solving skills
• Experience working with ML pipeline products such as AWS SageMaker, Google ML, or Databricks is a plus.
Why do we exist?
Students are working harder than ever before to stabilize their future. Our recent research study called State of the Student shows that nearly 3 out of 4 students are working to support themselves through college and 1 in 3 students feel pressure to spend more than they can afford. We founded our business on providing affordable textbook rental options to address these issues. Since then, we’ve expanded our offerings to supplement many facets of higher educational learning through Chegg Study, Chegg Math, Chegg Writing, Chegg Internships, Thinkful Online Learning, and more, to support students beyond their college experience. These offerings lower financial concerns for students by modernizing their learning experience. We exist so students everywhere have a smarter, faster, more affordable way to student.
Video Shorts
Life at Chegg: https://jobs.chegg.com/Video-Shorts-Chegg-Services
Certified Great Place to Work!: http://reviews.greatplacetowork.com/chegg
Chegg India: http://www.cheggindia.com/
Chegg Israel: http://insider.geektime.co.il/organizations/chegg
Thinkful (a Chegg Online Learning Service): https://www.thinkful.com/about/#careers
Chegg out our culture and benefits!
http://www.chegg.com/jobs/benefits
https://www.youtube.com/watch?v=YYHnkwiD7Oo
Chegg is an equal-opportunity employer
Job description:
- Selecting features, building and optimizing classifiers using machine learning techniques
- Mining data as and when required
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Processing, cleansing, and verifying the integrity of data used for analysis
- Doing ad-hoc analysis and presenting results in a clear manner
- Creating automated anomaly detection systems and constantly tracking their performance
- Efficient stakeholder management
Skills and Qualifications
- Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc.
- Good applied statistics skills, such as distributions, statistical testing, regression, etc.
- Experience with common data science toolkits.
- Great communication skills
- Experience with data visualisation tools
- Proficiency in using query languages such as SQL
- Good scripting and programming skills
- Data-oriented personality
- B.Tech, M.Tech, B.S., M.S., MBA
Requirement / Desired Skills
Data Scientist -- Data mining skills, SQL, Advanced ML Techniques, NLP (Natural Language Processing)
- Provide insights based on data to business teams
- Develop framework, solutions and recommendations for business problems
- Build ML models for predictive solutions
- Use advanced data science techniques to build business solutions
- Automation / optimization of new and existing models, ensuring smooth, timely and accurate execution with the lowest possible turnaround time (TAT).
- Design & maintenance of response tracking, measurement, and comparison of success parameters of various projects.
- Ability to handle large volumes of data with ease using multiple tools such as Python, R, etc.
- Experience in modeling techniques and hands-on experience in building logistic regression models, Random Forest, K-means clustering, NLP, decision trees, boosting techniques, etc.
- Good data interpretation and reasoning skills
Job Details:-
Designation - Data Scientist
Urgently required (notice period of at most 15 days).
Location:- Mumbai
Experience:- 5-7 years.
Package Offered:- Rs.5,00,000/- to Rs.9,00,000/- pa.
Data Scientist
Job Description:-
Responsibilities:
- Identify valuable data sources and automate collection processes
- Undertake preprocessing of structured and unstructured data
- Analyze large amounts of information to discover trends and patterns
- Build predictive models and machine-learning algorithms
- Combine models through ensemble modeling
- Present information using data visualization techniques
- Propose solutions and strategies to business challenges
- Collaborate with engineering and product development teams
Requirements:
- Proven experience as a Data Scientist or Data Analyst
- Experience in data mining
- Understanding of machine-learning and operations research
- Knowledge of R, SQL and Python; familiarity with Scala, Java is an asset
- Experience using business intelligence tools (e.g. Tableau) and data frameworks (e.g. Hadoop)
- Analytical mind and business acumen
- Strong math skills (e.g. statistics, algebra)
- Problem-solving aptitude
- Excellent communication and presentation skills
- BSc/BA in Computer Science, Engineering or relevant field; graduate degree in Data Science or other quantitative field is preferred
o Convert machine learning models into APIs that applications can access
o Running machine learning tests and experiments
o Implementing appropriate ML algorithms
o Creating machine learning models and retraining systems
o Study and transform data science prototypes
o Design machine learning systems
o Research and implement appropriate ML algorithms and tools
o Train and retrain systems when necessary
o Test and deploy models
o Use AI to empower the company with novel capabilities
o Designing and developing machine learning and deep learning systems
o Outstanding analytical and problem-solving skills
• Alexa
o Excellent in Python programming
o Experience with AWS Lambda
o Experience with Alexa skills
o Alexa skill directives
o Excellent in NodeJS programming
o Experience with GCP - Dialogflow and Actions on Google
o Using built-in intents and developing custom intents
o API integration and Postman knowledge
Basic Qualifications:
∙Bachelor's in Computer Science/Mathematics + Research (Machine Learning, Deep Learning, Statistics, Data Mining, Game Theory or core mathematical areas) from Tier 1 tech institutes.
∙3+ years of relevant experience in building large scale machine learning or deep learning models and/or systems.
∙1 year or more of experience specifically with deep learning (CNN, RNN, LSTM, RBM etc).
∙Strong working knowledge of deep learning, machine learning, and statistics.
∙Deep domain understanding of Personalization, Search and Visual.
∙Strong math skills with statistical modeling / machine learning.
∙Hands-on experience building models with deep learning frameworks like MXNet or TensorFlow.
∙Experience in using Python, statistical/machine learning libs.
∙Ability to think creatively and solve problems.
∙Data presentation skills.
Preferred:
∙MS/Ph.D. (Machine Learning, Deep Learning, Statistics, Data Mining, Game Theory or core mathematical areas) from IISc and other Top Global Universities.
∙Or, publications in highly accredited journals (if available, please share links to your published work).
∙Or, a history of scaling ML/deep learning algorithms to massive scale.
DataWeave provides Retailers and Brands with “Competitive Intelligence as a Service” that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.
Data Science@DataWeave
We, the Data Science team at DataWeave (called Semantics internally), build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are: innovation, business awareness, long-term thinking, and pushing the envelope. We are a fast-paced lab within the org applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.
How do we work?
It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems that there are. We are in the business of making sense of messy public data on the web. At serious scale!
What do we offer?
- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- Ability to see the impact of your work and the value you're adding to our customers almost immediately.
- Opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. Fun work environment. A flat hierarchy. Organization wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors in the team.
- Last but not least, competitive salary packages and fast-paced growth opportunities.
Who are we looking for?
The ideal candidate is a strong software developer or a researcher with experience building and shipping production-grade data science applications at scale. Such a candidate has a keen interest in liaising with the business and product teams to understand a business problem and translate it into a data science problem. You are also expected to develop capabilities that open up new business productization opportunities.
We are looking for someone with 6+ years of relevant experience working on problems in NLP or Computer Vision with a Master's degree (PhD preferred).
Key problem areas
- Preprocessing and feature extraction from noisy and unstructured data, both text and images.
- Keyphrase extraction, sequence labeling, entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, sentiment analysis.
- Image based clustering and classification, segmentation, object detection, extracting text from images, generative models, recommender systems.
- Ensemble approaches for all the above problems using multiple text and image based techniques.
Relevant set of skills
- Have a strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills in multiple programming languages with experience building production-grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc). Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including NumPy, Matplotlib, scikit-learn, Keras, PyTorch, TensorFlow.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Be a self-starter: someone who thrives in fast-paced environments with minimal ‘management’.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of your projects you have hosted on GitHub.
Role and responsibilities
- Understand the business problems we are solving. Build data science capabilities that align with our product strategy.
- Conduct research. Do experiments. Quickly build throwaway prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models in an iterative manner that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end-to-end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take initiatives to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrance to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of the latest research in deep learning, NLP, Computer Vision, and other relevant areas.