Data Science Jobs in Pune
Tealbox.Digital is a Marketing-Technology company founded by IIT-Delhi alumni, focused on solving complex problems for businesses operating online. We leverage Paid Media to help clients all over the globe acquire new customers, retain existing ones and maximize customer lifetime value. We have worked with several pre-sales startups that approached us to establish their proof of concept, and we have enabled those businesses to grow from there to seven-figure turnovers. Our work has had enough impact on these businesses that we were able to set them up to raise capital from large investors. We currently operate on four continents and are growing steadily, owing to the current need for businesses to go online and for efficient performance marketing solutions that keep this channel sustainable. The businesses we work with are largely small-scale enterprises and startups, and we are directly responsible for shaping how their business grows.
TealBox Digital is an Indian startup with a global outlook. We are an equal opportunity employer and are committed to creating exceptional employee experiences. We believe in empowering people by focusing on employee development and investing in continuous learning opportunities. We’re committed to helping people thrive professionally and personally.
We are looking for curious individuals who are natural leaders and have the ability to recognise what needs to be done. We require analytical thinkers with strong written and verbal communication skills. The ideal candidate will need to be able to effectively articulate insights and recommended actions.
In this role,
- You need to analyze and solve increasingly complex business problems.
- You will be working as a part of the Digital Marketing & Customer Analytics team which provides a set of processes that measure, manage and analyze marketing activities in order to provide actionable insights and recommendations to clients’ advertising campaigns to optimize ROI & performance efficiency in operations.
- You will have experience in transforming large amounts of diverse business data into valuable and meaningful information that can be used to support campaign-wise decision-making.
- Design and carry out high quality analysis in service of projects undertaken on behalf of clients and ensure the analysis is presented in a compelling way.
- You will have good written and oral communication skills in the English language and the ability to transform data into actionable insights for an audience ranging from digital experts to beginners.
- Conceptualize and implement Paid Media strategies to acquire new customers and grow client businesses.
- Stay informed of the relevant industry, paid media, and paid media platform trends and best practices.
- The ideal candidate is someone who enjoys solving problems. They love taking on difficult challenges and finding creative solutions, and do not get flustered easily.
- Graduates with an aptitude to pick up new skills or 4-6 years’ experience in the marketing segment with a clear understanding of digital marketing or individuals with equivalent/relevant experience.
- Experience with FB Ads, Google Ads, Amazon Ads, Programmatic Buying and other allied tools such as Google Analytics, Google Tag Manager etc, is a plus but NOT mandatory.
- Strong presentation skills and ability to work with diverse stakeholders.
- Ability to analyze large data sets and use analytical skills to create easy-to-understand, actionable insights for business stakeholders.
- Intermediate knowledge/experience with Google Sheets, Google Slides, Google Data Studio, Python, etc.
- Strong math/analytical and quantitative skills.
- Strong problem-solving skills.
- Very strong oral and written communication skills.
- Intermediate knowledge/experience with relational databases and report development.
- Close attention to detail, accuracy, follow-through and excellent organizational skills.
- Proof of ability to take up responsibility and deliver on actions
We have an urgent requirement for the post of IBM MDM (AE) profile
Notice period - should be 15-30 days
Data Scientist - Product Development
Employment Type: Full Time, Permanent
Experience: 3-5 Years as a Full Time Data Scientist
We are looking for an exceptional Data Scientist who is passionate about data and motivated to build large-scale machine learning solutions that make our data products shine. This person will contribute to data analytics for insight discovery and to the development of a machine learning pipeline that supports modeling of terabytes (TB) of daily data for various use cases.
Location: Pune (currently remote during the pandemic; you will need to relocate later)
About the Organization: A funded product development company, headquartered in Singapore with offices in Australia, the United States, Germany, the United Kingdom and India. You will gain work experience in a global environment.
Qualifications:
- 3+ years relevant working experience
- Master / Bachelor’s in computer science or engineering
- Working knowledge of Python, Spark / Pyspark, SQL
- Experience working with large-scale data
- Experience in data manipulation, analytics, visualization, model building, model deployment
- Proficiency in various ML algorithms for supervised and unsupervised learning
- Experience working in Agile/Lean model
- Exposure to building large-scale ML models using one or more of modern tools and libraries such as AWS Sagemaker, Spark ML-Lib, Tensorflow, PyTorch, Keras, GCP ML Stack
- Exposure to MLOps tools such as MLflow, Airflow
- Exposure to modern Big Data tech such as Cassandra/Scylla, Snowflake, Kafka, Ceph, Hadoop
- Exposure to IAAS platforms such as AWS, GCP, Azure
- Experience with Java and Golang is a plus
- Experience with BI toolkit such as Superset, Tableau, Quicksight, etc is a plus
Looking for someone who can join immediately or within a month, has experience with product development companies and has dealt with streaming data. Experience working in a product development team is desirable. AWS experience is a must. Strong experience in Python and its related libraries is required.
We are looking for an exceptional Data Scientist who is passionate about data and motivated to build large-scale machine learning solutions. This person will contribute to data analytics for insight discovery and to the development of a machine learning pipeline that supports modelling of terabytes of daily data for various use cases.
Typical persona: Data Science Manager / Architect
Experience: 8+ years of programming/engineering experience (with at least the last 4 years in big data and data science)
- Hands-on Python: Pandas, Scikit-Learn
- Working knowledge of Kafka
- Able to carry out own tasks and help the team in resolving problems - logical or technical (25% of job)
- Good analytical and debugging skills
- Strong communication skills
Desired (in order of priorities):
- Go (Strong advantage)
- Airflow (Strong advantage)
- Familiarity & working experience on more than one type of database: relational, object, columnar, graph and other unstructured databases
- Data structures, Algorithms
- Experience with multi-threaded and thread sync concepts
- AWS Sagemaker
- Should have strong experience in Python programming (minimum 4 years)
A global business process management company
B1 – Data Scientist - Kofax Accredited Developers
Requirement – 3
- Accreditation of Kofax KTA / KTM
- Experience in Kofax Total Agility Development – 2-3 years minimum
- Ability to develop and translate functional requirements to design
- Experience in requirement gathering, analysis, development, testing, documentation, version control, SDLC, Implementation and process orchestration
- Experience in Kofax Customization, writing Custom Workflow Agents, Custom Modules, Release Scripts
- Application development using Kofax and KTM modules
- Good/advanced understanding of Machine Learning/NLP/Statistics
- Exposure to or understanding of RPA/OCR/Cognitive Capture tools like Appian/UI Path/Automation Anywhere etc
- Excellent communication skills and collaborative attitude
- Work with multiple internal teams and stakeholders, such as the Analytics, RPA, Technology and Project Management teams
- Good understanding of compliance, data governance and risk control processes
Total Experience – 7-10 Years in BPO/KPO/ ITES/BFSI/Retail/Travel/Utilities/Service Industry
Good to have
- Previous experience of working on Agile & Hybrid delivery environment
- Knowledge of VB.Net, C#( C-Sharp ), SQL Server , Web services
- Masters in Statistics/Mathematics/Economics/Econometrics Or BE/B-Tech, MCA or MBA
As an experienced Data Scientist you’ll join a team of data scientists, analysts, and software engineers
working to push the boundaries of data science in health care. We like to experiment, iterate, and
innovate with technology, from developing new algorithms specific to health care’s challenges, to
bringing the latest machine learning practices and applications developed in other industries into the
health care world. We know that algorithms are only valuable when powered by the right data, so we
focus on fully understanding the problems we need to solve, and truly understanding the data behind
them before launching into solutions – ensuring that the solutions we do land on are impactful and
• Research, conceptualize, and implement analytical approaches and predictive modeling to
evaluate scenarios, predict utilization and clinical outcomes, and recommend actions to impact
• Manage and execute on the entire model development process, including scope definition,
hypothesis formation, data cleaning and preparation, feature selection, model implementation
in production, validation and iteration, using multiple data sources.
• Provide guidance on necessary data and software infrastructure capabilities to deliver a scalable
solution across partners and support the implementation of the team’s algorithms and models
• Contribute to the development and publication in major journals, conferences showcasing
leadership in healthcare data science.
• Work closely and collaborate with Data Scientists, Machine Learning engineers, IT teams and
Business stakeholders spread out across various locations in US and India to achieve business
• Provide guidance to other Data Scientists and Machine Learning Engineers
empower healthcare payers, providers and members to quickly process medical data to
make informed decisions and reduce health care costs. You will be focusing on research,
development, strategy, operations, people management, and being a thought leader for
team members based out of India. You should have professional healthcare experience
using both structured and unstructured data to build applications. These applications
include but are not limited to machine learning, artificial intelligence, optical character
recognition, natural language processing, and integrating processes into the overall AI
pipeline to mine healthcare and medical information with high recall and other relevant
metrics. The results will be used dually for real-time operational processes with both
automated and human-based decision making as well as contribute to reducing
healthcare administrative costs. We work with all major cloud and big data vendors
offerings including (Azure, AWS, Google, IBM, etc.) to achieve our goals in healthcare and
The Director, Data Science will have the opportunity to build a team, shape team culture
and operating norms as a result of the fast-paced nature of a new, high-growth
• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps and product delivery
• Experience building stakeholder trust and confidence in deployed models especially via application of the algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• Provide mentoring to data scientists and machine learning engineers as well as career development
• Meet project related team members for individual specific needs on a regular basis related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership
The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience including but not limited to usage of claims data
• Delivered multiple data science and machine learning projects over 8+ years with values exceeding $10 Million or more and has worked on platform members exceeding 10 million lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real time streaming applications, DevOps, and product delivery
• Knows how to solve and launch real artificial intelligence and data science related problems and products along with managing and coordinating the
business process change, IT / cloud operations, meeting production level code standards
• Ownership of key workflows that are part of the data science life cycle, like data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models especially via application of algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research, and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• 3+ years of experience directly managing five (5) or more senior-level data scientists and machine learning engineers with advanced degrees, including directly
making staffing decisions
• Very strong understanding of mathematical concepts including but not limited to linear algebra, advanced calculus, partial differential equations, and
statistics including Bayesian approaches at master’s degree level and above
• 6+ years of programming experience in C++ or Java or Scala and data science programming languages like Python and R including strong understanding of
concepts like data structures, algorithms, compression techniques, high performance computing, distributed computing, and various computer architecture
• Very strong understanding and experience with traditional data science approaches like sampling techniques, feature engineering, classification, and
regressions, SVM, trees, model evaluations with several projects over 3+ years
• Very strong understanding and experience in Natural Language Processing,
reasoning, and understanding, information retrieval, text mining, search, with
3+ years of hands on experience
• Experience with developing and deploying several products in production with
experience in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the
following cloud vendors like AWS, Azure, and Google
• Three plus (3+) years hands on experience with MapR \ Cloudera \ Databricks
Big Data platform with Spark, Hive, Kafka etc.
• Three plus (3+) years of experience with high-performance computing like
Dask, CUDA distributed GPU, TPU etc.
• Presented at major conferences and/or published materials
- Partners with business stakeholders to translate business objectives into clearly defined analytical projects.
- Identify opportunities for text analytics and NLP to enhance the core product platform, select the best machine learning techniques for the specific business problem and then build the models that solve the problem.
- Own the end-end process, from recognizing the problem to implementing the solution.
- Define the variables and their inter-relationships and extract the data from our data repositories, leveraging infrastructure including Cloud computing solutions and relational database environments.
- Build predictive models that are accurate and robust and that help our customers to utilize the core platform to the maximum extent.
Skills and Qualification
- 12 to 15 yrs of experience.
- An advanced degree in predictive analytics, machine learning, or artificial intelligence; or a degree in programming and significant experience with text analytics/NLP. The candidate should have a strong background in machine learning (unsupervised and supervised techniques), in particular an excellent understanding of machine learning techniques and algorithms such as k-NN, Naive Bayes, SVM, Decision Forests, logistic regression, MLPs, RNNs, etc.
- Experience with text mining, parsing, and classification using state-of-the-art techniques.
- Experience with information retrieval, Natural Language Processing, Natural Language Understanding and Neural Language Modeling.
- Ability to evaluate the quality of ML models and to define the right performance metrics for models in accordance with the requirements of the core platform.
- Experience in the Python data science ecosystem: Pandas, NumPy, SciPy, scikit-learn, NLTK, Gensim, etc.
- Excellent verbal and written communication skills, particularly possessing the ability to share technical results and recommendations to both technical and non-technical audiences.
- Ability to perform high-level work both independently and collaboratively as a project member or leader on multiple projects.
Responsibilities for Data Scientist/ NLP Engineer
Work with customers to identify opportunities for leveraging their data to drive business
• Develop custom data models and algorithms to apply to data sets.
• Basic data cleaning and annotation for any incoming raw data.
• Use predictive modeling to increase and optimize customer experiences, revenue
generation, ad targeting and other business outcomes.
• Develop company A/B testing framework and test model quality.
• Deployment of ML model in production.
Qualifications for Junior Data Scientist/ NLP Engineer
• BS, MS in Computer Science, Engineering, or related discipline.
• 3+ Years of experience in Data Science/Machine Learning.
• Experience with programming language Python.
• Familiar with at least one database query language, such as SQL
• Knowledge of Text Classification & Clustering, Question Answering & Query Understanding,
Search Indexing & Fuzzy Matching.
• Excellent written and verbal communication skills for coordinating across teams.
• Willing to learn and master new technologies and techniques.
• Knowledge and experience in statistical and data mining techniques:
GLM/Regression, Random Forest, Boosting, Trees, text mining, NLP, etc.
• Experience with chatbots would be bonus but not required
- 3+ years of experience in Machine Learning
- Bachelors/Masters in Computer Engineering/Science.
- Bachelors/Masters in Engineering/Mathematics/Statistics with sound knowledge of programming and computer concepts.
- 10th and 12th academics: 70% and above.
- Strong Python/ programming skills
- Good conceptual understanding of Machine Learning/Deep Learning/Natural Language Processing
- Strong verbal and written communication skills.
- Should be able to manage team, meet project deadlines and interface with clients.
- Should be able to work across different domains and quickly ramp up the business processes & flows & translate business problems into the data solutions
- Writing reusable, testable, and efficient code
- Design and implementation of low-latency, high-availability, and performant applications
- Integration of user-facing elements developed by front-end developers with server side logic
- Implementation of security and data protection
- Integration of data storage solutions (may include databases, key-value stores, blob stores, etc.)
- Expert in Python, with knowledge of at least one Python web framework (such as Django, Flask, etc depending on your technology stack)
- Familiarity with some ORM (Object Relational Mapper) libraries
- Able to integrate multiple data sources and databases into one system
- Understanding of the threading limitations of Python, and multi-process architecture
- Good understanding of server-side templating languages (such as Jinja 2, Mako, etc depending on your technology stack)
- Understanding of accessibility and security compliance (depending on the specific project)
- Knowledge of user authentication and authorization between multiple systems, servers, and environments
- Understanding of fundamental design principles behind a scalable application
- Familiarity with event-driven programming in Python
- Understanding of the differences between multiple delivery platforms, such as mobile vs desktop, and optimizing output to match the specific platform
- Able to create database schemas that represent and support business processes
- Strong unit test and debugging skills
- Basic knowledge of machine learning algorithms and libraries like Keras, TensorFlow and scikit-learn.
Role and Responsibilities
- Execute data mining projects, training and deploying models over a typical duration of 2 -12 months.
- The ideal candidate should be able to innovate, analyze the customer requirement, develop a solution in the time box of the project plan, execute and deploy the solution.
- Integrate the data mining projects as embedded data mining applications in the FogHorn platform (on Docker or Android).
Candidates must meet ALL of the following qualifications:
- Have analyzed, trained and deployed at least three data mining models in the past. If the candidate did not directly deploy their own models, they will have worked with others who have put their models into production. The models should have been validated as robust over at least an initial time period.
- Three years of industry work experience, developing data mining models which were deployed and used.
- Programming experience in Python is core using data mining related libraries like Scikit-Learn. Other relevant Python mining libraries include NumPy, SciPy and Pandas.
- Data mining algorithm experience in at least 3 algorithms across: prediction (statistical regression, neural nets, deep learning, decision trees, SVM, ensembles), clustering (k-means, DBSCAN or other) or Bayesian networks
Any of the following extra qualifications will make a candidate more competitive:
- Soft Skills
- Sets expectations, develops project plans and meets expectations.
- Experience adapting technical dialogue to the right level for the audience (i.e. executives) or specific jargon for a given vertical market and job function.
- Technical skills
- Commonly, candidates have a MS or Ph.D. in Computer Science, Math, Statistics or an engineering technical discipline. BS candidates with experience are considered.
- Have managed past models in production over their full life cycle until model replacement is needed. Have developed automated model refreshing on newer data. Have developed frameworks for model automation as a prototype for product.
- Training or experience in Deep Learning, such as TensorFlow, Keras, convolutional neural networks (CNN) or Long Short Term Memory (LSTM) neural network architectures. If you don’t have deep learning experience, we will train you on the job.
- Shrinking deep learning models, optimizing to speed up execution time of scoring or inference.
- OpenCV or other image processing tools or libraries
- Cloud computing: Google Cloud, Amazon AWS or Microsoft Azure. We have integration with Google Cloud and are working on other integrations.
- Experience with decision-tree methods like XGBoost or Random Forests is helpful.
- Complex Event Processing (CEP) or other streaming data as a data source for data mining analysis
- Time series algorithms from ARIMA to LSTM to Digital Signal Processing (DSP).
- Bayesian Networks (BN), a.k.a. Bayesian Belief Networks (BBN) or Graphical Belief Networks (GBN)
- Experience with PMML is of interest (see www.DMG.org).
- Vertical experience in Industrial Internet of Things (IoT) applications:
- Energy: Oil and Gas, Wind Turbines
- Manufacturing: Motors, chemical processes, tools, automotive
- Smart Cities: Elevators, cameras on population or cars, power grid
- Transportation: Cars, truck fleets, trains
About FogHorn Systems
FogHorn is a leading developer of “edge intelligence” software for industrial and commercial IoT application solutions. FogHorn’s Lightning software platform brings the power of advanced analytics and machine learning to the on-premise edge environment enabling a new class of applications for advanced monitoring and diagnostics, machine performance optimization, proactive maintenance and operational intelligence use cases. FogHorn’s technology is ideally suited for OEMs, systems integrators and end customers in manufacturing, power and water, oil and gas, renewable energy, mining, transportation, healthcare, retail, as well as Smart Grid, Smart City, Smart Building and connected vehicle applications.
- 2019 Edge Computing Company of the Year – Compass Intelligence
- 2019 Internet of Things 50: 10 Coolest Industrial IoT Companies – CRN
- 2018 IoT Platforms Leadership Award & Edge Computing Excellence – IoT Evolution World Magazine
- 2018 10 Hot IoT Startups to Watch – Network World. (Gartner estimated 20 billion connected things in use worldwide by 2020)
- 2018 Winner in Artificial Intelligence and Machine Learning – Globe Awards
- 2018 Ten Edge Computing Vendors to Watch – ZDNet & 451 Research
- 2018 The 10 Most Innovative AI Solution Providers – Insights Success
- 2018 The AI 100 – CB Insights
- 2017 Cool Vendor in IoT Edge Computing – Gartner
- 2017 20 Most Promising AI Service Providers – CIO Review
Our Series A round was for $15 million. Our Series B round was for $30 million in October 2017. Investors include: Saudi Aramco Energy Ventures, Intel Capital, GE, Dell, Bosch, Honeywell and The Hive.
About the Data Science Solutions team
In 2018, our Data Science Solutions team grew from 4 to 9. We are growing again from 11. We work on revenue generating projects for clients, such as predictive maintenance, time to failure, manufacturing defects. About half of our projects have been related to vision recognition or deep learning. We are not only working on consulting projects but developing vertical solution applications that run on our Lightning platform, with embedded data mining.
Our data scientists like our team because:
- We care about “best practices”
- Have a direct impact on the company’s revenue
- Give or receive mentoring as part of the collaborative process
- Questioning and challenging the status quo with data is safe
- Intellectual curiosity balanced with humility
- Present papers or projects in our “Thought Leadership” meeting series, to support continuous learning
Roles & Responsibilities:
· You will be involved in every part of the project lifecycle, right from identifying the business problem and proposing a solution, to data collection, cleaning, and preprocessing, to training and optimizing ML/DL models and deploying them to production.
· You will often be required to design and execute proof-of-concept projects that can demonstrate business value and build confidence with CloudMoyo’s clients.
· You will be involved in designing and delivering data visualizations that utilize the ML models to generate insights and intuitively deliver business value to CXOs.
Desired Skill Set:
· Candidates should have strong Python coding skills and be comfortable working with various ML/DL frameworks and libraries.
· Hands-on skills and industry experience in one or more of the following areas is necessary:
1) Deep Learning (CNNs/RNNs, Reinforcement Learning, VAEs/GANs)
2) Machine Learning (Regression, Random Forests, SVMs, K-means, ensemble methods)
3) Natural Language Processing
4) Graph Databases (Neo4j, Apache Giraph)
5) Azure Bot Service
6) Azure ML Studio / Azure Cognitive Services
7) Log Analytics with NLP/ML/DL
· Previous experience with data visualization, C# or Azure Cloud platform and services will be a plus.
· Candidates should have excellent communication skills and be highly technical, with the ability to discuss ideas at any level from executive to developer.
· Creative problem-solving, unconventional approaches and a hacker mindset are highly desired.