Job Description
Data scientist with a strong background in data mining, machine learning, recommendation systems, and statistics. Should possess the signature strengths of a qualified mathematician, with the ability to apply concepts from mathematics and applied statistics, and a specialization in one or more of NLP, computer vision, speech, or data mining, to develop models that provide effective solutions. A strong data engineering background with hands-on coding capabilities is needed to own and deliver outcomes.
A Master’s or PhD degree in a highly quantitative field (Computer Science, Machine Learning, Operational Research, Statistics, Mathematics, etc.) or equivalent experience; 7+ years of industry experience in predictive modelling, data science and analysis, with prior experience in an ML or data scientist role and a track record of building ML or DL models.
Responsibilities and skills:
● Work with our customers to deliver an ML/DL project from beginning to end, including understanding the business need, aggregating data, exploring data, building and validating predictive models, and deploying completed models to deliver business impact to the organization.
● Select features, and build and optimize classifiers using ML techniques.
● Mine data using state-of-the-art methods; create text mining pipelines to clean and process large unstructured datasets, revealing high-quality information and hidden insights using machine learning techniques.
● Should be able to appreciate and work on computer vision problems, for example extracting rich information from images to categorize and process visual data. Develop machine learning algorithms for object and image classification. Experience in using DBSCAN, PCA, Random Forests and Multinomial Logistic Regression to select the best features to classify objects.
OR
● Deep understanding of NLP, such as fundamentals of information retrieval, deep learning approaches, transformers, attention models, text summarisation, attribute extraction, etc. Preferably with experience in one or more of the following areas: recommender systems, moderation of user-generated content, sentiment analysis, etc.
OR
● Speech recognition, speech-to-text and text-to-speech, understanding of NLP and IR, text summarisation, statistical and deep learning approaches to text processing. Hands-on experience in these areas.
● Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc. Needs to appreciate deep learning frameworks like MXNet, Caffe2, Keras, and TensorFlow.
● Experience in working with GPUs to develop models and in handling terabyte-size datasets.
● Experience with common data science toolkits, such as R, Weka, NumPy, MATLAB, mlr, MLlib, Scikit-learn, caret, etc. Excellence in at least one of these is highly desirable.
● Should be able to work hands-on in Python, R, etc. Should collaborate closely with engineering teams to iteratively analyse data using Scala, Spark, Hadoop, Kafka, Storm, etc.
● Experience with NoSQL databases and familiarity with data visualization tools will be of great advantage
About Sahaj AI Software
Our name ‘Sahaj’ reflects the mission that binds us together. As a company of technology artisans, we do this by harnessing the individuality and diversity in every one of us along with an emphasis on first principles thinking. It is how we find simple and yet sophisticated solutions to our clients’ most complex problems.
With purpose-built solutions and technology advisory, we solve the most complex engineering problems for clients to improve data availability and craft AI solutions that adapt and evolve with their changing needs or technology advances. Without technology force-fits and over-designed solutions, everything we do is crafted specifically for each client.
Our emphasis on craft makes us different. Our solutions for clients are never assembled in a software factory or replicas of pre-built solutions designed for others. We are a company that thrives because of our people. Unlike a software factory that relies on regimented processes or rigid methods to reduce its dependence on human talent, everything we do at Sahaj is designed with a singular objective of encouraging human ingenuity.
Similar jobs
Responsibilities
- Work on execution and scheduling of all tasks related to assigned projects' deliverable dates
- Optimize and debug existing code to make it scalable and improve performance
- Design, development, and delivery of tested code and machine learning models into production environments
- Work effectively in teams, managing and leading teams
- Provide effective, constructive feedback to the delivery leader
- Manage client expectations and work with an agile mindset with machine learning and AI technology
- Design and prototype data-driven solutions
Eligibility
- Highly experienced in designing, building, and shipping scalable, production-quality machine learning algorithms in Python applications
- Working knowledge and experience in NLP core components (NER, Entity Disambiguation, etc.)
- In-depth expertise in Data Munging and Storage (Experienced in SQL, NoSQL, MongoDB, Graph Databases)
- Expertise in writing scalable APIs for machine learning models
- Experience with maintaining code logs, task schedulers, and security
- Working knowledge of machine learning techniques, feed-forward, recurrent and convolutional neural networks, entropy models, supervised and unsupervised learning
- Experience with at least one of the following: Keras, TensorFlow, Caffe, or PyTorch
Principal Accountabilities:
1. Good at communication and at converting business requirements into functional requirements
2. Develop data-driven insights and machine learning models to identify and extract facts from sales, supply chain and operational data
3. Sound Knowledge and experience in statistical and data mining techniques: Regression, Random Forest, Boosting Trees, Time Series Forecasting, etc.
5. Experience in SOTA Deep Learning techniques to solve NLP problems.
6. End-to-end data collection, model development and testing, and integration into production environments.
7. Build and prototype analysis pipelines iteratively to provide insights at scale.
8. Experience in querying different data sources
9. Partner with developers and business teams on business-oriented decisions
10. Looking for someone who dares to move on even when the path is not clear and be creative to overcome challenges in the data.
● Statistics - Always makes data-driven decisions using tools from statistics, such as: populations and sampling, normal distribution and central limit theorem, mean, median, mode, variance, standard deviation, covariance, correlation, p-value, expected value, conditional probability and Bayes' theorem
● Machine Learning
○ Solid grasp of attention mechanisms, transformers, convolutions, optimisers, loss functions, LSTMs, forget gates, and activation functions.
○ Can implement all of these from scratch in PyTorch, TensorFlow or NumPy.
○ Comfortable defining own model architectures, custom layers and loss functions.
● Modelling
○ Comfortable with using all the major ML frameworks (PyTorch, TensorFlow, scikit-learn, etc.) and NLP models (not essential). Able to pick the right library and framework for the job.
○ Capable of turning research and papers into operational execution and functionality delivery.
- Provide insights based on data to business teams
- Develop framework, solutions and recommendations for business problems
- Build ML models for predictive solutions
- Use advanced data science techniques to build business solutions
- Automation / optimization of new and existing models, ensuring smooth, timely and accurate execution with the lowest possible turnaround time (TAT).
- Design & maintenance of response tracking, measurement, and comparison of success parameters of various projects.
- Ability to handle large volumes of data with ease using multiple tools such as Python, R, etc.
- Experience in modeling techniques and hands-on experience in building Logistic Regression models, Random Forest, K-means clustering, NLP, Decision Trees, Boosting techniques, etc.
- Good at data interpretation and reasoning skills
- Modeling complex problems, discovering insights, and identifying opportunities through the use of statistical, algorithmic, mining, and visualization techniques
- Experience working with business teams to understand requirements, create the problem statement, and build scalable and dependable analytical solutions
- Must have hands-on and strong experience in Python
- Broad knowledge of fundamentals and state-of-the-art in NLP and machine learning
- Strong analytical & algorithm development skills
- Deep knowledge of techniques such as Linear Regression, gradient descent, Logistic Regression, Forecasting, Cluster analysis, Decision trees, Linear Optimization, Text Mining, etc
- Ability to collaborate across teams and strong interpersonal skills
Skills
- Sound theoretical knowledge of ML algorithms and their applications
- Hands-on experience in statistical modeling tools such as R, Python, and SQL
- Hands-on experience in Machine learning/data science
- Strong knowledge of statistics
- Experience in advanced analytics / Statistical techniques – Regression, Decision trees, Ensemble machine learning algorithms, etc
- Experience in Natural Language Processing & Deep Learning techniques
- Pandas, NLTK, Scikit-learn, spaCy, TensorFlow
We are looking for an outstanding Big Data Engineer with experience setting up and maintaining Data Warehouses and Data Lakes for an organization. This role will collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Roles and Responsibilities:
- Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
- Develop programs in Scala and Python as part of data cleaning and processing.
- Assemble large, complex data sets that meet functional and non-functional business requirements, and foster data-driven decision making across the organization.
- Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
- Implement processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available for the key stakeholders and business processes that depend on it.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Maintain operational excellence, ensuring high availability and platform stability.
- Collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Skills:
- Experience with Big Data pipeline, Big Data analytics, Data warehousing.
- Experience with SQL/No-SQL, schema design and dimensional data modeling.
- Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with the Big Data technology stack such as HBase, Hadoop, Hive, MapReduce.
- Experience in designing systems that process structured as well as unstructured data at large scale.
- Experience in AWS/Spark/Java/Scala/Python development.
- Strong skills in PySpark (Python and Spark). Ability to create, manage and manipulate Spark DataFrames. Expertise in Spark query tuning and performance optimization.
- Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
- Prior exposure to streaming data sources such as Kafka.
- Knowledge of shell scripting and Python scripting.
- High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
- Experience with NoSQL databases such as Cassandra / MongoDB.
- Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
- Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
- Experience building and deploying applications on on-premise and cloud-based infrastructure.
- Good understanding of the machine learning landscape and concepts.
Qualifications and Experience:
Engineering and post graduate candidates, preferably in Computer Science, from premier institutions with proven work experience as a Big Data Engineer or a similar role for 3-5 years.
Certifications:
Good to have at least one of the Certifications listed here:
AZ 900 - Azure Fundamentals
DP 200, DP 201, DP 203, AZ 204 - Data Engineering
AZ 400 - DevOps Certification
CommerceIQ is Hiring Data Scientist (3-5 yrs)
At CommerceIQ, we are building the world’s most sophisticated E-commerce Channel Optimization software to help brands leverage Machine Learning, Analytics and Automation to grow their E-commerce business on all channels, globally.
Using CommerceIQ as a single source of truth, customers have driven a 40% increase in incremental sales, a 20% improvement in profitability and a 32% reduction in out-of-stock rates on Amazon.
What You’ll Be Doing
As a Senior Data Scientist, you will work closely with Engineering/Product/Operations teams to build state-of-the-art ML-based solutions for B2B SaaS products. This entails not only leveraging advanced techniques for prediction, time-series forecasting, topic modelling and optimisation, but also a deep understanding of the business and the product.
- Apply excellent problem solving skills to deconstruct and formulate solutions from first-principles
- Work on data science roadmap and build the core engine of our flagship CommerceIQ product
- Collaborate with product and engineering to design product strategy, identify key metrics to drive and support with proof of concept
- Perform rapid prototyping of experimental solutions and develop robust, sustainable and scalable production systems
- Work with large-scale e-commerce data of the biggest brands on Amazon
- Apply out-of-the-box, advanced algorithms to complex problems in real-time systems
- Drive productization of techniques to be made available to a wide range of customers
- You would be working with and mentoring fellow team members on the owned charter
What we are looking for -
- Bachelor’s or Master’s in Computer Science or Maths/Stats from a reputed college with 4+ years of experience in solving data science problems that have driven value to customers
- Good depth and breadth in machine learning (theory and practice), optimization methods, data mining, statistics and linear algebra. Experience in NLP would be an advantage
- Hands-on programming skills and ability to write modular and scalable code in Python/R. Knowledge of SQL is required
- Familiarity with distributed computing architecture like Spark, Map-Reduce paradigm and Hadoop will be an added advantage
- Strong spoken and written communication skills, able to explain complex ideas in a simple, intuitive manner, write/maintain good technical documentation on projects
- Experience with building ML data products in an engineering organization interfacing with other teams and departments to deliver impact
- We are looking for candidates who are curious and self-starters; obsess over customer problems to deliver maximum value to them.
- Data scientist, Machine Learning, data science, data analyst
Job Type: Full-time
Experience:
- Data Scientist: 3 years (Required)
Application Question:
- Looking for product-based industry experience from tier 1 / tier 2 colleges (NIT, BIT, IIT, IIIT, BITS, Strong Profiles)
MTX Group Inc. is seeking a motivated Technical Lead - AI to join our team. MTX Group Inc. is a global implementation partner enabling organizations to become fit enterprises. MTX provides expertise across various platforms and technologies, including Google Cloud, Salesforce, artificial intelligence/machine learning, data integration, data governance, data quality, analytics, visualization and mobile technology. MTX’s very own Artificial Intelligence platform, Maverick, enables clients to accelerate processes and critical decisions by leveraging a Cognitive Decision Engine, a collection of purpose-built Artificial Neural Networks designed to leverage the power of Machine Learning. The Maverick Platform includes Smart Asset Detection and Monitoring, Chatbot Services, and Document Verification, to name a few.
Responsibilities:
- Extensive research and development of new AI/ML techniques that enable learning the semantics of data (images, video, text, audio, speech, etc.)
- Improving the existing ML and DNN models and products through R&D on cutting edge technologies
- Collaborate with Machine Learning teams to drive innovation of complex and accurate cognitive systems
- Collaborate with Engineering and Core team to drive innovation of scalable ML and AI serving production platforms
- Create POCs to quickly test a new model architecture and create improvement over an existing methodology
- Introduce major innovations that can result in better product features and develop strategies and plans required to drive these
- Lead a team, collaborate with product managers, conduct technical reviews of complex implementations, and provide optimisation best practices
What you will bring:
- 4-6 years of Experience
- Experience in neural networks, graphical models, reinforcement learning, and natural language processing
- Experience in Computer Vision techniques and image detection neural network models like semantic segmentation, instance segmentation, object detection, etc
- In-depth understanding of benchmarking, parallel computing, distributed computing, machine learning, and AI
- Programming experience in one or more of the following: Python, C, C++, C#, Java, R, and toolkits such as Tensorflow, Keras, PyTorch, Caffe, MxNet, SciPy, SciKit, etc
- Ability to perform research that is justified and guided by business opportunities
- Demonstrated successful implementation of industry-grade AI solutions in the past
- Ability to lead a team of AI engineers in an agile development environment
What we offer:
- Group Medical Insurance (Family Floater Plan - Self + Spouse + 2 Dependent Children)
  - Sum Insured: INR 5,00,000/-
  - Maternity cover up to two children
  - Inclusive of COVID-19 Coverage
  - Cashless & Reimbursement facility
  - Access to free online doctor consultation
- Personal Accident Policy (Disability Insurance)
  - Sum Insured: INR 25,00,000/- Per Employee
  - Accidental Death and Permanent Total Disability is covered up to 100% of Sum Insured
  - Permanent Partial Disability is covered as per the scale of benefits decided by the Insurer
  - Temporary Total Disability is covered
- An option of Paytm Food Wallet (up to Rs. 2500) as a tax saver benefit
- Monthly Internet Reimbursement of up to Rs. 1,000
- Opportunity to pursue Executive Programs/ courses at top universities globally
- Professional Development opportunities through various MTX sponsored certifications on multiple technology stacks including Salesforce, Google Cloud, Amazon & others