Data Scientist
Cubera is a data company revolutionizing big data analytics and Adtech through data share value principles wherein the users entrust their data to us. We refine the art of understanding, processing, extracting, and evaluating the data that is entrusted to us. We are a gateway for brands to increase their lead efficiency as the world moves towards web3.
What you’ll do?
- Build machine learning models, perform proof-of-concept, experiment, optimize, and deploy your models into production; work closely with software engineers to assist in productionizing your ML models.
- Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation, and serving.
- Research new and innovative machine learning approaches.
- Perform hands-on analysis and modeling of enormous data sets to develop insights that increase Ad Traffic and Campaign Efficacy.
- Collaborate with other data scientists, data engineers, product managers, and business stakeholders to build well-crafted, pragmatic data products.
- Actively take on new projects and constantly try to improve the existing models and infrastructure necessary for offline and online experimentation and iteration.
- Work with your team on ambiguous problem areas in existing or new ML initiatives
What are we looking for?
- Ability to write a SQL query to pull the data you need.
- Fluency in Python and familiarity with its scientific stack such as numpy, pandas, scikit learn, matplotlib.
- Experience in Tensorflow and/or R Modelling and/or PyTorch
- Ability to understand a business problem and translate, and structure it into a data science problem.
Job Category: Data Science
Job Type: Full Time
Job Location: Bangalore
Similar jobs
- A Natural Language Processing (NLP) expert with strong computer science fundamentals and experience in working with deep learning frameworks. You will be working at the cutting edge of NLP and Machine Learning.
Roles and Responsibilities
- Work as part of a distributed team to research, build and deploy Machine Learning models for NLP.
- Mentor and coach other team members
- Evaluate the performance of NLP models and ideate on how they can be improved
- Support internal and external NLP-facing APIs
- Keep up to date on current research around NLP, Machine Learning and Deep Learning
Mandatory Requirements
- Any graduation with at least 2 years of demonstrated experience as a Data Scientist.
Behavioral Skills
Strong analytical and problem-solving capabilities.
- Proven ability to multi-task and deliver results within tight time frames
- Must have strong verbal and written communication skills
- Strong listening skills and eagerness to learn
- Strong attention to detail and the ability to work efficiently in a team as well as individually
Technical Skills
Hands-on experience with
- NLP
- Deep Learning
- Machine Learning
- Python
- Bert
Preferred Requirements
- Experience in Computer Vision is preferred
Daily and monthly responsibilities
- Review and coordinate with business application teams on data delivery requirements.
- Develop estimation and proposed delivery schedules in coordination with development team.
- Develop sourcing and data delivery designs.
- Review data model, metadata and delivery criteria for solution.
- Review and coordinate with team on test criteria and performance of testing.
- Contribute to the design, development and completion of project deliverables.
- Complete in-depth data analysis and contribution to strategic efforts
- Complete understanding of how we manage data with focus on improvement of how data is sourced and managed across multiple business areas.
Basic Qualifications
- Bachelor’s degree.
- 5+ years of data analysis working with business data initiatives.
- Knowledge of Structured Query Language (SQL) and use in data access and analysis.
- Proficient in data management including data analytical capability.
- Excellent verbal and written communications also high attention to detail.
- Experience with Python.
- Presentation skills in demonstrating system design and data analysis solutions.
Purpose of Job:
Responsible to lead a team of analysts to build and deploy predictive models to infuse core
business functions with deep analytical insights. The Senior Data Scientist will also work
closely with the Kinara management team to investigate strategically important business questions.
Job Responsibilities:
Lead a team through the entire analytical and machine learning model life cycle:
Define the problem statement
Build and clean datasets
Exploratory data analysis
Feature engineering
Apply ML algorithms and assess the performance
Code for deployment
Code testing and troubleshooting
Communicate Analysis to Stakeholders
Manage Data Analysts and Data Scientists
Qualifications:
Education: MS/MTech/Btech graduates or equivalent with a focus on data science and
quantitative fields (CS, Engineering, Mathematics, Economics)
Work Experience: 5+ years in a professional role with 3+ years in ML/AI
Other Requirements: ⮚ Domain knowledge in Financial Services is a big plus
Skills & Competencies
Technical Skills
⮚ Aptitude in Math and Stats
⮚ Proven experience in the use of Python, SQL, DevOps
⮚ Excellent in programming (Python), stats tools, and SQL
⮚ Working knowledge of tools and utilities - AWS, Git, Selenium, Postman,Prefect, Airflow, PySpark
Soft Skills
⮚ Deep Curiosity and Humility
⮚ Strong communications verbal and written
Responsibilities:
• Designing Hive/HCatalog data model includes creating table definitions, file formats, compression techniques for Structured & Semi-structured data processing
• Implementing Spark processing based ETL frameworks
• Implementing Big data pipeline for Data Ingestion, Storage, Processing & Consumption
• Modifying the Informatica-Teradata & Unix based data pipeline
• Enhancing the Talend-Hive/Spark & Unix based data pipelines
• Develop and Deploy Scala/Python based Spark Jobs for ETL processing
• Strong SQL & DWH concepts.
Preferred Background:
• Function as integrator between business needs and technology solutions, helping to create technology solutions to meet clients’ business needs
• Lead project efforts in defining scope, planning, executing, and reporting to stakeholders on strategic initiatives
• Understanding of EDW system of business and creating High level design document and low level implementation document
• Understanding of Big Data Lake system of business and creating High level design document and low level implementation document
• Designing Big data pipeline for Data Ingestion, Storage, Processing & Consumption
- Conducting advanced statistical analysis to provide actionable insights, identify trends, and measure performance
- Performing data exploration, cleaning, preparation and feature engineering; in addition to executing tasks such as building a POC, validation/ AB testing
- Collaborating with data engineers & architects to implement and deploy scalable solutions
- Communicating results to diverse audiences with effective writing and visualizations
- Identifying and executing on high impact projects, triage external requests, and ensure timely completion for the results to be useful
- Providing thought leadership by researching best practices, conducting experiments, and collaborating with industry leaders
What you need to have:
- 2-4 year experience in machine learning algorithms, predictive analytics, demand forecasting in real-world projects
- Strong statistical background in descriptive and inferential statistics, regression, forecasting techniques.
- Strong Programming background in Python (including packages like Tensorflow), R, D3.js , Tableau, Spark, SQL, MongoDB.
- Preferred exposure to Optimization & Meta-heuristic algorithm and related applications
- Background in a highly quantitative field like Data Science, Computer Science, Statistics, Applied Mathematics,Operations Research, Industrial Engineering, or similar fields.
- Should have 2-4 years of experience in Data Science algorithm design and implementation, data analysis in different applied problems.
- DS Mandatory skills : Python, R, SQL, Deep learning, predictive analysis, applied statistics
We are establishing infrastructure for internal and external reporting using Tableau and are looking for someone with experience building visualizations and dashboards in Tableau and using Tableau Server to deliver them to internal and external users.
Required Experience
- Implementation of interactive visualizations using Tableau Desktop
- Integration with Tableau Server and support of production dashboards and embedded reports with it
- Writing and optimization of SQL queries
- Proficient in Python including the use of Pandas and numpy libraries to perform data exploration and analysis
- 3 years of experience working as a Software Engineer / Senior Software Engineer
- Bachelors in Engineering – can be Electronic and comm , Computer , IT
- Well versed with Basic Data Structures Algorithms and system design
- Should be capable of working well in a team – and should possess very good communication skills
- Self-motivated and fun to work with and organized
- Productive and efficient working remotely
- Test driven mindset with a knack for finding issues and problems at earlier stages of development
- Interest in learning and picking up a wide range of cutting edge technologies
- Should be curious and interested in learning some Data science related concepts and domain knowledge
- Work alongside other engineers on the team to elevate technology and consistently apply best practices
Highly Desirable
- Data Analytics
- Experience in AWS cloud or any cloud technologies
- Experience in BigData technologies and streaming like – pyspark, kafka is a big plus
- Shell scripting
- Preferred tech stack – Python, Rest API, Microservices, Flask/Fast API, pandas, numpy, linux, shell scripting, Airflow, pyspark
- Has a strong backend experience – and worked with Microservices and Rest API’s - Flask, FastAPI, Databases Relational and Non-relational
Location: Bengaluru
Department: - Engineering
Bidgely is looking for extraordinary and dynamic Senior Data Analyst to be part of its core team in Bangalore. You must have delivered exceptionally high quality robust products dealing with large data. Be part of a highly energetic and innovative team that believes nothing is impossible with some creativity and hard work.
● Design and implement a high volume data analytics pipeline in Looker for Bidgely flagship product.
● Implement data pipeline in Bidgely Data Lake
● Collaborate with product management and engineering teams to elicit & understand their requirements & challenges and develop potential solutions
● Stay current with the latest tools, technology ideas and methodologies; share knowledge by clearly articulating results and ideas to key decision makers.
● 3-5 years of strong experience in data analytics and in developing data pipelines.
● Very good expertise in Looker
● Strong in data modeling, developing SQL queries and optimizing queries.
● Good knowledge of data warehouse (Amazon Redshift, BigQuery, Snowflake, Hive).
● Good understanding of Big data applications (Hadoop, Spark, Hive, Airflow, S3, Cloudera)
● Attention to details. Strong communication and collaboration skills.
● BS/MS in Computer Science or equivalent from premier institutes.
Preferred Education & Experience:
-
Bachelor’s or master’s degree in Computer Engineering, Computer Science, Computer Applications, Mathematics, Statistics or related technical field or equivalent practical experience. Relevant experience of at least 3 years in lieu of above if from a different stream of education.
-
Well-versed in and 5+ years of hands-on demonstrable experience with:
▪ Data Analysis & Data Modeling
▪ Database Design & Implementation
▪ Database Performance Tuning & Optimization
▪ PL/pgSQL & SQL -
5+ years of hands-on development experience in Relational Database (PostgreSQL/SQL Server/Oracle).
-
5+ years of hands-on development experience in SQL, PL/PgSQL, including stored procedures, functions, triggers, and views.
-
Hands-on experience with demonstrable working experience in Database Design Principles, SQL Query Optimization Techniques, Index Management, Integrity Checks, Statistics, and Isolation levels
-
Hands-on experience with demonstrable working experience in Database Read & Write Performance Tuning & Optimization.
-
Knowledge and Experience working in Domain Driven Design (DDD) Concepts, Object Oriented Programming System (OOPS) Concepts, Cloud Architecture Concepts, NoSQL Database Concepts are added values
-
Knowledge and working experience in Oil & Gas, Financial, & Automotive Domains is a plus
-
Hands-on development experience in one or more NoSQL data stores such as Cassandra, HBase, MongoDB, DynamoDB, Elastic Search, Neo4J, etc. a plus.
- Strong in problem solving, algorithms and data structures
- Proficient in Python
- Hands on experience in technologies and tools related to any of NLP, Deep learning, Machine learning, Conversational AI
- Experience with knowledge Graph or any graph based system is plus
- Able to train and deploy models
- Broad knowledge of machine learning algorithms and principles
- Performance profiling and Tuning
- Communicate and propose solutions to business challenges
- Familiarity with at least one of the cloud computing infrastructure - GCP/AWS
- Keep abreast with the latest technological advances
- Team mentoring and leadership skills.
Job Description:
Roles & Responsibilities:
· You will be involved in every part of the project lifecycle, right from identifying the business problem and proposing a solution, to data collection, cleaning, and preprocessing, to training and optimizing ML/DL models and deploying them to production.
· You will often be required to design and execute proof-of-concept projects that can demonstrate business value and build confidence with CloudMoyo’s clients.
· You will be involved in designing and delivering data visualizations that utilize the ML models to generate insights and intuitively deliver business value to CXOs.
Desired Skill Set:
· Candidates should have strong Python coding skills and be comfortable working with various ML/DL frameworks and libraries.
· Hands-on skills and industry experience in one or more of the following areas is necessary:
1) Deep Learning (CNNs/RNNs, Reinforcement Learning, VAEs/GANs)
2) Machine Learning (Regression, Random Forests, SVMs, K-means, ensemble methods)
3) Natural Language Processing
4) Graph Databases (Neo4j, Apache Giraph)
5) Azure Bot Service
6) Azure ML Studio / Azure Cognitive Services
7) Log Analytics with NLP/ML/DL
· Previous experience with data visualization, C# or Azure Cloud platform and services will be a plus.
· Candidates should have excellent communication skills and be highly technical, with the ability to discuss ideas at any level from executive to developer.
· Creative problem-solving, unconventional approaches and a hacker mindset is highly desired.