Requirements
Experience
- 5+ years of professional experience implementing MLOps frameworks to scale ML in production.
- Hands-on experience with Kubernetes, Kubeflow, MLflow, SageMaker, and other ML model experiment management tools, including training, inference, and evaluation.
- Experience in ML model serving (TorchServe, TensorFlow Serving, NVIDIA Triton Inference Server, etc.).
- Proficiency with ML model training frameworks (PyTorch, PyTorch Lightning, TensorFlow, etc.).
- Experience with GPU computing for data-parallel and model-parallel training.
- Solid software engineering skills in developing systems for production.
- Strong expertise in Python.
- Experience building end-to-end data systems as an ML Engineer, Platform Engineer, or equivalent.
- Experience working with cloud data processing technologies (S3, ECR, Lambda, AWS, Spark, Dask, Elasticsearch, Presto, SQL, etc.).
- Geospatial or remote sensing experience is a plus.
Similar jobs
Machine Learning & Deep Learning – Strong
Experienced in TensorFlow, PyTorch, ONNX, object detection, and pretrained models like YOLO, SSD, Faster R-CNN, etc.
Python – Strong
NumPy, Pandas, OpenCV
Problem Solving – Strong
C++ – Average
Working experience with C++ in any domain is a plus.
Note: Looking for candidates with a notice period of immediate to 30 days.
About Us
Censius is a US-based product company that is enabling AI at scale for enterprises. We are unlocking MLOps scalability by building the world's fastest way to deploy models and are amongst the earliest companies to tackle Model Performance Management. At Censius, you will get to solve difficult problems in a very nascent, but rapidly growing, area.
About the role
In this role, you will design and implement a generic ML platform that helps monitor models across modalities in production. You will collaborate with the research and development teams to build robust ML and big data monitoring platforms.
Responsibilities
* Work on large-scale machine learning challenges that impact millions of people around the globe
* Research and implement cutting-edge algorithms, and build pipelines that work with massive data sets in real time
* Implement fast, scalable solutions that perform well day in and day out
* Because machine learning is the core of our business, this role is responsible for all phases of the product development lifecycle
* Evaluate and validate analyses with statistical methods and explain them to people unfamiliar with data science
* Write specifications for algorithms, reports on data analysis, and documentation of algorithms, and collaborate with product teams
Skills and attributes for success
We want you to have
* Strong programming skills in Python.
* Working experience with a variety of ML techniques (decision trees, clustering, boosting, bagging, neural networks, etc.)
* Working experience with advanced statistical concepts (outliers, distance, regression, distributions, statistical tests, etc.)
* Hands-on experience with one or more machine learning frameworks (PyTorch, Keras, TensorFlow, XGBoost) and libraries (Pandas, NumPy, Scikit-learn)
* Familiarity with ML platforms like MLflow, Weights & Biases, Kubeflow, and AWS SageMaker.
It'd be nice if you have
* Passion for developing data products from scratch and a high level of proactiveness
* Knowledge of reinforcement learning and large-scale optimisation problems is a big plus
* Some experience in project management and mentoring is also a plus.
* Knowledge and experience in deploying large-scale systems using distributed and cloud-based systems (Hadoop, Amazon EC2) is a big plus.
You will excel in this role if
* You are scrappy, take ownership, and follow through to the very end
* You enjoy wearing multiple hats
* You have a sincere desire to learn and grow. We're quite small, so the desire to learn and grow as the company grows is essential!
Benefits
- Competitive Salary 💸
- Work Remotely 🌎
- Health insurance 🏥
- Unlimited Time Off ⏰
- Support for continual learning (free books and online courses) 📚
- Reimbursement for streaming services (think Netflix) 🎥
- Reimbursement for gym or physical activity of your choice 🏋🏽‍♀️
- Flex hours 💪
- Leveling Up Opportunities 🌱
Job Description: Data Scientist
At Propellor.ai, we derive insights that allow our clients to make scientific decisions. We believe in demanding more from the fields of Mathematics, Computer Science, and Business Logic. Combine these and we show our clients a 360-degree view of their business. In this role, the Data Scientist will be expected to work on Procurement problems along with a team based across the globe.
We are a Remote-First Company.
Read more about us here: https://www.propellor.ai/consulting
What will help you be successful in this role
- Articulate
- High Energy
- Passion to learn
- High sense of ownership
- Ability to work in a fast-paced and deadline-driven environment
- Loves technology
- Highly skilled at Data Interpretation
- Problem solver
- Ability to narrate the story to the business stakeholders
- Ability to generate insights and turn them into actions and decisions
Skills to work in a challenging, complex project environment
- Natural curiosity and a passion for understanding consumer behavior
- A high level of motivation, passion, and high sense of ownership
- Excellent communication skills needed to manage an incredibly diverse slate of work, clients, and team personalities
- Flexibility to work on multiple projects in a deadline-driven, fast-paced environment
- Ability to work with ambiguity and manage chaos
Key Responsibilities
- Analyze data to unlock insights: Ability to identify relevant insights and actions from data. Use regression, cluster analysis, time series, etc. to explore relationships and trends in response to stakeholder questions and business challenges.
- Bring in AI and ML experience: Apply industry experience to build efficient and optimal machine learning solutions.
- Exploratory Data Analysis (EDA) and Generate Insights: Analyze internal and external datasets using analytical techniques, tools, and visualization methods. Ensure pre-processing/cleansing of data and evaluate data points across the enterprise landscape and/or external data points that can be leveraged in machine learning models to generate insights.
- DS and ML Model Identification and Training: Identify, test, and train machine learning models that need to be leveraged for business use cases. Evaluate models based on interpretability, performance, and accuracy as required. Experiment and identify features from datasets that will help influence model outputs. Determine what models will need to be deployed, data points that need to be fed into models, and aid in the deployment and maintenance of models.
Technical Skills
We are looking for an enthusiastic individual with the following skills. Please do not hesitate to apply if you do not match all of them. We are open to promising candidates who are passionate about their work, learn fast, and are team players.
- Strong experience with machine learning and AI, including regression, forecasting, time series, cluster analysis, classification, image recognition, NLP, text analytics, and computer vision.
- Strong experience with advanced analytics tools for object-oriented/object function scripting using languages such as Python or similar.
- Strong experience with popular database programming languages including SQL.
- Strong experience in Spark/PySpark
- Experience working with Databricks
What company benefits do you get when you join us?
- Permanent Work from Home Opportunity
- Opportunity to work with Business Decision Makers and an internationally based team
- A work environment that offers limitless learning
- A culture free of bureaucracy and hierarchy
- A culture of openness, directness, and mutual respect
- A fun, high-caliber team that trusts you and provides the support and mentorship to help you grow
- The opportunity to work on high-impact business problems that are already defining the future of Marketing and improving real lives
To know more about how we work: https://bit.ly/3Oy6WlE
Whom will you work with?
You will closely work with other Senior Data Scientists and Data Engineers.
Candidates who can join immediately or within 15 days will be preferred.
We at Datametica Solutions Private Limited are looking for an SQL Lead / Architect who has a passion for the cloud and knowledge of different on-premises and cloud data implementations in the field of Big Data and Analytics, including but not limited to Teradata, Netezza, Exadata, Oracle, Cloudera, Hortonworks, and the like.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.
Job Description:
Experience: 6+ Years
Work Location: Pune / Hyderabad
Technical Skills:
- Good programming experience as an Oracle PL/SQL, MySQL, and SQL Server Developer
- Knowledge of database performance tuning techniques
- Rich experience in database development
- Experience in designing and implementing business applications using the Oracle Relational Database Management System
- Experience in developing complex database objects like Stored Procedures, Functions, Packages and Triggers using SQL and PL/SQL
Required Candidate Profile:
- Excellent communication, interpersonal, and analytical skills, and a strong ability to drive teams
- Analyze data requirements and data dictionaries for moderate to complex projects
- Lead data model related analysis discussions while collaborating with Application Development teams, Business Analysts, and Data Analysts during joint requirements analysis sessions
- Translate business requirements into technical specifications with an emphasis on highly available and scalable global solutions
- Stakeholder management and client engagement skills
- Strong communication skills (written and verbal)
About Us!
A global leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.
We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, and Greenplum systems, along with ETLs like Informatica, Datastage, AbInitio, and others, to cloud-based data warehousing, with further capabilities in data engineering, advanced analytics solutions, data management, data lakes, and cloud optimization.
Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.
We have our own products!
Eagle: Data Warehouse Assessment & Migration Planning Product
Raven: Automated Workload Conversion Product
Pelican: Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.
Why join us!
Datametica is a place to innovate, bring new ideas to life, and learn new things. We believe in building a culture of innovation, growth, and belonging. Our people and their dedication over the years are the key factors in our success.
Benefits we Provide!
Working with highly technical, passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy
Check out more about us on our website below!
www.datametica.com
- The key responsibility is to design and develop a data pipeline for real-time data integration, processing, model execution (if required), and exposing output via MQ / API / NoSQL DB for consumption
- Provide technical expertise to design efficient data ingestion solutions to store and process unstructured data, such as documents, audio, images, weblogs, etc.
- Developing API services to provide data as a service
- Prototyping Solutions for complex data processing problems using AWS cloud-native solutions
- Implementing automated Audit & Quality assurance Checks in Data Pipeline
- Document & maintain data lineage from various sources to enable data governance
- Coordinate with BIU, IT, and other stakeholders to provide best-in-class data pipeline solutions, exposing data via APIs, loading it into downstream systems, NoSQL databases, etc.
Skills
- Programming experience using Python & SQL
- Extensive working experience in Data Engineering projects, using AWS Kinesis, AWS S3, DynamoDB, EMR, Lambda, Athena, etc. for event processing
- Experience and expertise in implementing complex data pipelines
- Strong familiarity with the AWS toolset for storage and processing. Able to recommend the right tools/solutions available to address specific data processing problems
- Hands-on experience in processing unstructured data (audio, images, documents, weblogs, etc.)
- Good analytical skills with the ability to synthesize data to design and deliver meaningful information
- Knowledge of any NoSQL DB (DynamoDB, MongoDB, CosmosDB, etc.) will be an advantage.
- Ability to understand business functionality, processes, and flows
- Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently
Functional knowledge
- Real-time Event Processing
- Data Governance & Quality assurance
- Containerized deployment
- Linux
- Unstructured Data Processing
- AWS Toolsets for Storage & Processing
- Data Security
- Must have experience leading teams and driving customer interactions
- Must have multiple user stories of successful deployments
- Extensive hands-on experience with Apache Spark and HiveQL
- Sound knowledge in Amazon Web Services or any other Cloud environment.
- Experienced in data flow orchestration using Apache Airflow
- Experience with JSON, XML, CSV, and Parquet file formats with Snappy compression.
- Experience moving files between HDFS and AWS S3
- Experience in shell scripting and scripting to automate report generation and migration of reports to AWS S3
- Experience building data pipelines using Pandas and the Flask framework
- Good familiarity with Anaconda and Jupyter Notebook
- Use data to develop machine learning models that optimize decision making in Credit Risk, Fraud, Marketing, and Operations
- Implement data pipelines, new features, and algorithms that are critical to our production models
- Create scalable strategies to deploy and execute your models
- Write well-designed, testable, efficient code
- Identify valuable data sources and automate collection processes.
- Undertake preprocessing of structured and unstructured data.
- Analyze large amounts of information to discover trends and patterns.
Requirements:
- 2+ years of experience in applied data science or engineering with a focus on machine learning
- Python expertise with good knowledge of machine learning libraries, tools, techniques, and frameworks (e.g. pandas, sklearn, xgboost, lightgbm, logistic regression, random forest classifier, gradient boosting regressor, etc.)
- Strong quantitative and programming skills with a product-driven sensibility
DataWeave provides Retailers and Brands with “Competitive Intelligence as a Service” that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.
Data Science@DataWeave
We, the Data Science team at DataWeave (called Semantics internally), build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are innovation, business awareness, long-term thinking, and pushing the envelope. We are a fast-paced lab within the org, applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.
How we work?
It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems that there are. We are in the business of making sense of messy public data on the web. At serious scale!
What do we offer?
- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- Ability to see the impact of your work and the value you're adding to our customers almost immediately.
- Opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. Fun work environment. A flat hierarchy. Organization-wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors in the team.
- Last but not least, competitive salary packages and fast-paced growth opportunities.
Who are we looking for?
The ideal candidate is a strong software developer or a researcher with experience building and shipping production-grade data science applications at scale. Such a candidate has a keen interest in liaising with the business and product teams to understand a business problem and translate it into a data science problem. You are also expected to develop capabilities that open up new business productization opportunities.
We are looking for someone with a Master's degree (PhD preferred) and 6+ years of relevant experience working on problems in NLP or Computer Vision.
Key problem areas
- Preprocessing and feature extraction from noisy and unstructured data, both text and images.
- Keyphrase extraction, sequence labeling, entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, sentiment analysis.
- Image-based clustering and classification, segmentation, object detection, extracting text from images, generative models, recommender systems.
- Ensemble approaches for all the above problems using multiple text- and image-based techniques.
Relevant set of skills
- Have a strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills in multiple programming languages, with experience building production-grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real-world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc.). Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including NumPy, Matplotlib, scikit-learn, Keras, PyTorch, and TensorFlow.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Be a self-starter: someone who thrives in fast-paced environments with minimal ‘management’.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of your projects you have hosted on GitHub.
Role and responsibilities
- Understand the business problems we are solving. Build data science capabilities that align with our product strategy.
- Conduct research. Do experiments. Quickly build throwaway prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models in an iterative manner that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end-to-end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take the initiative to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrancy to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of the latest research in deep learning, NLP, Computer Vision, and other relevant areas.