Please note - This is a 100% remote opportunity and you can work from any location.
About the team:
You will be a part of Cactus Labs which is the R&D Cell of Cactus Communications. Cactus Labs is a high impact cell that works to solve complex technical and business problems that help keep us strategically competitive in the industry. We are a multi-cultural team spread across multiple countries. We work in the domain of AI/ML especially with Text (NLP - Natural Language Processing), Language Understanding, Explainable AI, Big Data, AR/VR etc.
The opportunity: Within Cactus Labs you will work with the Big Data team. This team manages Terabytes of data coming from different sources. We are re-orchestrating data pipelines to handle this data at scale and improve visibility and robustness. We operate across all the three Cloud Platforms and leverage the best of them.
In this role, you will get to own a component end to end. You will also get to work on could platform and learn to design distributed data processing systems to operate at scale.
Responsibilities:
- Build and maintain robust data processing pipelines at scale
- Collaborate with a team of Big Data Engineers, Big Data and Cloud Architects and Domain SMEs to drive the product ahead
- Help junior team members in designing solutions and split their user stories
- Review team members’ code make sure standards are followed, verify unit test coverage
- Follow best practices in building and optimize existing processes
- Stay up to date with the progress in the domain since we work on cutting-edge technologies and are constantly trying new things out
- Build solutions for massive scale. This requires extensive benchmarking to pick the right approach
- Understand the data in and out and make sense of it. You will at times need to draw conclusions and present it to the business users
- Be independent, self-driven and highly motivated. While you will have the best people to learn from and access to various courses or training materials, we expect you to take charge of your growth and learning.
Expectations from you:
- 5-8 Years of relevant experience in Big Data preferable with pyspark
- Highly proficient in distributed computing and Big Data Ecosystem - Hadoop, HDFS, Apache Spark
- Good understanding of data lake and their importance in a Big Data Ecosystem
- Being able to mentor junior team members and review their code
- Experience in working in a Cloud Environment (AWS, Azure or GCP)
- You like to work without a lot of supervision or micromanagement.
- Above all, you get excited by data. You like to dive deep, mine patterns and draw conclusions. You believe in making data driven decisions and helping the team look for the pattern as well.
Preferred skills:
- Familiarity with search engines like Elasticsearch and Bigdata warehouses systems like AWS Athena, Google Big Query etc
- Building data pipelines using Airflow
- Experience of working in AWS Cloud Environment
About Cactus Communications
We seek professionals who see differently; who find opportunity where others don't; and who look within themselves and know that with the right support and team, they can impact the world!
Join us-
- If you like Remote work setup. We're a Remote-first organisation.
- If you are keen on getting a global exposure
- If you like the freedom to innovate & build products
- If you want to be part of the team that works in the domain of AI/ML especially with Text (NLP - Natural Language Processing), Language Understanding, Explainable AI, Big Data using many exciting new technologies
- If you're keen on being part of a culture that values people for their talent, personality, competency, and the ability to learn and grow
Similar jobs
Sizzle is an exciting new startup that’s changing the world of gaming. At Sizzle, we’re building AI to automate gaming highlights, directly from Twitch and YouTube streams. We’re looking for a superstar engineer that is well versed with AI and audio technologies around audio detection, speech-to-text, interpretation, and sentiment analysis.
You will be responsible for:
Developing audio algorithms to detect key moments within popular online games, such as:
Streamer speaking, shouting, etc.
Gunfire, explosions, and other in-game audio events
Speech-to-text and sentiment analysis of the streamer’s narration
Leveraging baseline technologies such as TensorFlow and others -- and building models on top of them
Building neural network architectures for audio analysis as it pertains to popular games
Specifying exact requirements for training data sets, and working with analysts to create the data sets
Training final models, including techniques such as transfer learning, data augmentation, etc. to optimize models for use in a production environment
Working with back-end engineers to get all of the detection algorithms into production, to automate the highlight creation
You should have the following qualities:
Solid understanding of AI frameworks and algorithms, especially pertaining to audio analysis, speech-to-text, sentiment analysis, and natural language processing
Experience using Python, TensorFlow and other AI tools
Demonstrated understanding of various algorithms for audio analysis, such as CNNs, LSTM for natural language processing, and others
Nice to have: some familiarity with AI-based audio analysis including sentiment analysis
Familiarity with AWS environments
Excited about working in a fast-changing startup environment
Willingness to learn rapidly on the job, try different things, and deliver results
Ideally a gamer or someone interested in watching gaming content online
Skills:
Machine Learning, Audio Analysis, Sentiment Analysis, Speech-To-Text, Natural Language Processing, Neural Networks, TensorFlow, OpenCV, AWS, Python
Work Experience: 2 years to 10 years
About Sizzle
Sizzle is building AI to automate gaming highlights, directly from Twitch and YouTube videos. Presently, there are over 700 million fans around the world that watch gaming videos on Twitch and YouTube. Sizzle is creating a new highlights experience for these fans, so they can catch up on their favorite streamers and esports leagues. Sizzle is available at www.sizzle.gg .
● Proficient in Python and using packages like NLTK, Numpy, Pandas
● Should have worked on deep learning frameworks (like Tensorflow, Keras, PyTorch, etc)
● Hands-on experience in Natural Language Processing, Sequence, and RNN Based models
● Mathematical intuition of ML and DL algorithms
● Should be able to perform thorough model evaluation by creating hypotheses on the basis of statistical
analyses
● Should be comfortable in going through open-source code and reading research papers.
Be a part of the growth story of a rapidly growing organization in AI. We are seeking a passionate Machine Learning (ML) Engineer, with a strong background in developing and deploying state-of-the-art models on Cloud. You will participate in the complete cycle of building machine learning models from conceptualization of ideas, data preparation, feature selection, training, evaluation, and productionization.
On a typical day, you might build data pipelines, develop a new machine learning algorithm, train a new model or deploy the trained model on the cloud. You will have a high degree of autonomy, ownership, and influence over your work, machine learning organizations' evolution, and the direction of the company.
Required Qualifications
- Bachelor's degree in computer science/electrical engineering or equivalent practical experience
- 7+ years of Industry experience in Data Science, ML/AI projects. Experience in productionizing machine learning in the industry setting
- Strong grasp of statistical machine learning, linear algebra, deep learning, and computer vision
- 3+ years experience with one or more general-purpose programming languages including but not limited to: R, Python.
- Experience with PyTorch or TensorFlow or other ML Frameworks.
- Experience in using Cloud services such as AWS, GCP, Azure. Understand the principles of developing cloud-native application development
In this role you will:
- Design and implement ML components, systems and tools to automate and enable our various AI industry solutions
- Apply research methodologies to identify the machine learning models to solve a business problem and deploy the model at scale.
- Own the ML pipeline from data collection, through the prototype development to production.
- Develop high-performance, scalable, and maintainable inference services that communicate with the rest of our tech stack
Company Name: Curl Tech
Location: Bangalore
Website: www.curl.tech
Company Profile: Curl Tech is a deep-tech firm, based out of Bengaluru, India. Curl works on developing Products & Solutions leveraging emerging technologies such as Machine Learning, Blockchain (DLT) & IoT. We work on domains such as Commodity Trading, Banking & Financial Services, Healthcare, Logistics & Retail.
Curl has been founded by technology enthusiasts with rich industry experience. Products and solutions that have been developed at Curl, have gone on to have considerable success and have in turn become separate companies (focused on that product / solution).
If you are looking for a job, that would challenge you and desire to work with an organization that disrupts entire value chain; Curl is the right one for you!
Designation: Data Scientist or Junior Data Scientist (according to experience)
Job Description:
Good with Machine Learning and Deep learning, good with programming and maths.
Details: The candidate will be working on many image analytics/ numerical data analytics projects. The work involves, data collection, building the machine learning models, deployment, client interaction and publishing academic papers.
Responsibilities:
-
The candidate will be working on many image analytics/numerical data projects.
-
Candidate will be building various machine learning models depending upon the requirements.
-
Candidate would be responsible for deployment of the machine learning models.
-
Candidate would be the face of the company in front of the clients and will have regular client interactions to understand that client requirements.
What we are looking for candidates with:
-
Basic Understanding of Statistics, Time Series, Machine Learning, Deep Learning, and their fundamentals and mathematical underpinnings.
-
Proven code proficiency in Python,C/C++ or any other AI language of choice.
-
Strong algorithmic thinking, creative problem solving and the ability to take ownership and do independent
research.
-
Understanding how things work internally in ML and DL models is a must.
-
Understanding of the fundamentals of Computer Vision and Image Processing techniques would be a plus.
-
Expertise in OpenCV, ML/Neural networks technologies and frameworks such as PyTorch, Tensorflow would be a
plus.
-
Educational background in any quantitative field (Computer Science / Mathematics / Computational Sciences and related disciplines) will be given preference.
Education: BE/ BTech/ B.Sc.(Physics or Mathematics)/Masters in Mathematics, Physics or related branches.
- 4+ years of experience Solid understanding of Python, Java and general software development skills (source code management, debugging, testing, deployment etc.).
- Experience in working with Solr and ElasticSearch Experience with NLP technologies & the handling of unstructured text Detailed understanding of text pre-processing and normalisation techniques such as tokenisation, lemmatisation, stemming, POS tagging etc.
- Prior experience in implementation of traditional ML solutions - classification, regression or clustering problem Expertise in text-analytics - Sentiment Analysis, Entity Extraction, Language modelling - and associated sequence learning models ( RNN, LSTM, GRU).
- Comfortable working with deep-learning libraries (eg. PyTorch)
- Candidate can even be a fresher with 1 or 2 years of experience IIIT, IIIT, Bits Pilani, top 5 local colleges are preferred colleges and universities.
- A Masters candidate in machine learning.
- Can source candidates from Mu Sigma and Manthan.
Responsibilities Description:
Responsible for the development and implementation of machine learning algorithms and techniques to solve business problems and optimize member experiences. Primary duties may include are but not limited to: Design machine learning projects to address specific business problems determined by consultation with business partners. Work with data-sets of varying degrees of size and complexity including both structured and unstructured data. Piping and processing massive data-streams in distributed computing environments such as Hadoop to facilitate analysis. Implements batch and real-time model scoring to drive actions. Develops machine learning algorithms to build customized solutions that go beyond standard industry tools and lead to innovative solutions. Develop sophisticated visualization of analysis output for business users.
Experience Requirements:
BS/MA/MS/PhD in Statistics, Computer Science, Mathematics, Machine Learning, Econometrics, Physics, Biostatistics or related Quantitative disciplines. 2-4 years of experience in predictive analytics and advanced expertise with software such as Python, or any combination of education and experience which would provide an equivalent background. Experience in the healthcare sector. Experience in Deep Learning strongly preferred.
Required Technical Skill Set:
- Full cycle of building machine learning solutions,
o Understanding of wide range of algorithms and their corresponding problems to solve
o Data preparation and analysis
o Model training and validation
o Model application to the problem
- Experience using the full open source programming tools and utilities
- Experience in working in end-to-end data science project implementation.
- 2+ years of experience with development and deployment of Machine Learning applications
- 2+ years of experience with NLP approaches in a production setting
- Experience in building models using bagging and boosting algorithms
- Exposure/experience in building Deep Learning models for NLP/Computer Vision use cases preferred
- Ability to write efficient code with good understanding of core Data Structures/algorithms is critical
- Strong python skills following software engineering best practices
- Experience in using code versioning tools like GIT, bit bucket
- Experience in working in Agile projects
- Comfort & familiarity with SQL and Hadoop ecosystem of tools including spark
- Experience managing big data with efficient query program good to have
- Good to have experience in training ML models in tools like Sage Maker, Kubeflow etc.
- Good to have experience in frameworks to depict interpretability of models using libraries like Lime, Shap etc.
- Experience with Health care sector is preferred
- MS/M.Tech or PhD is a plus
Senior Engineer – Artificial Intelligence / Computer Vision
(Business Unit – Autonomous Vehicles & Automotive - AVA)
We are seeking an exceptional, experienced senior engineer with deep expertise in Computer Vision, Neural Networks, 3D Scene Understanding and Sensor Data Processing. The expectation is to lead a growing team of engineers to help them build and deliver customized solutions for our clients. A solid engineering as well as team management background is a must.
About MulticoreWare Inc
MulticoreWare Inc is a software and solutions development company with top-notch talent and skill in a variety of micro-architectures, including multi-thread, multi-core, and heterogeneous hardware platforms. It works in sectors including High Performance Computing (HPC), Media & AI Analytics, Video Solutions, Autonomous Vehicle and Automotive software, all of which are rapidly expanding. The Autonomous Vehicles & Automotive business unit specializes in delivering optimized solutions for sophisticated sensor fusion intelligence and the design of algorithms & implementation of software to be deployed on a variety of automotive grade hardware platforms.
Role Responsibilities
● Lead a team to solve the problems in a perception / autonomous-systems scope and turn ideas into code & products
● Drive all technical elements of development, such as project requirements definition, design, implementation, unit testing, integration, and software delivery
● Implementing cutting edge AI solutions on embedded platforms and optimizing them for performance. Hardware architecture aware algorithm design and development
● Contribute to the vision and long-term strategy of the business unit
Required Qualifications (Must Have)
● 3 - 7 years of experience with real world system building, including design, coding (C++/Python) and evaluation/testing (C++/Python)
● Solid experience in 2D / 3D Computer Vision algorithms, Machine Learning and Deep Learning fundamentals – Theory & Practice. Hands-on experience with Deep Learning frameworks like Caffe, TensorFlow or PyTorch
● Expert level knowledge in any of the courses related Signal Data Processing / Autonomous or Robotics software development (Perception, Localization, Prediction, Planning), multi-object tracking, sensor fusion algorithms and familiarity on Kalman filters, particle filters, clustering methods etc.
● Good project management and execution capabilities, as well as good communication and coordination ability
● Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or related fields
Preferred Qualifications (Nice-to-Have)
● GPU architecture and CUDA programming experience, as well as knowledge of AI inference optimization using Quantization, Compression (or) Model Pruning
● Track record of research excellence with prior publication on top-tier conferences and journals
One large human need is that of sharing thoughts and connecting with people of the same
Koo was founded in March 2020, as a micro-blogging platform in both Indian languages and
Technology Team & Culture
Job Description
We are looking for an outstanding ML Architect (Deployments) with expertise in deploying Machine Learning solutions/models into production and scaling them to serve millions of customers. A candidate with an adaptable and productive working style which fits in a fast-moving environment.
Skills:
- 5+ years deploying Machine Learning pipelines in large enterprise production systems.
- Experience developing end to end ML solutions from business hypothesis to deployment / understanding the entirety of the ML development life cycle.
- Expert in modern software development practices; solid experience using source control management (CI/CD).
- Proficient in designing relevant architecture / microservices to fulfil application integration, model monitoring, training / re-training, model management, model deployment, model experimentation/development, alert mechanisms.
- Experience with public cloud platforms (Azure, AWS, GCP).
- Serverless services like lambda, azure functions, and/or cloud functions.
- Orchestration services like data factory, data pipeline, and/or data flow.
- Data science workbench/managed services like azure machine learning, sagemaker, and/or AI platform.
- Data warehouse services like snowflake, redshift, bigquery, azure sql dw, AWS Redshift.
- Distributed computing services like Pyspark, EMR, Databricks.
- Data storage services like cloud storage, S3, blob, S3 Glacier.
- Data visualization tools like Power BI, Tableau, Quicksight, and/or Qlik.
- Proven experience serving up predictive algorithms and analytics through batch and real-time APIs.
- Solid working experience with software engineers, data scientists, product owners, business analysts, project managers, and business stakeholders to design the holistic solution.
- Strong technical acumen around automated testing.
- Extensive background in statistical analysis and modeling (distributions, hypothesis testing, probability theory, etc.)
- Strong hands-on experience with statistical packages and ML libraries (e.g., Python scikit learn, Spark MLlib, etc.)
- Experience in effective data exploration and visualization (e.g., Excel, Power BI, Tableau, Qlik, etc.)
- Experience in developing and debugging in one or more of the languages Java, Python.
- Ability to work in cross functional teams.
- Apply Machine Learning techniques in production including, but not limited to, neuralnets, regression, decision trees, random forests, ensembles, SVM, Bayesian models, K-Means, etc.
Roles and Responsibilities:
Deploying ML models into production, and scaling them to serve millions of customers.
Technical solutioning skills with deep understanding of technical API integrations, AI / Data Science, BigData and public cloud architectures / deployments in a SaaS environment.
Strong stakeholder relationship management skills - able to influence and manage the expectations of senior executives.
Strong networking skills with the ability to build and maintain strong relationships with both business, operations and technology teams internally and externally.
Provide software design and programming support to projects.
Qualifications & Experience:
Engineering and post graduate candidates, preferably in Computer Science, from premier institutions with proven work experience as a Machine Learning Architect (Deployments) or a similar role for 5-7 years.
empower healthcare payers, providers and members to quickly process medical data to
make informed decisions and reduce health care costs. You will be focusing on research,
development, strategy, operations, people management, and being a thought leader for
team members based out of India. You should have professional healthcare experience
using both structured and unstructured data to build applications. These applications
include but are not limited to machine learning, artificial intelligence, optical character
recognition, natural language processing, and integrating processes into the overall AI
pipeline to mine healthcare and medical information with high recall and other relevant
metrics. The results will be used dually for real-time operational processes with both
automated and human-based decision making as well as contribute to reducing
healthcare administrative costs. We work with all major cloud and big data vendors
offerings including (Azure, AWS, Google, IBM, etc.) to achieve our goals in healthcare and
support
The Director, Data Science will have the opportunity to build a team, shape team culture
and operating norms as a result of the fast-paced nature of a new, high-growth
organization.
• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps and product delivery
• Experience building stakeholder trust and confidence in deployed models especially via application of the algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• Provide mentoring to data scientists and machine learning engineers as well as career development
• Meet project related team members for individual specific needs on a regular basis related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership
The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience including but not limited to usage of claims data
• Delivered multiple data science and machine learning projects over 8+ years with values exceeding $10 Million or more and has worked on platform members exceeding 10 million lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real time streaming applications, DevOps, and product delivery
• Knows how to solve and launch real artificial intelligence and data science related problems and products along with managing and coordinating the
business process change, IT / cloud operations, meeting production level code standards
• Ownerships of key workflows part of data science life cycle like data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models especially via application of algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research, and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• 3+ Years of experience managing directly five (5) or more senior level data scientists, machine learning engineers with advanced degrees and directly
made staff decisions
• Very strong understanding of mathematical concepts including but not limited to linear algebra, advanced calculus, partial differential equations, and
statistics including Bayesian approaches at master’s degree level and above
• 6+ years of programming experience in C++ or Java or Scala and data science programming languages like Python and R including strong understanding of
concepts like data structures, algorithms, compression techniques, high performance computing, distributed computing, and various computer architecture
• Very strong understanding and experience with traditional data science approaches like sampling techniques, feature engineering, classification, and
regressions, SVM, trees, model evaluations with several projects over 3+ years
• Very strong understanding and experience in Natural Language Processing,
reasoning, and understanding, information retrieval, text mining, search, with
3+ years of hands on experience
• Experience with developing and deploying several products in production with
experience in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the
following cloud vendors like AWS, Azure, and Google
• Three plus (3+) years hands on experience with MapR \ Cloudera \ Databricks
Big Data platform with Spark, Hive, Kafka etc.
• Three plus (3+) years of experience with high-performance computing like
Dask, CUDA distributed GPU, TPU etc.
• Presented at major conferences and/or published materials