About us

DataWeave provides Retailers and Brands with “Competitive Intelligence as a Service” that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.

Data Science@DataWeave

We, the Data Science team at DataWeave (called Semantics internally), build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are innovation, business awareness, long-term thinking, and pushing the envelope. We are a fast-paced lab within the org, applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.

How do we work?

It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems there are. We are in the business of making sense of messy public data on the web. At serious scale!

What do we offer?

- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- Ability to see the impact of your work and the value you're adding to our customers almost immediately.
- Opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. Fun work environment. A flat hierarchy. Organization-wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors in the team.
- Last but not least, competitive salary packages and fast-paced growth opportunities.

Who are we looking for?

The ideal candidate is a strong software developer or a researcher with experience building and shipping production-grade data science applications at scale. Such a candidate has a keen interest in liaising with the business and product teams to understand a business problem and translate it into a data science problem. You are also expected to develop capabilities that open up new business productization opportunities. We are looking for someone with a Master's degree (PhD preferred) and 6+ years of relevant experience working on problems in NLP or Computer Vision.

Key problem areas

- Preprocessing and feature extraction from noisy and unstructured data, both text and images.
- Keyphrase extraction, sequence labeling, entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, sentiment analysis.
- Image-based clustering and classification, segmentation, object detection, extracting text from images, generative models, recommender systems.
- Ensemble approaches for all the above problems using multiple text- and image-based techniques.

Relevant set of skills

- Have a strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills in multiple programming languages with experience building production-grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real-world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc.). Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including NumPy, Matplotlib, scikit-learn, Keras, PyTorch, TensorFlow.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Be a self-starter, someone who thrives in fast-paced environments with minimal ‘management’.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of the projects you have hosted on GitHub.

Role and responsibilities

- Understand the business problems we are solving. Build data science capabilities that align with our product strategy.
- Conduct research. Do experiments. Quickly build throwaway prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models in an iterative manner that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end-to-end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take the initiative to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrancy to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of the latest research in deep learning, NLP, Computer Vision, and other relevant areas.

Company Overview

Akridata is a US-based, early-stage startup providing data processing and data management solutions for certain edge-to-data-center pipelines related to AI workloads, catering to use cases that generate data on the order of 100 TB per day. We are a VC-funded startup incorporated in May 2018, with all software engineering done from the Bangalore center, which provides ample opportunities for ‘from-scratch’ design and development.

Role

We are in the process of setting up a “data science and algorithms” team, and this role is for a lead engineer who will form the team and lead all of its development activities. This team will develop high-performance algorithms for data summarization and importance scoring on data. If you are excited by the idea of developing a data science toolkit rather than using an existing one, then this role would be an ideal match.

What we are looking for

- Master's degree with a background in statistics and computer science.
- Good foundation in the theory of advanced linear algebra.
- Hands-on programming experience with Python in the area of data science.
- 4+ years of experience in data science activities involving building models for analysing large amounts of data.
- Experience with publishing academic papers and/or implementing prototypes based on ongoing research ideas.

Good to have

- Experience developing high-performance algorithms that account for system aspects such as memory consumption, CPU utilization, synchronization overheads, and communication overheads.
- Understanding of GPU architecture and experience implementing algorithms on GPUs.
- Experience leading a team of 2-3 data science engineers.