Founded in 2018, digital consumer goods company Thrasio is the largest acquirer of Amazon FBA businesses, with an innovation platform that brings high-quality products to market across digital marketplaces and retailers globally. Having evaluated more than 5,000 Amazon companies, acquired nearly 100 top-rated brands, and managed nearly 14,000 category-leading products, Thrasio's brands outperform almost every other seller on Amazon. Since our founding, the team has grown to nearly 600 people globally; 78% of that growth has occurred during the COVID-19 pandemic. Hiring people who share a passion for their craft in the eCommerce space is the reason we're projected to grow more than 10x in the next few years. This growth is supported by investors whose portfolios include Facebook, Google, Jet.com, StitchFix, and Lululemon. We do our best work when we're surrounded by people who are insatiably curious, agile, and who thrive in collaborative, check-your-ego-at-the-door working environments. Sound like you? We'd love to chat.

Thrasio Tech Team: The mission of the technology team is to provide the superpower infrastructure that enables Thrasio to operate the most efficient, innovative, and effective next-generation consumer products company in the world. If you are interested in working with large data systems, applying data science and predictive analytics, or building powerful interfaces for internal business customers, you will find those opportunities at Thrasio.

The Role: The Data Connectivity Team designs, builds, and maintains the integrated platform that securely procures and links critical business data from disparate internal and external sources. This involves assimilating all structured, unstructured, and semi-structured data and normalizing it to a form conducive to Thrasio's business needs (think supply chain, marketing, finance).
We are the first mile in Thrasio's data engine and enable delivery of business value from data. The Data Engineering Lead will be responsible for defining, maintaining, and updating the core business entities, relationships, and data that run Thrasio's data systems outside of the enterprise software.

Key Responsibilities Include:
- Work with other architects and engineers to define, execute, and update core data systems while maintaining a high level of availability and transactional correctness.
- Provide technical leadership, in coordination with data analytics and product management, to define, deliver, and update the entities, entity relationships, and their representation as tables, constraints, and stored procedures in the Thrasio data systems.
- Help define the future technical direction for the data systems in collaboration with senior management, product management, and stakeholders.

What you bring to the party:
- 10+ years of information and database architecture expertise
- Demonstrated expertise in information architecture, data engineering, and data warehousing
- Strong familiarity with analytics and data science, and reasonable familiarity with all related areas
- Excellent quantitative modeling, statistical analysis, and problem-solving skills
- Deep experience with SQL, including advanced techniques and stored procedures; deep knowledge of data modeling across various databases
- Demonstrated data management experience with at least one (preferably two) major SQL database platforms: MySQL, Postgres, Redshift, Snowflake, Oracle, SQL Server
- Deep experience with at least one cloud platform: Azure, Amazon, or Google.
Thrasio uses Amazon, so experience there is a plus.
- Experience with the concepts of ETL, data pipelines, and distributed systems
- Extensive experience in data warehousing, analytics, and business intelligence, with a focus on modernized cloud-based platforms, platform-as-a-service databases, and analytics tools
- Proven ability to collaborate, build relationships, and influence individuals at all levels in a matrix-management environment (as well as external vendors and service providers) to ensure that segregated and overlapping roles are identified and coordinated
- Ability to present complicated analytical methodology and results to a non-technical audience
- Direct experience managing cross-functional teams in a matrixed organization
- Experience working in a fast-paced, high-tech, customer-obsessed environment
- Experience with Amazon Web Services (Redshift, S3, EC2, EMR, etc.) and industry-standard data warehousing technologies (Snowflake, Spark, Apache Airflow, etc.)
- Related experience in an e-commerce organization with a recurring revenue model is a strong plus

Nice to Have, but Not Required:
- Experience with Celery
- Experience using data transformation tools such as pandas, PySpark, etc.

THRASIO IS PROUD TO BE AN EQUAL OPPORTUNITY EMPLOYER AND CONSIDERS ALL QUALIFIED APPLICANTS FOR EMPLOYMENT WITHOUT REGARD TO RACE, COLOR, RELIGION, SEX, GENDER, SEXUAL ORIENTATION, GENDER IDENTITY, ANCESTRY, AGE, OR NATIONAL ORIGIN. FURTHER, QUALIFIED APPLICANTS WILL NOT BE DISCRIMINATED AGAINST ON THE BASIS OF DISABILITY, PROTECTED CLASSES, OR PROTECTED VETERAN STATUS. THRASIO PARTICIPATES IN E-VERIFY.
About Us: Helical IT, based out of Hyderabad, is a software company that specializes in Open Source Data Warehousing & Business Intelligence, servicing clients in various domains such as Manufacturing, HR, Energy, Insurance, Social Media Analytics, E-commerce, Travel, etc.

Job Description:
- Hands-on experience with AWS and AWS Glue (mandatory)
- Demonstrated strength in data modeling, ETL development, and data warehousing
- Hands-on experience using big data technologies (Hadoop, Hive, HBase, Spark, etc.); Apache Spark mandatory
- Hands-on experience using Spark and SQL
- Hands-on experience using a programming language: Scala, Python, R, or Java (any one)
- Strong database knowledge
- Proven success in communicating with users, other technical teams, and senior management to collect requirements and describe data modeling decisions and data engineering strategy
- Understanding of Agile development

Nice to Have:
- Experience using business intelligence reporting tools
- Experience with AWS QuickSight
- Understanding of databases: Postgres, SQL Server, Cassandra, S3, Hadoop
- Performance tuning of Spark jobs
- Knowledge of any BI tool, such as Tableau, Jasper, Pentaho, or Helical Insight

Skills and Qualification:
- BE, B.Tech, or MS degree in Computer Science, Engineering, or a related subject
- 2+ years of experience
- Ability to work independently
- Good written and oral communication skills
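The ETL development this posting centers on follows an extract-transform-load pattern. As a minimal sketch of that pattern using only the Python standard library (the data, table, and column names are hypothetical; a production pipeline would use AWS Glue or Spark rather than in-memory sqlite3):

```python
import sqlite3

# Extract: raw rows as they might arrive from a source system (hypothetical data)
raw_orders = [
    ("o-1", "  WIDGET ", "12.50"),
    ("o-2", "gadget", "7.00"),
    ("o-3", "  widget", "3.25"),
]

# Transform: normalize product names and cast prices to numbers
clean = [(oid, name.strip().lower(), float(price)) for oid, name, price in raw_orders]

# Load: write into a warehouse-style table, then aggregate with SQL
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id TEXT, product TEXT, price REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", clean)

totals = dict(con.execute("SELECT product, SUM(price) FROM orders GROUP BY product"))
# totals maps each normalized product name to its revenue
```

The same three stages scale up directly: Glue or Spark jobs replace the transform step, and a warehouse such as Redshift replaces the sqlite3 load target.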
Must-Have:
- 6+ years of experience in solving data science problems using machine learning, data mining algorithms, and big data tools
- Strong experience with advanced SQL and good experience in the big data ecosystem: Hive/Pig/MapReduce, Spark
- Experience in delivering at least one product, with involvement in business problem identification, proposing and evaluating solutions, identifying required data sources, building a data pipeline, visualizing the outputs, and taking actions based on the data outputs
- Strong experience with at least one programming language, e.g. Java, Python, R (exposure to multiple is a plus)
- Strong experience in delivering data science projects leveraging cloud infrastructure
- Experience with the Agile framework
- Experience in leading a data science team is a plus
- Highly passionate about making an impact on business using data science
- Believes in continuous learning and sharing knowledge with peers
What is the role's objective?
- Assemble large, complex data sets that meet functional and non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS 'big data' technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Work with stakeholders, including the Executive, Product, Data, and Design teams, to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data science team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.

What skills do you need to possess?
- Advanced working SQL knowledge and experience working with relational databases and query authoring, as well as working familiarity with a variety of databases.
- Experience building and optimizing 'big data' data pipelines, architectures, and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured data sets.
- Experience building processes supporting data transformation, data structures, metadata, dependency, and workload management.
- A successful history of manipulating, processing, and extracting value from large disconnected data sets.
- Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
- Experience with stream-processing systems: Storm, Spark Streaming, etc.
- Experience with object-oriented or functional scripting languages: Python, Java, C++, Scala, etc.
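The workflow managers named above (Azkaban, Luigi, Airflow) all model a pipeline as a directed acyclic graph of tasks and run each task only after its dependencies finish. The core idea reduces to a topological ordering; a stdlib sketch with hypothetical task names (Airflow would express the same graph as operators in a DAG file):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# A pipeline as a DAG: each task maps to the set of tasks it depends on
# (task names are illustrative, not from any real pipeline)
pipeline = {
    "extract": [],
    "clean": ["extract"],
    "aggregate": ["clean"],
    "load_warehouse": ["aggregate"],
    "refresh_dashboard": ["load_warehouse"],
}

# static_order() yields a valid execution order respecting every dependency
order = list(TopologicalSorter(pipeline).static_order())
```

Real schedulers add retries, backfills, and parallel execution of independent branches on top of exactly this ordering step.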
Job Description:
- Help build a Data Science team which will be engaged in researching, designing, implementing, and deploying full-stack, scalable data analytics and machine learning solutions to challenge various business issues.
- Model complex algorithms, discover insights, and identify business opportunities through the use of algorithmic, statistical, visualization, and mining techniques.
- Translate business requirements into quick prototypes and enable the development of big data capabilities driving business outcomes.
- Responsible for data governance and defining data collection and collation guidelines.
- Must be able to advise, guide, and train junior data engineers in their jobs.

Must Have:
- 4+ years of experience in a leadership role as a Data Scientist
- Preferably from the Retail, Manufacturing, or Healthcare industry (not mandatory)
- Willing to work from scratch and build up a team of Data Scientists
- Open to taking up challenges with end-to-end ownership
- Confident, with excellent communication skills, and a good decision maker
Responsibilities: The Machine Learning & Deep Learning Software Engineer (with expertise in Computer Vision) will be an early member of a growing team, with responsibilities for designing and developing highly scalable machine learning solutions that impact many areas of our business. The individual in this role will help design and develop neural network (especially Convolutional Neural Network) and ML solutions based on our reference architecture, which is underpinned by big data and cloud technology, micro-service architecture, and high-performing compute infrastructure. Typical daily activities include contributing to all phases of algorithm development, including ideation, prototyping, design, development, and production implementation.

Required Skills: An ideal candidate will have a background in software engineering and data science, with expertise in machine learning algorithms, statistical analysis tools, and distributed systems.
- Experience in building machine learning applications, and broad knowledge of machine learning APIs, tools, and open-source libraries
- Strong coding skills and fundamentals in data structures, predictive modeling, and big data concepts
- Experience in designing full-stack ML solutions in a distributed computing environment
- Experience working with Python, TensorFlow, Keras, scikit-learn, pandas, NumPy, Azure, AWS GPU instances
- Excellent communication skills with multiple levels of the organization
- Image CNN, image processing, Mask R-CNN (MRCNN), and Faster R-CNN (FRCNN) experience is a must.
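The convolutional layer at the heart of the CNN architectures this role works with reduces to sliding a small kernel over an image. A minimal NumPy sketch of that core operation (the image and kernel values are purely illustrative; frameworks like TensorFlow or Keras provide optimized, batched versions):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid (no-padding) 2-D cross-correlation: the core op of a conv layer."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise product of the kernel with each image patch, summed
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny image with a vertical edge down the middle (dark left, bright right)
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

# A vertical-edge kernel: positive response where left column exceeds right,
# negative where the image brightens left-to-right
kernel = np.array([[1, -1],
                   [1, -1]], dtype=float)

edges = conv2d(image, kernel)  # strong (negative) response only at the edge
```

Learned CNNs stack many such kernels, with the kernel weights fitted by backpropagation instead of hand-chosen.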
Hands-on development/maintenance experience in Tableau:
- Developing, maintaining, and managing advanced reporting, analytics, dashboards, and other BI solutions using Tableau
- Reviewing and improving existing Tableau dashboards and data models/systems, and collaborating with teams to integrate new systems
- Providing support and expertise to the business community to assist with better utilization of Tableau
- Understanding business requirements, conducting analysis, and recommending solution options for intelligent dashboards in Tableau
- Experience with Data Extraction, Transformation and Load (ETL): knowledge of how to extract, transform, and load data
- Executing SQL data queries across multiple data sources in support of business intelligence reporting needs; formatting query results/reports in various ways
- Participating in QA testing, liaising with other project team members, and being responsive to the client's needs, all with an eye for detail in a fast-paced environment
- Performing and documenting data analysis, data validation, and data mapping/design

Key Performance Indicators (how performance will be measured: indicators, activities): KPIs will be outlined in detail in the goal sheet.

Ideal Background (minimum and desirable education and experience level):
- Education, minimum: Graduation, preferably in Science
- Experience requirement, minimum: 2-3 years' relevant work experience in the field of reporting and data analytics using Tableau
- Tableau certifications would be preferred
- Work experience in the regulated medical device/pharmaceutical industry would be an added advantage, but is not mandatory
- Languages, minimum: English (written and spoken)

Specific Professional Competencies (any other soft/technical/professional knowledge and skills requirements): Extensive experience in developing, maintaining, and managing Tableau-driven dashboards and analytics, and working knowledge of Tableau administration/architecture.
- A solid understanding of SQL, relational databases, and normalization
- Proficiency in the use of query and reporting analysis tools
- Competency in Excel (macros, pivot tables, etc.)
- Degree in Mathematics, Computer Science, Information Systems, or a related field
The Job: The Architect, Machine Learning and Artificial Intelligence (including Computer Vision) will grow and lead a team of talented Machine Learning (ML), Computer Vision (CV), and Artificial Intelligence (AI) researchers and engineers to develop innovative machine learning algorithms, scalable ML systems, and AI applications for Racetrack. This role will be focused on developing and deploying personalization and recommender systems, search, experimentation, audience, and content AI solutions to drive user experience and growth.

The Daily:
- Develop innovative data science solutions that utilize machine learning and deep learning algorithms and statistical and quantitative modelling approaches to support product, engineering, content, and marketing initiatives.
- Build and lead a world-class team of ML and AI scientists and engineers.
- Be a hands-on leader: mentor the team in the latest machine learning and deep learning approaches, and introduce new technologies and processes.
- Single-handedly manage the MVP and PoCs.
- Work with ML engineers to design solution architectures and develop scalable machine learning systems to accelerate the learning cycle.
- Identify data science opportunities that deliver business value.
- Develop the ML/AI/CV roadmap and educate both internal and external stakeholders at all levels to drive implementation and measurement.
- Hands-on experience in image processing for the auto industry; BFSI domain knowledge is a plus.
- Provide thought leadership to enable ML/AI applications.
- Manage product priorities and ensure timely delivery.
- Develop and evangelize best practices for scoping, building, validating, deploying, and monitoring ML/AI products.
- Prepare and present ML modelling results and analytical insights that help drive the business to senior leadership.
The Essentials:
- 8+ years of work experience in Machine Learning, AI, and Data Science, with a proven track record of driving innovation and business impact
- 4+ years of managing a team of data scientists and ML and AI researchers and engineers
- Strong machine learning, deep learning, and statistical modelling expertise, such as causal inference modelling, ensembles, neural networks, reinforcement learning, NLP, and computer vision
- Advanced knowledge of SQL and experience with big data platforms (AWS, Snowflake, Spark, Google Cloud, etc.)
- Proficiency in machine learning and deep learning languages and platforms (Python, R, TensorFlow, Keras, PyTorch, MXNet, etc.)
- Experience in deploying machine learning algorithms and advanced modelling solutions
- Experience in developing advanced analytics and ML infrastructure and systems
- Self-starter, self-motivated, with the proven ability to deliver results in a fast-paced, high-energy environment
- Strong communication skills and the ability to explain complex analysis and algorithms to a non-technical audience
- Works effectively with cross-functional teams to build trusted partnerships
- Working experience in the digital media and entertainment industry preferred
- Experience with Agile methodologies preferred
About us: DataWeave provides Retailers and Brands with "Competitive Intelligence as a Service" that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.

Data Science @ DataWeave: We, the Data Science team at DataWeave (called Semantics internally), build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are: innovation, business awareness, long-term thinking, and pushing the envelope. We are a fast-paced lab within the org, applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.

How we work: It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems that there are. We are in the business of making sense of messy public data on the web. At serious scale!

What do we offer?
- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- The ability to see the impact of your work and the value you're adding to our customers almost immediately.
- The opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. A fun work environment. A flat hierarchy. Organization-wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors on the team.
- Last but not least, competitive salary packages and fast-paced growth opportunities.

Who are we looking for? The ideal candidate is a strong software developer or a researcher with experience building and shipping production-grade data science applications at scale.
Such a candidate has a keen interest in liaising with the business and product teams to understand a business problem and translate it into a data science problem. You are also expected to develop capabilities that open up new business productization opportunities. We are looking for someone with 6+ years of relevant experience working on problems in NLP or Computer Vision, with a Master's degree (PhD preferred).

Key problem areas:
- Preprocessing and feature extraction on noisy and unstructured data, both text and images.
- Keyphrase extraction, sequence labeling, and entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, and sentiment analysis.
- Image-based clustering and classification, segmentation, object detection, extracting text from images, generative models, and recommender systems.
- Ensemble approaches for all the above problems using multiple text- and image-based techniques.

Relevant set of skills:
- A strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms, and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills in multiple programming languages, with experience building production-grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real-world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc.).
- Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including numpy, matplotlib, scikit-learn, Keras, PyTorch, and Tensorflow.
- Use of the command line like a pro. Proficiency in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- A self-starter: someone who thrives in fast-paced environments with minimal 'management'.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on in your spare time. Show off projects you have hosted on GitHub.

Role and responsibilities:
- Understand the business problems we are solving. Build data science capabilities that align with our product strategy.
- Conduct research. Do experiments. Quickly build throwaway prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models, in an iterative manner, that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end-to-end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take the initiative to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrance to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of the latest research in deep learning, NLP, Computer Vision, and other relevant areas.
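The document clustering and data normalization problems listed above usually start from a similarity measure over bag-of-words vectors. A stdlib-only sketch of cosine similarity over hypothetical retail product titles (production systems would use TF-IDF weighting and libraries like scikit-learn instead of raw term counts):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

# Hypothetical product titles of the messy-public-web-data kind described above
docs = {
    "d1": "red cotton shirt slim fit",
    "d2": "red shirt cotton regular fit",
    "d3": "stainless steel water bottle",
}
vecs = {k: Counter(text.split()) for k, text in docs.items()}

# The two shirt listings should score far higher than shirt vs. bottle,
# which is the signal a clustering or matching system builds on
sim_shirts = cosine(vecs["d1"], vecs["d2"])
sim_cross = cosine(vecs["d1"], vecs["d3"])
```

Thresholding or clustering on such pairwise scores is the simplest version of the product-matching and deduplication work the posting describes.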