Must Have Skills:
- Solid knowledge of DWH, ETL and Big Data concepts
- Excellent SQL skills (with knowledge of SQL analytic functions)
- Working experience with any ETL tool, e.g. SSIS / Informatica
- Working experience with any Azure or AWS Big Data tools
- Experience implementing data jobs (batch / real-time streaming)
- Excellent written and verbal communication skills in English; self-motivated, with a strong sense of ownership and a readiness to learn new tools and technologies
- Experience with PySpark / Spark SQL
- AWS Data Tools (AWS Glue, AWS Athena)
- Azure Data Tools (Azure Databricks, Azure Data Factory)
- Knowledge of Azure Blob, Azure File Storage, AWS S3, Elasticsearch / Redis Search
- Domain/functional knowledge (across pricing, promotions and assortment)
- Implementation experience with schema and data validator frameworks (Python / Java / SQL)
- Knowledge of DQS and MDM
- Work independently on ETL / DWH / Big Data projects
- Gather and process raw data at scale.
- Design and develop data applications using selected tools and frameworks as required and requested.
- Read, extract, transform, stage and load data to selected tools and frameworks as required and requested.
- Perform tasks such as writing scripts, web scraping, calling APIs, writing SQL queries, etc.
- Work closely with the engineering team to integrate your work into our production systems.
- Process unstructured data into a form suitable for analysis.
- Analyse processed data.
- Support business decisions with ad hoc analysis as needed.
- Monitor data performance and modify infrastructure as needed.
Responsibility: Smart Resource, having excellent communication skills
We are currently seeking a talented and highly motivated Data Analyst to lead the development of our discovery and support platform. The successful candidate will join a small, global team of data-focused associates who have successfully built and maintained a best-in-class, traditional, Kimball-based data warehouse founded on SQL Server, along with Qlik Sense-based BI dashboards. The successful candidate will lead the conversion, managing our master data set and developing reports and analytics dashboards.
To do well in this role you need a very fine eye for detail, experience as a data analyst, and a deep understanding of the popular data analysis tools and databases.
Specific responsibilities include:
- Managing master data, including creation, updates, and deletion.
- Managing users and user roles.
- Providing quality assurance of imported data, working with quality assurance analysts if necessary.
- Commissioning and decommissioning of data sets.
- Processing confidential data and information in accordance with applicable compliance requirements.
- Helping develop reports and analysis.
- Managing and designing the reporting environment, including data sources, security, and metadata.
- Supporting the data warehouse in identifying and revising reporting requirements.
- Supporting initiatives for data integrity and normalization.
- Assessing, testing, and implementing new or upgraded software, and assisting with strategic decisions on new systems.
- Generating reports from single or multiple systems.
- Troubleshooting the reporting database environment and reports.
- Evaluating changes and updates to source production systems.
- Training end-users on new reports and dashboards.
- Providing technical expertise in data storage structures, data mining, and data cleansing.
- Master’s Degree (or equivalent experience) in computer science, data science or a scientific field that has relevance to healthcare in the United States.
- Work experience as a data analyst or in a related field for more than 5 years.
- Proficiency in statistics, data analysis, data visualization and research methods.
- Strong SQL and Excel skills with ability to learn other analytic tools.
- Experience with BI dashboard tools like Qlik Sense, Tableau, Power BI.
- Experience with AWS services like EC2, S3, Athena and QuickSight.
- Ability to work with stakeholders to assess potential risks.
- Ability to analyze existing tools and databases and provide software solution recommendations.
- Ability to translate business requirements into non-technical, lay terms.
- High-level experience in methodologies and processes for managing large-scale databases.
- Demonstrated experience in handling large data sets and relational databases.
- Understanding of addressing and metadata standards.
Job Title – Data Scientist (Forecasting)
Anicca Data is seeking a Data Scientist (Forecasting) who is motivated to apply his/her/their skill set to solve complex and challenging problems. The role will focus on applying deep learning models to real-world applications. The candidate should have experience in training and testing deep learning architectures, and is expected to work on existing codebases or write optimized code at Anicca Data. The ideal addition to our team is self-motivated, highly organized, and a team player who thrives in a fast-paced environment, with the ability to learn quickly and work independently.
Job Location: Remote (for time being) and Bangalore, India (post-COVID crisis)
- 3+ years of experience in a Data Scientist role
- Bachelor's/Master's degree in Computer Science, Engineering, Statistics, Mathematics, or a similar quantitative discipline. A Ph.D. will add merit to the application
- Experience with large data sets, big data, and analytics
- Exposure to statistical modeling, forecasting, and machine learning. Deep theoretical and practical knowledge of deep learning, machine learning, statistics, probability, time series forecasting
- Training Machine Learning (ML) algorithms in areas of forecasting and prediction
- Experience in developing and deploying machine learning solutions in a cloud environment (AWS, Azure, Google Cloud) for production systems
- Research and enhance existing in-house, open-source models, integrate innovative techniques, or create new algorithms to solve complex business problems
- Experience in translating business needs into problem statements, prototypes, and minimum viable products
- Experience managing complex projects including scoping, requirements gathering, resource estimations, sprint planning, and management of internal and external communication and resources
- Write C++ and Python code along with TensorFlow, PyTorch to build and enhance the platform that is used for training ML models
- Experience working on forecasting projects, with both classical and ML models
- Experience with training time series forecasting methods like Moving Average (MA) and Autoregressive Integrated Moving Average (ARIMA), as well as Neural Network (NN) models such as feed-forward NN and nonlinear autoregressive models
- Strong background in forecasting accuracy drivers
- Experience in Advanced Analytics techniques such as regression, classification, and clustering
- Ability to explain complex topics in simple terms, ability to explain use cases and tell stories
About the Role:
Freight Tiger is growing exponentially, and technology is at the centre of it. Our Engineers love solving complex industry problems by building modular and scalable solutions using cutting-edge technology. Your peers will be an exceptional group of Software Engineers, Quality Assurance Engineers, DevOps Engineers, and Infrastructure and Solution Architects.
This role is responsible for developing data pipelines and data engineering components to support strategic initiatives and ongoing business processes. This role works with leads, analysts, and data scientists to understand requirements, develop technical solutions, and ensure the reliability and performance of the data engineering solutions.
This role provides an opportunity to directly impact business outcomes for sales, underwriting, claims and operations functions across multiple use cases by providing them data for their analytical modelling needs.
- Create and maintain a data pipeline.
- Build and deploy ETL infrastructure for optimal data delivery.
- Work with various product, design and executive teams to troubleshoot data-related issues.
- Create tools for data analysts and scientists to help them build and optimise the product.
- Implement systems and processes for data access controls and guarantees.
- Distil the knowledge from experts in the field outside the org and optimise internal data systems.
- Should have 5+ years of relevant experience.
- Strong analytical skills.
- Degree in Computer Science, Statistics, Informatics, Information Systems.
- Strong project management and organisational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- SQL guru with hands-on experience on various databases.
- Experience with NoSQL databases like Cassandra and MongoDB.
- Experience with Snowflake and Redshift.
- Experience with tools like Airflow and Hevo.
- Experience with Hadoop, Spark, Kafka, and Flink.
- Programming experience in Python, Java, and Scala.
We are looking for a Machine Learning Engineer for one of our premium clients.
Experience: 2-9 years
Python, PySpark, the Python Scientific Stack; MLFlow, Grafana, Prometheus for machine learning pipeline management and monitoring; SQL, Airflow, Databricks, our own open-source data pipelining framework called Kedro, Dask/RAPIDS; Django, GraphQL and ReactJS for horizontal product development; container technologies such as Docker and Kubernetes, CircleCI/Jenkins for CI/CD, cloud solutions such as AWS, GCP, and Azure as well as Terraform and Cloudformation for deployment
Fresh Prints (https://www.freshprints.com/home) is a New York-based custom apparel startup. We find incredible students and give them the working capital, training, and support to build the business at their schools. We have 400+ students who will do $15 million in sales over the next 12 months.
You’ll be focused on the next $50 million. Data is a product that can be used to drive team behaviors and generate revenue growth.
- How do we use our data to drive up account value?
- How do we develop additional revenue channels?
- How do we increase operational efficiency?
- How do we usher in the next stage at Fresh Prints?
Those are the questions the members of our cross-functional Growth Team work on every day. They do so not as data analysts, developers, or marketers, but as entrepreneurs, determined to drive the business forward.
You’d be our first dedicated Data Engineer. As such, you’ll touch every aspect of data at Fresh Prints. You’ll work with the rest of the Data Science team to sanitize the data systems, automate pipelines, and build test cases for post-batch and regular quality evaluation.
You will develop alert systems for fire drill activities, build an immediate cleanup plan, and document the evolution of the ingestion process. You will also assist in mining that data for insights and building visualizations that help the entire company utilize those insights.
This role reports to Abilash Reddy (https://www.linkedin.com/in/abilashls/), our Head of Data & CRM, and you will work with an extremely talented and professional team.
- Work closely with every team at Fresh Prints to uncover ways in which data shapes how effective they are
- Design, build, and operationalize large-scale enterprise data solutions and applications using GCP data pipeline and automation services
- Create, restructure, and manage large datasets, files, and systems
- Take complete ownership of the implementation of data pipelines and hybrid data systems along with test cases as we scale
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
- Monitor and maintain overall Tableau server health
- 3+ years of experience in Cloud data ingestion and automation from a variety of sources using GCP services and building hybrid data architecture on top
- 3+ years of strong experience in SQL & Python
- 3+ years of experience with Excel and/or Google Sheets
- 2+ years of experience with Tableau Server or Tableau Online
- A successful history of manipulating, processing, and extracting value from large disconnected datasets
- Experience in an agile development environment
- Perfect English fluency is a must
- Able to connect the dots between data and business value
- Strong attention to detail
- Proactive. You believe it’s always on you to make sure anything you do is a success
- In love with a challenge. You revel in solving problems and want a job that pushes you out of your comfort zone
- Goal-oriented. You’re incredibly ambitious. You’re dedicated to a long-term vision of who you are and where you want to go
- Open to change. You’re inspired by the endless ways in which everything we do can always be improved
- Calm under pressure. You have a sense of urgency but channel it into productively working through any issues
- Google Cloud certified (preferred)
- A bachelor's degree in computer science or information management is a strong plus
Compensation & Benefits
- Competitive salary
- Health insurance (India & Philippines)
- Learning opportunities
- Working in a great culture
- This is a permanent WFH role, could be based in India or the Philippines. Candidates from other countries may be considered
- We will have an office in Hyderabad, India but working from the office is completely optional
- 3:30 PM to 11:30 PM IST
Fresh Prints is an equal employment opportunity employer and promotes diversity, actively encouraging people of all backgrounds, ages, LGBTQ+, and those with disabilities to apply.
Should design and operate data pipelines.
Build and manage analytics platform using Elasticsearch, Redshift, MongoDB.
Strong programming fundamentals in data structures and algorithms.
- Create, design and develop data models
- Prepare plans for all ETL (Extract/Transformation/Load) procedures and architectures
- Validate results and create business reports
- Monitor and tune data loads and queries
- Develop and prepare a schedule for a new data warehouse
- Analyze large databases and recommend appropriate optimization for the same
- Administer all requirements and design various functional specifications for data
- Provide support to the Software Development Life cycle
- Prepare various code designs and ensure efficient implementation of the same
- Evaluate all code and ensure the quality of all project deliverables
- Monitor data warehouse work and provide subject matter expertise
- Hands-on experience with BI practices, data structures, data modeling, and SQL
- Minimum 1 year of experience in PySpark
Data Analyst Job Duties
Data analyst responsibilities include conducting full lifecycle analysis to include requirements, activities and design. Data analysts will develop analysis and reporting capabilities. They will also monitor performance and quality control plans to identify improvements.
Interpret data, analyze results using statistical techniques and provide ongoing reports
Develop and implement databases, data collection systems, data analytics and other strategies that optimize statistical efficiency and quality
Acquire data from primary or secondary data sources and maintain databases/data systems
Identify, analyze, and interpret trends or patterns in complex data sets
Filter and “clean” data by reviewing computer reports, printouts, and performance indicators to locate and correct code problems
Work with management to prioritize business and information needs
Locate and define new process improvement opportunities
Proven working experience as a Data Analyst or Business Data Analyst
Technical expertise regarding data models, database design development, data mining and segmentation techniques
Knowledge of statistics and experience using statistical packages for analyzing datasets (Excel, SPSS, SAS etc)
Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy.
Adept at queries, report writing and presenting findings
BS in Mathematics, Economics, Computer Science, Information Management or Statistics
High-Level Scope of Work:
- Work with the AI / Analytics team to prioritize identified machine learning use cases for development and rollout.
- Meet and understand current retail / marketing requirements and how an AI/ML solution will address and automate the decision process.
- Develop AI/ML programs using the Dataiku solution and Python or open-source technology, with a focus on delivering high-quality and accurate ML prediction models.
- Gather additional and external data sources to support the AI/ML model as needed.
- Support the ML model and fine-tune it to ensure consistently high accuracy.
- Examples of use cases (Customer Segmentation, Product Recommendation, Price Optimization, Retail Customer Personalization Offers, Next Best Location for Business Est, CCTV Computer Vision, NLP and Voice Recognition Solutions)
Required technology expertise:
- Deep knowledge and understanding of machine learning algorithms (supervised / unsupervised learning / deep learning models)
- 5+ years of hands-on experience with the Python and R statistical programming languages
- Strong database development knowledge using SQL and PL/SQL
- Must have experience using a commercial data science solution, particularly Dataiku (Alteryx, SAS, Azure ML, Google ML, Oracle ML is a plus)
- Strong hands-on experience with Big Data solution architecture and optimization for AI/ML workloads
- Hands-on experience with data analytics and BI tools, particularly Oracle OBIEE and Power BI
- Have implemented and developed at least 3 successful AI/ML projects with tangible business outcomes in a retail-focused industry
- 5+ years of experience in the retail industry and customer-focused business
- Ability to communicate with business owners and stakeholders to understand their current issues and provide machine learning solutions accordingly
- Bachelor's or Master's degree in Data Science, Artificial Intelligence, or Computer Science
- Certified as a Data Scientist or Machine Learning expert
DataWeave provides Retailers and Brands with “Competitive Intelligence as a Service” that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.
Data Science at DataWeave
We, the Data Science team at DataWeave (called Semantics internally), build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are: innovation, business awareness, long-term thinking, and pushing the envelope. We are a fast-paced lab within the org applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.
How do we work?
It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems that there are. We are in the business of making sense of messy public data on the web. At serious scale!
What do we offer?
- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- Ability to see the impact of your work and the value you're adding to our customers almost immediately.
- Opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. Fun work environment. A flat hierarchy. Organization wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors in the team.
- Last but not least, competitive salary packages and fast-paced growth opportunities.
Who are we looking for?
The ideal candidate is a strong software developer or a researcher with experience building and shipping production grade data science applications at scale. Such a candidate has keen interest in liaising with the business and product teams to understand a business problem, and translate that into a data science problem. You are also expected to develop capabilities that open up new business productization opportunities.
We are looking for someone with 6+ years of relevant experience working on problems in NLP or Computer Vision with a Master's degree (PhD preferred).
Key problem areas
- Preprocessing and feature extraction from noisy and unstructured data, both text and images.
- Keyphrase extraction, sequence labeling, entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, sentiment analysis.
- Image based clustering and classification, segmentation, object detection, extracting text from images, generative models, recommender systems.
- Ensemble approaches for all the above problems using multiple text and image based techniques.
Relevant set of skills
- Have a strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills in multiple programming languages with experience building production-grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc). Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including NumPy, Matplotlib, scikit-learn, Keras, PyTorch, TensorFlow.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Be a self-starter—someone who thrives in fast paced environments with minimal ‘management’.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of your projects you have hosted on GitHub.
Role and responsibilities
- Understand the business problems we are solving. Build data science capabilities that align with our product strategy.
- Conduct research. Do experiments. Quickly build throwaway prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models in an iterative manner that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end to end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take the initiative to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrancy to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of the latest research in deep learning, NLP, Computer Vision, and other relevant areas.