About Response Informatics
Job Description: Data Scientist
At Propellor.ai, we derive insights that allow our clients to make scientific decisions. We believe in demanding more from the fields of Mathematics, Computer Science, and Business Logic. Combine these and we show our clients a 360-degree view of their business. In this role, the Data Scientist will be expected to work on Procurement problems along with a team-based across the globe.
We are a Remote-First Company.
Read more about us here: https://www.propellor.ai/consulting" target="_blank">https://www.propellor.ai/consulting
What will help you be successful in this role
- High Energy
- Passion to learn
- High sense of ownership
- Ability to work in a fast-paced and deadline-driven environment
- Loves technology
- Highly skilled at Data Interpretation
- Problem solver
- Ability to narrate the story to the business stakeholders
- Generate insights and the ability to turn them into actions and decisions
Skills to work in a challenging, complex project environment
- Need you to be naturally curious and have a passion for understanding consumer behavior
- A high level of motivation, passion, and high sense of ownership
- Excellent communication skills needed to manage an incredibly diverse slate of work, clients, and team personalities
- Flexibility to work on multiple projects and deadline-driven fast-paced environment
- Ability to work in ambiguity and manage the chaos
- Analyze data to unlock insights: Ability to identify relevant insights and actions from data. Use regression, cluster analysis, time series, etc. to explore relationships and trends in response to stakeholder questions and business challenges.
- Bring in experience for AI and ML: Bring in Industry experience and apply the same to build efficient and optimal Machine Learning solutions.
- Exploratory Data Analysis (EDA) and Generate Insights: Analyse internal and external datasets using analytical techniques, tools, and visualization methods. Ensure pre-processing/cleansing of data and evaluate data points across the enterprise landscape and/or external data points that can be leveraged in machine learning models to generate insights.
- DS and ML Model Identification and Training: Identity, test, and train machine learning models that need to be leveraged for business use cases. Evaluate models based on interpretability, performance, and accuracy as required. Experiment and identify features from datasets that will help influence model outputs. Determine what models will need to be deployed, data points that need to be fed into models, and aid in the deployment and maintenance of models.
An enthusiastic individual with the following skills. Please do not hesitate to apply if you do not match all of them. We are open to promising candidates who are passionate about their work, fast learners and are team players.
- Strong experience with machine learning and AI including regression, forecasting, time series, cluster analysis, classification, Image recognition, NLP, Text Analytics and Computer Vision.
- Strong experience with advanced analytics tools for Object-oriented/object function scripting using languages such as Python, or similar.
- Strong experience with popular database programming languages including SQL.
- Strong experience in Spark/Pyspark
- Experience in working in Databricks
What are the company benefits you get, when you join us as?
- Permanent Work from Home Opportunity
- Opportunity to work with Business Decision Makers and an internationally based team
- The work environment that offers limitless learning
- A culture void of any bureaucracy, hierarchy
- A culture of being open, direct, and with mutual respect
- A fun, high-caliber team that trusts you and provides the support and mentorship to help you grow
- The opportunity to work on high-impact business problems that are already defining the future of Marketing and improving real lives
Whom will you work with?
You will closely work with other Senior Data Scientists and Data Engineers.
Immediate to 15-day Joiners will be preferred.
Proficiency in Linux.
Must have SQL knowledge and experience working with relational databases,
query authoring (SQL) as well as familiarity with databases including Mysql,
Mongo, Cassandra, and Athena.
Must have experience with Python/Scala.
Must have experience with Big Data technologies like Apache Spark.
Must have experience with Apache Airflow.
Experience with data pipeline and ETL tools like AWS Glue.
Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
This role requires a person to support business charters & accompanying products by aligning with the Analytics
Manager’s vision, understanding tactical requirements and helping in successful execution. Split would be approx.
70% management + 30% individual contributor. Responsibilities include
- Understand business needs and objectives.
- Refine use cases and plan iterations and deliverables - able to pivot as required.
- Estimate efforts and conduct regular task updates to ensure timeline adherence.
- Set and manage stakeholder expectations as required
- Help BA and SBA resources with requirement gathering and final presentations.
- Resolve blockers regarding technical challenges and decision-making.
- Check final deliverables for correctness and review codes, along with Manager.
KPIs and metrics
- Orchestrate metrics building, maintenance, and performance monitoring.
- Owns and manages data models, data sources, and data definition repo.
- Makes low-level design choices during execution.
- Help Analytics Manager during regular one-on-ones + check-ins + recruitment.
- Provide technical guidance whenever required.
- Improve benchmarking and decision-making skills at execution-level.
- Train and get new resources up-to-speed.
- Knowledge building (methodologies) to better position the team for complex problems.
- Upstream to document and discuss execution challenges, process inefficiencies, and feedback loops.
- Downstream and parallel for context-building, mentoring, stakeholder management.
- Analytics : Python / R + SQL + Excel / PPT, Colab notebooks
- Database : PostgreSQL, Amazon Redshift, DynamoDB, Aerospike
- Warehouse : Amazon Redshift
- ETL : Lots of Python + custom-made
- Business Intelligence / Visualization : Metabase + Python/R libraries (location data)
- Deployment pipeline : Docker, Git, Jenkins, AWS Lambda
The SQL Server DBA will be responsible for the implementation, configuration, maintenance, and performance of critical SQL Server RDBMS systems, to ensure the availability and consistent performance of our corporate applications. This is a “hands-on” position requiring solid technical skills, as well as excellent interpersonal and communication skills.
The successful candidate will be responsible for the development and sustainment of the SQL Server Warehouse, ensuring its operational readiness (security, health and performance), executing data loads, and performing data modeling in support of multiple development teams. The data warehouse supports an enterprise application suite of program management tools. Must be capable of working independently and collaboratively.
- Conducting advanced statistical analysis to provide actionable insights, identify trends, and measure performance
- Performing data exploration, cleaning, preparation and feature engineering; in addition to executing tasks such as building a POC, validation/ AB testing
- Collaborating with data engineers & architects to implement and deploy scalable solutions
- Communicating results to diverse audiences with effective writing and visualizations
- Identifying and executing on high impact projects, triage external requests, and ensure timely completion for the results to be useful
- Providing thought leadership by researching best practices, conducting experiments, and collaborating with industry leaders
What you need to have:
- 2-4 year experience in machine learning algorithms, predictive analytics, demand forecasting in real-world projects
- Strong statistical background in descriptive and inferential statistics, regression, forecasting techniques.
- Strong Programming background in Python (including packages like Tensorflow), R, D3.js , Tableau, Spark, SQL, MongoDB.
- Preferred exposure to Optimization & Meta-heuristic algorithm and related applications
- Background in a highly quantitative field like Data Science, Computer Science, Statistics, Applied Mathematics,Operations Research, Industrial Engineering, or similar fields.
- Should have 2-4 years of experience in Data Science algorithm design and implementation, data analysis in different applied problems.
- DS Mandatory skills : Python, R, SQL, Deep learning, predictive analysis, applied statistics
We are looking out for a technically driven "Full-Stack Engineer" for one of our premium client
• Bachelor's degree in computer science or related field; Master's degree is a plus
• 3+ years of relevant work experience
• Meaningful experience with at least two of the following technologies: Python, Scala, Java
• Strong proven experience on distributed processing frameworks (Spark, Hadoop, EMR) and SQL is very
• Commercial client-facing project experience is helpful, including working in close-knit teams
• Ability to work across structured, semi-structured, and unstructured data, extracting information and
identifying linkages across disparate data sets
• Confirmed ability in clearly communicating complex solutions
• Understandings on Information Security principles to ensure compliant handling and management of
• Experience and interest in Cloud platforms such as: AWS, Azure, Google Platform or Databricks
• Extraordinary attention to detail
- Creating, designing and developing data models
- Prepare plans for all ETL (Extract/Transformation/Load) procedures and architectures
- Validating results and creating business reports
- Monitoring and tuning data loads and queries
- Develop and prepare a schedule for a new data warehouse
- Analyze large databases and recommend appropriate optimization for the same
- Administer all requirements and design various functional specifications for data
- Provide support to the Software Development Life cycle
- Prepare various code designs and ensure efficient implementation of the same
- Evaluate all codes and ensure the quality of all project deliverables
- Monitor data warehouse work and provide subject matter expertise
- Hands-on BI practices, data structures, data modeling, SQL skills
5 Years - 10 Years
Must have Skills: SQL
Hard Skills for a Data Warehouse Developer:
Soft Skills for Data Warehouse Developers:
- Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
- Experience in migrating on-premise data warehouses to data platforms on AZURE cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
Azure Synapse or Azure SQL data warehouse
Spark on Azure is available in HD insights and data bricks
What you will get to do:
- Understand the business, define problems and design structured approaches to solve them.
- Perform exploratory analysis on large volumes of data to validate/disregard hypotheses.
- Identify opportunities using insights for product/process improvement.
- Create dashboards and automated reports to track KPIs.
- Manage the reporting of KPIs, present analysis & findings to steer the team’s strategic vision.
- Partner with cross-functional stakeholders (engineering, design, sales and operations teams) on a regular basis to drive product/process changes and improve business intelligence.
- Analyze rich user and transaction data to surface patterns and trends.
- Perform analytical deep-dives to identify problems, opportunities and actions required.
- Process data from disparate sources using SQL, R, Python, or other scripting and statistical tools;
- Perform ad hoc data analysis to remove roadblocks and ensure operations are running smoothly.
What you will need to apply:
- Flair for numbers, strong analytical skills and structured process thinking, attention to detail
- Basic business sense and an understanding of common statistics/analytics techniques
- Advanced Excel Proficiency and SQL (experience of working in Python/R is preferable)
- Strong ownership, drive and the experience of working independently in unstructured environments
- Ability to work closely with cross-functional teams within tight timelines to execute on decisions
- An appreciation for the connection between your work and the outcome (the impact it has on the organization and the experience it delivers to the customers)
- Interview Focus: Puzzles, Guesstimates, Problem Solving, Analytical tool proficiency, Data Interpretation
- Sr. Data Engineer:
Core Skills – Data Engineering, Big Data, Pyspark, Spark SQL and Python
Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred
- Responsible for Data Engineering, Foundry Data Pipeline Creation, Foundry Analysis & Reporting, Slate Application development, re-usable code development & management and Integrating Internal or External System with Foundry for data ingestion with high quality.
- Have good understanding on Foundry Platform landscape and it’s capabilities
- Performs data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
- Defines company data assets (data models), Pyspark, spark SQL, jobs to populate data models.
- Designs data integrations and data quality framework.
- Design & Implement integration with Internal, External Systems, F1 AWS platform using Foundry Data Connector or Magritte Agent
- Collaboration with data scientists, data analyst and technology teams to document and leverage their understanding of the Foundry integration with different data sources - Actively participate in agile work practices
- Coordinating with Quality Engineer to ensure the all quality controls, naming convention & best practices have been followed
Desired Candidate Profile :
- Strong data engineering background
- Experience with Clinical Data Model is preferred
- Experience in
- SQL Server ,Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
- Java and Groovy for our back-end applications and data integration tools
- Python for data processing and analysis
- Cloud infrastructure based on AWS EC2 and S3
- 7+ years IT experience, 2+ years’ experience in Palantir Foundry Platform, 4+ years’ experience in Big Data platform
- 5+ years of Python and Pyspark development experience
- Strong troubleshooting and problem solving skills
- BTech or master's degree in computer science or a related technical field
- Experience designing, building, and maintaining big data pipelines systems
- Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
- Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
- Hands-on in programming languages primarily Python, R, Java, Unix shell scripts
- Hand-on experience in AWS / Azure cloud platform and stack
- Strong in API based architecture and concept, able to do quick PoC using API integration and development
- Knowledge of machine learning and AI
- Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.
Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision