Similar jobs
Big Data Engineer
at a multinational company providing digital automation solutions
- 4 to 7 years of relevant experience as a Big Data Engineer
- Hands-on experience in Scala or Python
- Hands-on experience with major components of the Hadoop ecosystem, such as HDFS, MapReduce, Hive, and Impala.
- Strong programming experience building applications/platforms using Scala or Python.
- Experience implementing Spark RDD transformations and actions for business analysis (a minimal sketch follows this list)
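As a rough illustration of the RDD skill named above, here is a minimal PySpark sketch; the app name, records, and threshold are invented for the example, not taken from the posting.

```python
# Minimal PySpark RDD sketch: transformations (filter, map, reduceByKey)
# followed by an action (collect). All names and data are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-sketch").getOrCreate()
sc = spark.sparkContext

# Hypothetical (order_id, region, amount) records.
orders = sc.parallelize([
    ("o1", "APAC", 120.0),
    ("o2", "EMEA", 75.5),
    ("o3", "APAC", 230.0),
])

# Transformations are lazy: nothing executes until an action runs.
revenue_by_region = (
    orders
    .filter(lambda rec: rec[2] > 100.0)   # keep large orders
    .map(lambda rec: (rec[1], rec[2]))    # (region, amount) pairs
    .reduceByKey(lambda a, b: a + b)      # sum amounts per region
)

# collect() is an action: it triggers execution and returns results.
print(revenue_by_region.collect())  # e.g. [('APAC', 350.0)]

spark.stop()
```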
We specialize in productizing solutions built on new technologies.
Our vision is to build engineers with entrepreneurial and leadership mindsets who can create highly impactful products and solutions using technology to deliver immense value to our clients.
We strive to bring innovation and passion to everything we do, whether services, products, or solutions.
- Proficiency in Linux.
- Must have SQL knowledge and experience working with relational databases and query authoring, as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena.
- Must have experience with Python/Scala.
- Must have experience with Big Data technologies like Apache Spark.
- Must have experience with Apache Airflow (see the DAG sketch after this list).
- Experience with data pipeline and ETL tools like AWS Glue.
- Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
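Since Apache Airflow appears in these requirements, here is a minimal Airflow 2-style DAG sketch; the DAG id, task names, and schedule are illustrative assumptions, not anything from the posting.

```python
# Minimal Apache Airflow DAG sketch: a daily extract -> transform chain.
# DAG id, task names, and schedule are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw records from a source system")


def transform():
    print("clean and aggregate the extracted records")


with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    extract_task >> transform_task  # run transform only after extract succeeds
```

The `>>` operator declares the dependency between tasks, which is the core of how Airflow schedules a pipeline.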
Sr. Data Scientist
at Deep Intent
The Sr. Data Scientist will be located in Pune, India, or an alternative location, working closely with our Analytics teams in New York City, India, and Bosnia. The role is part of our Clinical Insights line of analytics and supports internal and external business partners in generating analyses and insights for the Outcomes product (measurement of campaign outcomes / script lift), as well as the general Deep Intent product suite. Activities include conducting exploratory data analysis and discovery; creating and scoring audiences; reading campaign results by analyzing medical claims, clinical, demographic, and clickstream data; and performing analysis, creating actionable insights, summarizing them, and presenting results and recommended actions to internal stakeholders and external clients as needed. This role reports directly to the Sr. Director of Outcomes Insights.
Key Responsibilities:
- Time-series modeling and forecasting
- Predictive modeling (e.g., xgboost, deep learning) on large datasets (a minimal sketch follows this list)
- Building data ingestion pipelines and transforming data into metrics useful for analytics and modeling
- Hypothesis Testing, Experimental Design & AB Testing
- Write production-level code in Python and SQL (BigQuery/Spark), with Git experience
- Support business development and the client analytics and insights process, under the supervision of the director / sr. data scientist, using consumer demographic, clickstream, and clinical data (claims and medications)
- Core activities include: campaign audience sizing estimates, generating lookalike and campaign audiences, generating standardized reporting deliverables on media performance, and packaging insights into relevant client stories
- Extract, explore, visualize, and analyze large healthcare claims, consumer demographic, prospecting, and clickstream data using SQL and Python or R libraries.
- Generate scripts for audience creation using SQL, Python/R, and API call infrastructure.
- Understand objectives of client campaigns, audience selection (diagnostics), creative and channel.
- Support internal product development of data tools, dashboards and forecasts, as needed.
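As a sketch of the predictive-modeling responsibility above, here is a minimal example using xgboost's scikit-learn API; the data is synthetic and the hyperparameters are illustrative assumptions.

```python
# Minimal xgboost classification sketch on synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                # stand-in feature matrix
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # stand-in binary outcome

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

# Score the holdout set and report AUC, a common campaign-model metric.
pred = model.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, pred))
```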
Qualifications:
- You have a working understanding of ad-tech / digital marketing and advertising data and campaigns, and an interest in (and aptitude for) learning US healthcare patient and provider systems (e.g., medical claims, medications).
- Desire to work in a rapidly growing and scaling startup, with a strong culture of fast-paced cross-functional collaboration.
- Hands-on predictive modeling experience (decision trees, boosting algorithms and regression models).
- Orientation and interest in translating complex quantitative results into meaningful findings and interpretable deliverables, and communicating them to less technical audiences.
- Hypothesis-oriented curiosity and tenacity in obtaining meaningful results through iterative data analysis and data prep.
- “Can do” attitude, outstanding technical troubleshooting and problem-solving abilities, aptitude to rapidly develop working knowledge of new tools, open source libraries, data sources etc.
- Ability to meet deadlines and flexibility to work constructively with shifting priorities.
- You have strong communication and presentation skills backed by strong critical thinking.
- Bachelor’s degree in a STEM field, such as Statistics, Mathematics, Engineering, Biostatistics, Econometrics, Economics, Finance, or Data Science.
- Minimum of 5 years of working experience as Data Analyst, Engineer, Data Scientist or Researcher in digital marketing, consumer advertisement, telecom, healthcare or other areas requiring customer level predictive analytics.
- Proficiency in performing statistical analysis in R or Python, including relevant libraries, is required. Prior experience using these tools in analytical R&D is strongly preferred.
- Advanced ability to use relevant technology/software to wrangle data, perform analytics, and visualize for consumption is required.
- Experience with SQL is required.
- Advanced experience with the basic Office suite (Excel, PowerPoint) is required.
- Familiarity with medical and healthcare data preferred (medical claims, Rx, etc.).
- Experience with cloud technologies such as AWS or Google Cloud is required.
- Exposure to big data tools (Hadoop, PySpark) is preferred.
- Experience with Git/version control and Jira/ticketing system is strongly preferred.
- Experience with a visualization tool such as Looker and/or Tableau is preferred.
Lead Data Engineer
at Discite Analytics Private Limited
1. Communicate with the clients and understand their business requirements.
2. Build, train, and manage your own team of junior data engineers.
3. Assemble large, complex data sets that meet the client’s business requirements.
4. Identify, design and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources, including the cloud.
6. Assist clients with data-related technical issues and support their data infrastructure requirements.
7. Work with data scientists and analytics experts to strive for greater functionality.
Skills required (experience with most of these):
1. Experience with Big Data tools: Hadoop, Spark, Apache Beam, Kafka, etc.
2. Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
3. Experience in ETL and Data Warehousing (a minimal pipeline sketch follows this list).
4. Experience with and a firm understanding of relational and non-relational databases like MySQL, MS SQL Server, Postgres, MongoDB, Cassandra, etc.
5. Experience with cloud platforms like AWS, GCP and Azure.
6. Experience with workflow management using tools like Apache Airflow.
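To illustrate the ETL skill in item 3, here is a minimal extract-transform-load sketch in Python using pandas and SQLAlchemy; the connection strings, drivers, and table names are illustrative assumptions only.

```python
# Minimal ETL sketch: extract from a source database, transform with
# pandas, load into a warehouse. All connection details are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

source = create_engine("mysql+pymysql://user:pass@source-host/sales")
target = create_engine("postgresql+psycopg2://user:pass@warehouse-host/dw")

# Extract: pull raw rows from the source system.
orders = pd.read_sql("SELECT order_id, region, amount FROM orders", source)

# Transform: aggregate to the grain the warehouse expects.
daily = orders.groupby("region", as_index=False)["amount"].sum()

# Load: write the aggregate into the warehouse table.
daily.to_sql("revenue_by_region", target, if_exists="replace", index=False)
```

In production this kind of script would typically run as a task inside a workflow manager such as Airflow (item 6).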
- Gather information from multiple data sources and make approval decisions systematically
- Read and interpret credit-related information about borrowers
- Interpret, analyze, and assess all forms of complex information
- Embark on risk assessment analysis
- Maintain the company's credit exposure within set risk levels and limits
- Build strategies to minimize risk and increase approval rates
- Design champion/challenger tests, implement them, and interpret the results
- Build line assignment strategies
- Credit Risk Modeling
- Statistical data understanding and interpretation
- Basic Regression and Advanced Machine Learning Models
- Conversant with coding in Python using libraries like scikit-learn
- Build and understand decision trees (a minimal sketch follows this list)
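As an illustration of the decision-tree skill above, here is a minimal scikit-learn sketch; the applicant features, labels, and depth limit are synthetic assumptions, not a real credit policy.

```python
# Minimal decision-tree credit-approval sketch on synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
# Hypothetical applicant features: [credit_score, income, utilization].
X = rng.normal(loc=[650, 50_000, 0.4], scale=[80, 15_000, 0.2], size=(500, 3))
y = (X[:, 0] > 640).astype(int)  # stand-in approve/decline label

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

print("holdout accuracy:", tree.score(X_test, y_test))
# Print the fitted rules so the approval logic stays interpretable.
print(export_text(tree, feature_names=["credit_score", "income", "utilization"]))
```

Shallow trees like this are common in credit decisioning precisely because the fitted rules can be read and audited.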
Data Engineer
at Prescience Decision Solutions
The Data Engineer would be responsible for selecting and integrating the required Big Data tools and frameworks, and would implement data ingestion and ETL/ELT processes.
Required Experience, Skills and Qualifications:
- Hands-on experience with Big Data tools/technologies like Spark, Databricks, MapReduce, Hive, HDFS.
- Expertise and excellent understanding of big data tools such as Sqoop, Spark Streaming, Kafka, NiFi (a streaming sketch follows this list)
- Proficiency in any of the following programming languages, with 4+ years' experience: Python, Scala, or Java
- Experience with cloud infrastructure like MS Azure, Data Lake, etc.
- Good working knowledge of NoSQL databases (MongoDB, HBase, Cassandra)
- Desire to explore new technology and break new ground.
- Passion for open-source technology, continuous learning, and innovation.
- Problem-solving skills, grit, and commitment to complete challenging work assignments and meet deadlines.
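As a sketch of the Spark Streaming / Kafka combination named above, here is a minimal Spark Structured Streaming job that reads a Kafka topic; the broker address, topic, and checkpoint path are illustrative assumptions, and the job requires the spark-sql-kafka connector package on the classpath.

```python
# Minimal sketch: consume a Kafka topic with Spark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

# Subscribe to a Kafka topic as a streaming DataFrame.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka keys/values arrive as bytes; cast to strings before parsing.
decoded = events.select(col("key").cast("string"), col("value").cast("string"))

# Write to the console for inspection; a real job would land in a sink table.
query = (
    decoded.writeStream
    .format("console")
    .option("checkpointLocation", "/tmp/ckpt")
    .start()
)
query.awaitTermination()
```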
Qualifications
- Engineer enterprise-class, large-scale deployments, and deliver cloud-based serverless solutions to our customers.
- You will work in a fast-paced environment with leading microservice and cloud technologies, and continue to develop your all-around technical skills.
- Participate in code reviews and provide meaningful feedback to other team members.
- Create technical documentation.
- Develop thorough Unit Tests to ensure code quality.
Skills and Experience
- Advanced skills in troubleshooting and tuning AWS Lambda functions developed with Java and/or Python (a minimal handler sketch follows this list).
- Experience with event-driven architecture design patterns and practices
- Experience in database design and architecture principles and strong SQL abilities
- Experience with message brokers like Kafka and Kinesis
- Experience with Hadoop, Hive, and Spark (either PySpark or Scala)
- Demonstrated experience owning enterprise-class applications and delivering highly available distributed, fault-tolerant, globally accessible services at scale.
- Good understanding of distributed systems.
- Candidates will be self-motivated and display initiative, ownership, and flexibility.
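To illustrate the event-driven Lambda work described above, here is a minimal Python handler for an SNS-style event; the event shape follows the standard SNS record format, while the logged fields and return payload are illustrative choices.

```python
# Minimal AWS Lambda handler sketch: process SNS records from an event.
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    """Process each SNS record and return a simple status payload."""
    processed = 0
    for record in event.get("Records", []):
        # SNS delivers the published payload as a JSON string.
        message = json.loads(record["Sns"]["Message"])
        logger.info("received message: %s", message)
        processed += 1
    return {"statusCode": 200, "body": json.dumps({"processed": processed})}
```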
Preferred Qualifications
- AWS Lambda function development experience with Java and/or Python.
- Lambda triggers such as SNS, SES, or cron.
- Databricks
- Cloud development experience with AWS services, including:
- IAM
- S3
- EC2
- AWS CLI
- API Gateway
- ECR
- CloudWatch
- Glue
- Kinesis
- DynamoDB
- Java 8 or higher
- ETL data pipeline building
- Data Lake Experience
- Python
- Docker
- MongoDB or similar NoSQL DB.
- Relational Databases (e.g., MySQL, PostgreSQL, Oracle, etc.).
- Gradle and/or Maven.
- JUnit
- Git
- Scrum
- Experience with Unix and/or macOS.
- Immediate joiners preferred.
Nice to have:
- AWS / GCP / Azure Certification.
- Cloud development experience with Google Cloud or Azure
Python Developer
at Reval Analytical Services Pvt Ltd
Position Name: Software Developer
Required Experience: 3+ Years
Number of positions: 4
Qualifications: Master’s or Bachelor’s degree in Engineering, Computer Science, or equivalent (BE/BTech or MS in Computer Science).
Key Skills: Python, Django, Nginx, Linux, Sanic, Pandas, NumPy, Snowflake, SciPy, Data Visualization, Redshift, Big Data, Charting
Compensation - As per industry standards.
Joining - Immediate joining is preferable.
Required Skills:
- Strong experience in Python and web frameworks like Django, Tornado, and/or Flask
- Experience in data analytics using standard Python libraries such as Pandas, NumPy, and Matplotlib
- Conversant in implementing charts using charting libraries like Highcharts, d3.js, c3.js, dc.js, and data visualization tools like Plotly and ggplot
- Handling and using large databases, big data platforms, and data warehouse technologies like MongoDB, MySQL, Snowflake, and Redshift.
- Experience building APIs and multithreaded tasks on the Linux platform (a minimal endpoint sketch follows this list)
- Exposure to finance and capital markets will be added advantage.
- Strong understanding of software design principles, algorithms, data structures, design patterns, and multithreading concepts.
- Experience building highly available distributed systems on cloud infrastructure, or exposure to the architectural patterns of large, high-scale web applications.
- Basic understanding of front-end technologies, such as JavaScript, HTML5, and CSS3
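As an illustration of the API-plus-Pandas stack in the skills above, here is a minimal sketch of a web endpoint serving Pandas analytics (Flask chosen for brevity; Django or Tornado would be equally valid); the route name and price series are invented for the example.

```python
# Minimal sketch: a Flask endpoint returning Pandas-computed statistics.
import numpy as np
import pandas as pd
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in price series; a real service would query a warehouse instead.
prices = pd.Series(
    np.linspace(100, 110, 50) + np.random.default_rng(1).normal(0, 1, 50)
)


@app.route("/stats")
def stats():
    # Compute simple return statistics from the price series.
    returns = prices.pct_change().dropna()
    return jsonify(
        mean_return=float(returns.mean()),
        volatility=float(returns.std()),
    )


if __name__ == "__main__":
    app.run(port=5000)
```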
Company Description:
Reval Analytical Services is a fully-owned subsidiary of Virtua Research Inc. US. It is a financial services technology company focused on consensus analytics, peer analytics and Web-enabled information delivery. The Company’s unique combination of investment research experience, modeling expertise, and software development capabilities enables it to provide industry-leading financial research tools and services for investors, analysts, and corporate management.
Website: http://www.virtuaresearch.com
Data Scientist
at Woodcutter Film Technologies Pvt. Ltd.