Job Location: Chennai
Job Summary
The Engineering team is seeking a Data Architect. As a Data Architect, you will drive a
Data Architecture strategy across various Data Lake platforms. You will help develop
reference architecture and roadmaps to build highly available, scalable and distributed
data platforms using cloud based solutions to process high volume, high velocity and
wide variety of structured and unstructured data. This role is also responsible for driving
innovation, prototyping, and recommending solutions. Above all, you will influence how
users interact with Conde Nast’s industry-leading journalism.
Primary Responsibilities
Data Architect is responsible for
• Demonstrated technology and personal leadership experience in architecting,
designing, and building highly scalable solutions and products.
• Enterprise scale expertise in data management best practices such as data integration,
data security, data warehousing, metadata management and data quality.
• Extensive knowledge and experience in architecting modern data integration
frameworks, highly scalable distributed systems using open source and emerging data
architecture designs/patterns.
• Experience building external cloud (e.g. GCP, AWS) data applications and capabilities is
highly desirable.
• Expert ability to evaluate, prototype and recommend data solutions and vendor
technologies and platforms.
• Proven experience in relational, NoSQL, ELT/ETL technologies and in-memory
databases.
• Experience with DevOps, Continuous Integration and Continuous Delivery technologies
is desirable.
• This role requires 15+ years of data solution architecture, design and development
delivery experience.
• Solid experience in Agile methodologies (Kanban and SCRUM)
Required Skills
• Very Strong Experience in building Large Scale High Performance Data Platforms.
• Passionate about technology and delivering solutions for difficult and intricate
problems. Current on Relational Databases and No sql databases on cloud.
• Proven leadership skills, demonstrated ability to mentor, influence and partner with
cross teams to deliver scalable robust solutions..
• Mastery of relational database, NoSQL, ETL (such as Informatica, Datastage etc) /ELT
and data integration technologies.
• Experience in any one of Object Oriented Programming (Java, Scala, Python) and
Spark.
• Creative view of markets and technologies combined with a passion to create the
future.
• Knowledge on cloud based Distributed/Hybrid data-warehousing solutions and Data
Lake knowledge is mandate.
• Good understanding of emerging technologies and its applications.
• Understanding of code versioning tools such as GitHub, SVN, CVS etc.
• Understanding of Hadoop Architecture and Hive SQL
• Knowledge in any one of the workflow orchestration
• Understanding of Agile framework and delivery
•
Preferred Skills:
● Experience in AWS and EMR would be a plus
● Exposure in Workflow Orchestration like Airflow is a plus
● Exposure in any one of the NoSQL database would be a plus
● Experience in Databricks along with PySpark/Spark SQL would be a plus
● Experience with the Digital Media and Publishing domain would be a
plus
● Understanding of Digital web events, ad streams, context models
About Condé Nast
CONDÉ NAST INDIA (DATA)
Over the years, Condé Nast successfully expanded and diversified into digital, TV, and social
platforms - in other words, a staggering amount of user data. Condé Nast made the right
move to invest heavily in understanding this data and formed a whole new Data team
entirely dedicated to data processing, engineering, analytics, and visualization. This team
helps drive engagement, fuel process innovation, further content enrichment, and increase
market revenue. The Data team aimed to create a company culture where data was the
common language and facilitate an environment where insights shared in real-time could
improve performance.
The Global Data team operates out of Los Angeles, New York, Chennai, and London. The
team at Condé Nast Chennai works extensively with data to amplify its brands' digital
capabilities and boost online revenue. We are broadly divided into four groups, Data
Intelligence, Data Engineering, Data Science, and Operations (including Product and
Marketing Ops, Client Services) along with Data Strategy and monetization. The teams built
capabilities and products to create data-driven solutions for better audience engagement.
What we look forward to:
We want to welcome bright, new minds into our midst and work together to create diverse
forms of self-expression. At Condé Nast, we encourage the imaginative and celebrate the
extraordinary. We are a media company for the future, with a remarkable past. We are
Condé Nast, and It Starts Here.
About Amagi Media Labs
Similar jobs
Sr. Data Scientist (Global Media Agency)
at Global Media Agency - A client of Merito
Our client combines Adtech and Martech platform strategy with data science & data engineering expertise, helping our clients make advertising work better for people.
- Act as primary day-to-day contact on analytics to agency-client leads
- Develop bespoke analytics proposals for presentation to agencies & clients, for delivery within the teams
- Ensure delivery of projects and services across the analytics team meets our stakeholder requirements (time, quality, cost)
- Hands on platforms to perform data pre-processing that involves data transformation as well as data cleaning
- Ensure data quality and integrity
- Interpret and analyse data problems
- Build analytic systems and predictive models
- Increasing the performance and accuracy of machine learning algorithms through fine-tuning and further
- Visualize data and create reports
- Experiment with new models and techniques
- Align data projects with organizational goals
Requirements
- Min 6 - 7 years’ experience working in Data Science
- Prior experience as a Data Scientist within a digital media is desirable
- Solid understanding of machine learning
- A degree in a quantitative field (e.g. economics, computer science, mathematics, statistics, engineering, physics, etc.)
- Experience with SQL/ Big Query/GMP tech stack / Clean rooms such as ADH
- A knack for statistical analysis and predictive modelling
- Good knowledge of R, Python
- Experience with SQL, MYSQL, PostgreSQL databases
- Knowledge of data management and visualization techniques
- Hands-on experience on BI/Visual Analytics Tools like PowerBI or Tableau or Data Studio
- Evidence of technical comfort and good understanding of internet functionality desirable
- Analytical pedigree - evidence of having approached problems from a mathematical perspective and working through to a solution in a logical way
- Proactive and results-oriented
- A positive, can-do attitude with a thirst to continually learn new things
- An ability to work independently and collaboratively with a wide range of teams
- Excellent communication skills, both written and oral
Machine Learning Engineer
at Uber9 Business Process Services Pvt Ltd
Working along with the highly motivated advanced Machine Learning team, with key responsibilities are to research, design, develop, and implement applications that will be integrated into our workflows. Responsibilities and Accountabilities:
I Experience:
III Skill Set & Personality Traits required:
|
**Education:
Qualification – Any engineering graduate with STRONG programming and logical reasoning skills.
**Minimum years of Experience:2 – 5 years**
Required Skills:
Previous experience as a Data Engineer or in a similar role.
Technical expertise with data models, data mining, and segmentation techniques.
**Knowledge of programming languages (e. g. Java and Python).
Hands-on experience with SQL Programming
Hands-on experience with Python Programming
Knowledge of these tools DBT, ADF, Snowflakes, and Databricks would be added advantage for our current project.**
Strong numerical and analytical skills.
Experience in dealing directly with customers and internal sales organizations.
Strong written and verbal communication, including technical writing skills.
Good to have: Hands-on experience in Cloud services.
Knowledge with ML
Data Warehouse builds (DB, SQL, ETL, Reporting Tools like Power BI…)
Do share your profile to gayathrirajagopalan @jmangroup.com
Ability to interpret and map business, functional and non functional requirements to technical specifications
Interact with diverse stakeholders like clients, project manager/scrum master, business analysts, testing and
other cross-functional teams as part of Business Intelligence projects
Develop solutions following established technical design, application development standards and quality processes in projects to deliver efficient, reusable and reliable code with complete ownership
Assess the impacts on technical design because of the changes in functional requirements.
Developing the full life cycle of a BI project (data movement and visualization) which includes requirements analysis , platform selection , architecture , application design and development , testing and deployment
Provide architectural leadership with a strong emphasis on data architecture around ETL and governance
Troubleshoot highly complex technical problems in a OLAP/OLTP/DW based environments
Provide support specific to application bugs or issues within defined SLAs
Proactively identify and communicate technical risks, issues, and challenges with mitigations
Perform independent code reviews and guide junior team members for correction
Must Have Technical Skills :
3+ years of experience leading data warehouse implementation with technical architectures , ETL / ELT, reporting / analytic tools and scripting (end to end implementation)
Experienced in Microsoft Azure (Azure SQL Managed Instance , Data Factory , Azure Synapse, Azure Monitoring, Azure DevOps , Event Hubs , Azure AD Security)
Deep experience in using any BI tools such as Power BI/Tableau, QlikView/SAP-BO etc.,
Experienced in ETL tools such as SSIS, Talend/Informatica/Pentaho
Expertise in using RDBMSes like Oracle, SQL Server as source or target and online analytical processing (OLAP)
Experienced in SQL/T-SQL/ DML/DDL statements, stored procedure, function, trigger, indexes, cursor
Expertise in building and organizing advanced DAX calculations and SSAS cubes
Experience in data/dimensional modelling, analysis, design, testing, development, and implementation
Experienced in advanced data warehouse concepts using structured, semi-structured and un-structured data
Experienced with real time ingestion, change data capture, real time & batch processing
Good knowledge of meta data management and data governance
Great problem solving skills, with a strong bias for quality and design excellence
Experienced in developing dashboards with a focus on usability, performance, flexibility, testability, and
standardization.
Familiarity with development in cloud environments like AWS / Azure / Google
Machine Learning Engineer
at Top Management Consulting Company
We are looking for a Machine Learning engineer for on of our premium client.
Experience: 2-9 years
Location: Gurgaon/Bangalore
Tech Stack:
Python, PySpark, the Python Scientific Stack; MLFlow, Grafana, Prometheus for machine learning pipeline management and monitoring; SQL, Airflow, Databricks, our own open-source data pipelining framework called Kedro, Dask/RAPIDS; Django, GraphQL and ReactJS for horizontal product development; container technologies such as Docker and Kubernetes, CircleCI/Jenkins for CI/CD, cloud solutions such as AWS, GCP, and Azure as well as Terraform and Cloudformation for deployment
Senior Data Scientist (Health Metrics)
at Biostrap
Introduction
The Biostrap platform extracts many metrics related to health, sleep, and activity. Many algorithms are designed through research and often based on scientific literature, and in some cases they are augmented with or entirely designed using machine learning techniques. Biostrap is seeking a Data Scientist to design, develop, and implement algorithms to improve existing metrics and measure new ones.
Job Description
As a Data Scientist at Biostrap, you will take on projects to improve or develop algorithms to measure health metrics, including:
- Research: search literature for starting points of the algorithm
- Design: decide on the general idea of the algorithm, in particular whether to use machine learning, mathematical techniques, or something else.
- Implement: program the algorithm in Python, and help deploy it.
The algorithms and their implementation will have to be accurate, efficient, and well-documented.
Requirements
- A Master’s degree in a computational field, with a strong mathematical background.
- Strong knowledge of, and experience with, different machine learning techniques, including their theoretical background.
- Strong experience with Python
- Experience with Keras/TensorFlow, and preferably also with RNNs
- Experience with AWS or similar services for data pipelining and machine learning.
- Ability and drive to work independently on an open problem.
- Fluency in English.
empower healthcare payers, providers and members to quickly process medical data to
make informed decisions and reduce health care costs. You will be focusing on research,
development, strategy, operations, people management, and being a thought leader for
team members based out of India. You should have professional healthcare experience
using both structured and unstructured data to build applications. These applications
include but are not limited to machine learning, artificial intelligence, optical character
recognition, natural language processing, and integrating processes into the overall AI
pipeline to mine healthcare and medical information with high recall and other relevant
metrics. The results will be used dually for real-time operational processes with both
automated and human-based decision making as well as contribute to reducing
healthcare administrative costs. We work with all major cloud and big data vendors
offerings including (Azure, AWS, Google, IBM, etc.) to achieve our goals in healthcare and
support
The Director, Data Science will have the opportunity to build a team, shape team culture
and operating norms as a result of the fast-paced nature of a new, high-growth
organization.
• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps and product delivery
• Experience building stakeholder trust and confidence in deployed models especially via application of the algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• Provide mentoring to data scientists and machine learning engineers as well as career development
• Meet project related team members for individual specific needs on a regular basis related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership
The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience including but not limited to usage of claims data
• Delivered multiple data science and machine learning projects over 8+ years with values exceeding $10 Million or more and has worked on platform members exceeding 10 million lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real time streaming applications, DevOps, and product delivery
• Knows how to solve and launch real artificial intelligence and data science related problems and products along with managing and coordinating the
business process change, IT / cloud operations, meeting production level code standards
• Ownerships of key workflows part of data science life cycle like data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models especially via application of algorithmic bias, interpretable machine learning,
data integrity, data quality, reproducible research, and reliable engineering 24x7x365 product availability, scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• 3+ Years of experience managing directly five (5) or more senior level data scientists, machine learning engineers with advanced degrees and directly
made staff decisions
• Very strong understanding of mathematical concepts including but not limited to linear algebra, advanced calculus, partial differential equations, and
statistics including Bayesian approaches at master’s degree level and above
• 6+ years of programming experience in C++ or Java or Scala and data science programming languages like Python and R including strong understanding of
concepts like data structures, algorithms, compression techniques, high performance computing, distributed computing, and various computer architecture
• Very strong understanding and experience with traditional data science approaches like sampling techniques, feature engineering, classification, and
regressions, SVM, trees, model evaluations with several projects over 3+ years
• Very strong understanding and experience in Natural Language Processing,
reasoning, and understanding, information retrieval, text mining, search, with
3+ years of hands on experience
• Experience with developing and deploying several products in production with
experience in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the
following cloud vendors like AWS, Azure, and Google
• Three plus (3+) years hands on experience with MapR \ Cloudera \ Databricks
Big Data platform with Spark, Hive, Kafka etc.
• Three plus (3+) years of experience with high-performance computing like
Dask, CUDA distributed GPU, TPU etc.
• Presented at major conferences and/or published materials
culture and operating norms as a result of the fast-paced nature of a new, high-growth
organization.
• 7+ years of Industry experience primarily related to Unstructured Text Data and NLP
(PhD work and internships will be considered if they are related to unstructured text
in lieu of industry experience but not more than 2 years will be accounted towards
industry experience)
• Develop Natural Language Medical/Healthcare documents comprehension related
products to support Health business objectives, products and improve
processing efficiency, reducing overall healthcare costs
• Gather external data sets; build synthetic data and label data sets as per the needs
for NLP/NLR/NLU
• Apply expert software engineering skills to build Natural Language products to
improve automation and improve user experiences leveraging unstructured data storage, Entity Recognition, POS Tagging, ontologies, taxonomies, data mining,
information retrieval techniques, machine learning approach, distributed and cloud
computing platforms
• Own the Natural Language and Text Mining products — from platforms to systems
for model training, versioning, deploying, storage and testing models with creating
real time feedback loops to fully automated services
• Work closely and collaborate with Data Scientists, Machine Learning engineers, IT
teams and Business stakeholders spread out across various locations in US and India
to achieve business goals
• Provide mentoring to other Data Scientist and Machine Learning Engineers
• Strong understanding of mathematical concepts including but not limited to linear
algebra, Advanced calculus, partial differential equations and statistics including
Bayesian approaches
• Strong programming experience including understanding of concepts in data
structures, algorithms, compression techniques, high performance computing,
distributed computing, and various computer architecture
• Good understanding and experience with traditional data science approaches like
sampling techniques, feature engineering, classification and regressions, SVM, trees,
model evaluations
• Additional course work, projects, research participation and/or publications in
Natural Language processing, reasoning and understanding, information retrieval,
text mining, search, computational linguistics, ontologies, semantics
• Experience with developing and deploying products in production with experience
in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the following
cloud vendors like AWS, Azure, and Google for 2+ years
• Hands on experience with one or more of high-performance computing and
distributed computing like Spark, Dask, Hadoop, CUDA distributed GPU (2+ years)
• Thorough understanding of deep learning architectures and hands on experience
with one or more frameworks like tensorflow, pytorch, keras (2+ years)
• Hands on experience with libraries and tools like Spacy, NLTK, Stanford core NLP,
Genism, johnsnowlabs for 5+ years
• Understanding business use cases and be able to translate them to team with a
vision on how to implement
• Identify enhancements and build best practices that can help to improve the
productivity of the team.