
Requirements:
● Understanding our data sets and how to bring them together.
● Working with our engineering team to support custom solutions offered to the product development.
● Filling the gap between development, engineering and data ops.
● Creating, maintaining and documenting scripts to support ongoing custom solutions.
● Excellent organizational skills, including attention to precise details
● Strong multitasking skills and ability to work in a fast-paced environment
● 5+ years of experience developing scripts in Python.
● Know your way around RESTful APIs (able to integrate with them; publishing them is not necessary).
● You are familiar with pulling and pushing files from SFTP and AWS S3.
● Experience with any Cloud solutions including GCP / AWS / OCI / Azure.
● Familiarity with SQL programming to query and transform data from relational Databases.
● Familiarity with working in Linux (and Linux work environments).
● Excellent written and verbal communication skills
● Extracting, transforming, and loading data into internal databases and Hadoop
● Optimizing our new and existing data pipelines for speed and reliability
● Deploying product build and product improvements
● Documenting and managing multiple repositories of code
● Experience with SQL and NoSQL databases (Cassandra, MySQL)
● Hands-on experience in data pipelining and ETL (any of these frameworks/tools: Hadoop, BigQuery,
Redshift, Athena)
● Hands-on experience in Airflow
● Understanding of best practices and common coding patterns around storing, partitioning,
warehousing and indexing of data
● Experience reading data from Kafka topics (both live stream and offline)
● Experience in PySpark and DataFrames
Responsibilities:
You’ll be:
● Collaborating across an agile team to continuously design, iterate, and develop big data systems.
● Extracting, transforming, and loading data into internal databases.
● Optimizing our new and existing data pipelines for speed and reliability.
● Deploying new products and product improvements.
● Documenting and managing multiple repositories of code.
Job Location: Chennai
Job Summary
The Engineering team is seeking a Data Architect. As a Data Architect, you will drive a
Data Architecture strategy across various Data Lake platforms. You will help develop
reference architecture and roadmaps to build highly available, scalable and distributed
data platforms using cloud based solutions to process high volume, high velocity and
wide variety of structured and unstructured data. This role is also responsible for driving
innovation, prototyping, and recommending solutions. Above all, you will influence how
users interact with Condé Nast’s industry-leading journalism.
Primary Responsibilities
The Data Architect role calls for:
• Demonstrated technology and personal leadership experience in architecting,
designing, and building highly scalable solutions and products.
• Enterprise scale expertise in data management best practices such as data integration,
data security, data warehousing, metadata management and data quality.
• Extensive knowledge and experience in architecting modern data integration
frameworks, highly scalable distributed systems using open source and emerging data
architecture designs/patterns.
• Experience building external cloud (e.g. GCP, AWS) data applications and capabilities is
highly desirable.
• Expert ability to evaluate, prototype and recommend data solutions and vendor
technologies and platforms.
• Proven experience in relational, NoSQL, ELT/ETL technologies and in-memory
databases.
• Experience with DevOps, Continuous Integration and Continuous Delivery technologies
is desirable.
• This role requires 15+ years of data solution architecture, design and development
delivery experience.
• Solid experience in Agile methodologies (Kanban and Scrum).
Required Skills
• Very Strong Experience in building Large Scale High Performance Data Platforms.
• Passionate about technology and delivering solutions for difficult and intricate
problems. Current on relational and NoSQL databases on the cloud.
• Proven leadership skills, with a demonstrated ability to mentor, influence and partner
across teams to deliver scalable, robust solutions.
• Mastery of relational database, NoSQL, ETL (such as Informatica, DataStage, etc.)/ELT
and data integration technologies.
• Experience in any one object-oriented programming language (Java, Scala, Python) and
Spark.
• Creative view of markets and technologies combined with a passion to create the
future.
• Knowledge of cloud-based distributed/hybrid data-warehousing solutions and Data
Lakes is mandatory.
• Good understanding of emerging technologies and their applications.
• Understanding of code versioning tools such as GitHub, SVN, CVS etc.
• Understanding of Hadoop Architecture and Hive SQL
• Knowledge of any one workflow orchestration tool
• Understanding of Agile framework and delivery
Preferred Skills:
● Experience in AWS and EMR would be a plus
● Exposure to workflow orchestration tools like Airflow is a plus
● Exposure to any one NoSQL database would be a plus
● Experience in Databricks along with PySpark/Spark SQL would be a plus
● Experience with the Digital Media and Publishing domain would be a
plus
● Understanding of Digital web events, ad streams, context models
About Condé Nast
CONDÉ NAST INDIA (DATA)
Over the years, Condé Nast successfully expanded and diversified into digital, TV, and social
platforms - in other words, a staggering amount of user data. Condé Nast made the right
move to invest heavily in understanding this data and formed a whole new Data team
entirely dedicated to data processing, engineering, analytics, and visualization. This team
helps drive engagement, fuel process innovation, further content enrichment, and increase
market revenue. The Data team aimed to create a company culture where data was the
common language and to facilitate an environment where insights shared in real time could
improve performance.
The Global Data team operates out of Los Angeles, New York, Chennai, and London. The
team at Condé Nast Chennai works extensively with data to amplify its brands' digital
capabilities and boost online revenue. We are broadly divided into four groups, Data
Intelligence, Data Engineering, Data Science, and Operations (including Product and
Marketing Ops, Client Services) along with Data Strategy and monetization. The teams built
capabilities and products to create data-driven solutions for better audience engagement.
What we look forward to:
We want to welcome bright, new minds into our midst and work together to create diverse
forms of self-expression. At Condé Nast, we encourage the imaginative and celebrate the
extraordinary. We are a media company for the future, with a remarkable past. We are
Condé Nast, and It Starts Here.
Duties and Responsibilities:
Research and Develop Innovative Use Cases, Solutions and Quantitative Models
in Video and Image Recognition and Signal Processing for cloudbloom’s
cross-industry business (e.g., Retail, Energy, Industry, Mobility, Smart Life and
Entertainment).
Design, Implement and Demonstrate Proof-of-Concept and Working Prototypes
Provide R&D support to productize research prototypes.
Explore emerging tools, techniques, and technologies, and work with academia for cutting-
edge solutions.
Collaborate with cross-functional teams and eco-system partners for mutual business benefit.
Team Management Skills
Academic Qualification
7+ years of professional hands-on work experience in data science, statistical modelling, data
engineering, and predictive analytics assignments
Mandatory Requirements: Bachelor’s degree with STEM background (Science, Technology,
Engineering and Mathematics) with strong quantitative flavour
Innovative and creative in data analysis, problem solving and presentation of solutions.
Ability to establish effective cross-functional partnerships and relationships at all levels in a
highly collaborative environment
Strong experience in handling multi-national client engagements
Good verbal, writing & presentation skills
Core Expertise
Excellent understanding of the basics of mathematics and statistics (such as differential
equations, linear algebra, matrices, combinatorics, probability, Bayesian statistics,
eigenvectors, Markov models, Fourier analysis).
Building data analytics models using Python, ML libraries and Jupyter/Anaconda, and knowledge
of database query languages like SQL
Good knowledge of machine learning methods like k-Nearest Neighbors, Naive Bayes, SVM,
Decision Forests.
Strong Math Skills (Multivariable Calculus and Linear Algebra) - understanding the
fundamentals of Multivariable Calculus and Linear Algebra is important as they form the basis
of a lot of predictive performance or algorithm optimization techniques.
Deep learning: CNNs, neural networks, RNNs, TensorFlow, PyTorch, computer vision
Large-scale data extraction/mining, data cleansing, diagnostics, preparation for modeling
Good applied statistical skills, including knowledge of statistical tests, distributions,
regression, maximum likelihood estimators, multivariate techniques & predictive modeling,
cluster analysis, discriminant analysis, CHAID, logistic & multiple regression analysis
Experience with Data Visualization Tools like Tableau, Power BI, Qlik Sense that help to
visually encode data
Excellent Communication Skills – it is incredibly important to describe findings to a technical
and non-technical audience
Capability for continuous learning and knowledge acquisition.
Mentor colleagues for growth and success
Strong Software Engineering Background
Hands-on experience with data science tools
Data Scientist
Requirements
● B.Tech/Masters in Mathematics, Statistics, Computer Science or another
quantitative field
● 2-3+ years of work experience in the ML domain (2-5 years of experience)
● Hands-on coding experience in Python
● Experience in machine learning techniques such as Regression, Classification,
Predictive modeling, Clustering, Deep Learning stack, NLP
● Working knowledge of TensorFlow/PyTorch
Optional Add-ons:
● Experience with distributed computing frameworks: Map/Reduce, Hadoop, Spark
etc.
● Experience with databases: MongoDB
Job Title : Analyst / Sr. Analyst – Data Science Developer - Python
Exp : 2 to 5 yrs
Loc : B’lore / Hyd / Chennai
NP: Candidate should join us in 2 months (Max) / Immediate Joiners Pref.
About the role:
We are looking for an Analyst / Senior Analyst who works in the analytics domain and has a strong Python background.
Desired Skills, Competencies & Experience:
• 2-4 years of experience working in the analytics domain with a strong Python background.
• Visualization skills in Python with plotly, matplotlib, seaborn etc. Ability to create customized plots using such tools.
• Ability to write effective, scalable and modular code. Should be able to understand, test and debug existing Python project modules quickly and contribute to them.
• Should be familiar with Git workflows.
Good to Have:
• Familiarity with cloud platforms like AWS, AzureML, Databricks, GCP etc.
• Understanding of shell scripting and Python package development.
• Experience with Python data science packages like pandas, NumPy, sklearn etc.
• ML model building and evaluation experience using sklearn.
We are looking for an outstanding Big Data Engineer with experience setting up and maintaining Data Warehouses and Data Lakes for an organization. This role will collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Roles and Responsibilities:
- Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
- Develop programs in Scala and Python as part of data cleaning and processing.
- Assemble large, complex data sets that meet functional / non-functional business requirements and foster data-driven decision making across the organization.
- Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
- Implement processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available for the key stakeholders and business processes that depend on it.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Maintain operational excellence, ensuring high availability and platform stability.
- Collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Skills:
- Experience with Big Data pipeline, Big Data analytics, Data warehousing.
- Experience with SQL/No-SQL, schema design and dimensional data modeling.
- Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with a Big Data technology stack such as HBase, Hadoop, Hive, MapReduce.
- Experience in designing systems that process structured as well as unstructured data at large scale.
- Experience in AWS/Spark/Java/Scala/Python development.
- Strong skills in PySpark (Python and Spark). Ability to create, manage and manipulate Spark DataFrames. Expertise in Spark query tuning and performance optimization.
- Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
- Prior exposure to streaming data sources such as Kafka.
- Knowledge of shell scripting and Python scripting.
- High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
- Experience with NoSQL databases such as Cassandra / MongoDB.
- Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
- Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
- Experience building and deploying applications on on-premise and cloud-based infrastructure.
- Good understanding of the machine learning landscape and concepts.
Qualifications and Experience:
Engineering and postgraduate candidates, preferably in Computer Science, from premier institutions, with 3-5 years of proven work experience as a Big Data Engineer or in a similar role.
Certifications:
Good to have at least one of the Certifications listed here:
AZ 900 - Azure Fundamentals
DP 200, DP 201, DP 203, AZ 204 - Data Engineering
AZ 400 - DevOps Certification
• Excellent understanding of machine learning techniques and algorithms, such as SVM, Decision Forests, k-NN, Naive Bayes etc.
• Experience in selecting features, building and optimizing classifiers using machine learning techniques.
• Prior experience with data visualization tools, such as D3.js, ggplot, etc.
• Good knowledge of statistics, such as distributions, statistical testing, regression, etc.
• Adequate presentation and communication skills to explain results and methodologies to non-technical stakeholders.
• Basic understanding of the banking industry is a value add.
• Develop, process, cleanse and enhance data collection procedures from multiple data sources.
• Conduct & deliver experiments and proof of concepts to validate business ideas and potential value.
• Test, troubleshoot and enhance the developed models in distributed environments to improve their accuracy.
• Work closely with product teams to implement algorithms with Python and/or R.
• Design and implement scalable predictive models and classifiers leveraging machine learning and data regression.
• Facilitate integration with enterprise applications using APIs to enrich implementations.

