Role & responsibilities:
- Develop ETL pipelines for data replication
- Analyze, query and manipulate data according to defined business rules and procedures
- Manage very large-scale data from a multitude of sources, organizing it into appropriate sets for research and development by data scientists and analysts across the company
- Convert prototypes into production data engineering solutions through rigorous software engineering practices and modern deployment pipelines
- Resolve internal and external data exceptions in a timely and accurate manner
- Improve multi-environment data flow quality, security, and performance
Skills & qualifications:
- Must have experience with:
  - virtualization, containers, and orchestration (Docker, Kubernetes)
  - creating log ingestion pipelines (Apache Beam) for both batch and streaming processing (Pub/Sub, Kafka)
  - workflow orchestration tools (Argo, Airflow)
  - supporting machine learning models in production
- Have a desire to continually keep up with advancements in data engineering practices
- Strong Python programming and exploratory data analysis skills
- Ability to work independently and with team members from different backgrounds
- At least a bachelor's degree in an analytical or technical field. This could be applied mathematics, statistics, computer science, operations research, economics, etc. Higher education is welcome and encouraged.
- 3+ years of work in software/data engineering.
- Superior interpersonal skills, independent judgment, and complex problem-solving skills
- Global orientation, experience working across countries, regions and time zones
Senior Executive - Analytics
Overview of job:
Our client is the world's largest media investment company and is part of WPP. They are a global digital transformation agency with 1200 employees across 21 nations. Our team of experts supports clients in programmatic, social, paid search, analytics, technology, organic search, affiliate marketing, e-commerce and across traditional channels.
We are currently looking for a Sr Executive - Analytics to join us. This role offers a massive opportunity to build and be a part of the largest performance marketing setup in APAC. We are committed to fostering a culture of diversity and inclusion. Our people are our strength, so we respect and nurture their individual talent and potential.
Reporting of the role: This role reports to the Director - Analytics.
3 best things about the job:
1. Responsible for data & analytics projects and for developing data strategies by diving into data, extrapolating insights and providing guidance to clients
2. Build and be a part of a dynamic team
3. Being part of a global organisation with rapid growth opportunities
Responsibilities of the role:
Build Marketing-Mix and Multi-Touch attribution models using a range of free and paid tools.
Work with large data sets via hands-on data processing to produce structured data sets for analysis.
Design and build visualizations, dashboards and reports for both internal and external clients using Tableau, Power BI, Datorama or R Shiny/Python.
What you will need:
Degree in Mathematics, Statistics, Economics, Engineering, Data Science, Computer Science or another quantitative field.
2-3 years' experience in Marketing/Data Analytics or a related field, with hands-on experience in building Marketing-Mix and Attribution models.
Proficiency in one or more coding languages (preferred: Python, R).
Proficiency in one or more Visualization Tools – Tableau, Datorama, Power BI
Proficiency in using SQL.
Proficiency with one or more statistical tools is a plus – Example: SPSS, SAS, MATLAB, Mathcad.
Working experience using big data technologies (Hive/Hadoop) is a plus
We are looking for a technically driven "MLOps Engineer" for one of our premium clients.
COMPANY DESCRIPTION:
Key Skills
• Excellent hands-on expert knowledge of cloud platform infrastructure and administration
(Azure/AWS/GCP) with strong knowledge of cloud services integration, and cloud security
• Expertise in setting up CI/CD processes and in building and maintaining secure DevOps pipelines with at
least 2 major DevOps stacks (e.g., Azure DevOps, GitLab, Argo)
• Experience with modern development methods and tooling: containers (e.g., Docker) and
container orchestration (K8s), CI/CD tools (e.g., CircleCI, Jenkins, GitHub Actions, Azure
DevOps), version control (Git, GitHub, GitLab), orchestration/DAG tools (e.g., Argo, Airflow,
Kubeflow)
• Hands-on coding skills in Python 3 (e.g., APIs), including automated testing frameworks and libraries
(e.g., pytest), Infrastructure as Code (e.g., Terraform), and Kubernetes artifacts (e.g.,
deployments, operators, Helm charts)
• Experience setting up at least one contemporary MLOps tool (e.g., experiment tracking,
model governance, packaging, deployment, feature store)
• Practical knowledge delivering and maintaining production software such as APIs and cloud
infrastructure
• Knowledge of SQL (intermediate level or more preferred) and familiarity working with at least
one common RDBMS (MySQL, Postgres, SQL Server, Oracle)
- Create and manage ETL/ELT pipelines based on requirements.
- Build Power BI dashboards and manage the datasets they need.
- Work with stakeholders to identify data structures needed in the future and perform any required transformations, including aggregations.
- Build data cubes for real-time visualisation needs and CXO dashboards.
Required Tech Skills
- Microsoft Power BI & DAX
- Python, Pandas, PyArrow, Jupyter Notebooks, Apache Spark
- Azure Synapse, Azure Databricks, Azure HDInsight, Azure Data Factory
Job Responsibilities
- Design, build & test ETL processes using Python & SQL for the corporate data warehouse
- Inform, influence, support, and execute our product decisions
- Maintain advertising data integrity by working closely with R&D to organize and store data in a format that provides accurate data and allows the business to quickly identify issues.
- Evaluate and prototype new technologies in the area of data processing
- Think quickly, communicate clearly and work collaboratively with product, data, engineering, QA and operations teams
- High energy level, strong team player and good work ethic
- Data analysis, understanding of business requirements and translation into logical pipelines & processes
- Identification, analysis & resolution of production & development bugs
- Support the release process including completing & reviewing documentation
- Configure data mappings & transformations to orchestrate data integration & validation
- Provide subject matter expertise
- Document solutions, tools & processes
- Create & support test plans with hands-on testing
- Peer reviews of work developed by other data engineers within the team
- Establish good working relationships & communication channels with relevant departments
Skills and Qualifications we look for
- University degree 2.1 or higher (or equivalent) in a relevant subject. Master’s degree in any data subject will be a strong advantage.
- 4-6 years' experience in data engineering.
- Strong coding ability and software development experience in Python.
- Strong hands-on experience with SQL and Data Processing.
- Google Cloud Platform (Cloud Composer, Dataflow, Cloud Functions, BigQuery, Cloud Storage, Dataproc)
- Good working experience in any one of the ETL tools (Airflow would be preferable).
- Should possess strong analytical and problem solving skills.
- Good-to-have skills: PySpark, CircleCI, Terraform
- Motivated, self-directed, able to work with ambiguity and interested in emerging technologies, agile and collaborative processes.
- Understanding and experience of Agile/Scrum delivery methodology
Job description:
- Selecting features, building and optimizing classifiers using machine learning techniques
- Mining data as and when required
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Processing, cleansing, and verifying the integrity of data used for analysis
- Doing ad-hoc analysis and presenting results in a clear manner
- Creating automated anomaly detection systems and constantly tracking their performance
- Efficient stakeholder management
Skills and Qualifications
- Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc.
- Good applied statistics skills, such as distributions, statistical testing, regression, etc.
- Experience with common data science toolkits.
- Great communication skills
- Experience with data visualisation tools
- Proficiency in using query languages such as SQL
- Good scripting and programming skills
- Data-oriented personality
- B.Tech, M.Tech, B.S., M.S., MBA
Requirement / Desired Skills
Data Scientist: data mining skills, SQL, advanced ML techniques, NLP (Natural Language Processing)
- Handling the survey scripting process using survey software platforms such as Toluna, QuestionPro, Decipher.
- Mining large & complex data sets using SQL, Hadoop, NoSQL or Spark.
- Delivering complex consumer data analysis using software like R, Python, Excel, etc., such as:
  - Working on basic statistical analysis such as t-tests and correlation
  - Performing more complex data analysis through machine learning techniques such as:
    - Classification
    - Regression
    - Clustering
    - Text Analysis
    - Neural Networks
- Creating interactive dashboards using software like Tableau or any other software you are able to use.
- Working on Statistical and mathematical modelling, application of ML and AI algorithms
What you need to have:
- Bachelor's or Master's degree in a highly quantitative field (CS, machine learning, mathematics, statistics, economics) or equivalent experience.
- An opportunity for anyone who is eager to prove their data analysis skills with one of the biggest FMCG market players.
Data Steward:
The Data Steward will collaborate and work closely within the group's software engineering and business divisions. The Data Steward has overall accountability for the group's/division's data and reporting posture by responsibly managing data assets, data lineage, and data access, and by supporting sound data analysis. This role requires a focus on data strategy, execution, and support for projects, programs, application enhancements, and production data fixes. The Data Steward makes well-thought-out decisions on complex or ambiguous data issues, establishes the data stewardship and information management strategy and direction for the group, and communicates effectively with individuals at various levels of the technical and business communities. This individual will become part of the corporate Data Quality and Data Management/Entity Resolution team, supporting various systems across the board.
Primary Responsibilities:
- Responsible for data quality and data accuracy across all group/division delivery initiatives.
- Responsible for data analysis, data profiling, data modeling, and data mapping capabilities.
- Responsible for reviewing and governing data queries and DML.
- Accountable for the assessment, delivery, quality, accuracy, and tracking of any production data fixes.
- Accountable for the performance, quality, and alignment to requirements for all data query design and development.
- Responsible for defining standards and best practices for data analysis, modeling, and queries.
- Responsible for understanding end-to-end data flows and identifying data dependencies in support of delivery, release, and change management.
- Responsible for the development and maintenance of an enterprise data dictionary that is aligned to data assets and the business glossary for the group.
- Responsible for the definition and maintenance of the group's data landscape, including overlays with the technology landscape, end-to-end data flow/transformations, and data lineage.
- Responsible for rationalizing the group's reporting posture through the definition and maintenance of a reporting strategy and roadmap.
- Partners with the data governance team to ensure data solutions adhere to the organization’s data principles and guidelines.
- Owns the group's data assets, including reports, the data warehouse, etc.
- Understand customer business use cases and be able to translate them to technical specifications and vision on how to implement a solution.
- Accountable for defining the performance tuning needs for all group data assets and managing the implementation of those requirements within the context of group initiatives as well as steady-state production.
- Partners with others in test data management and masking strategies and the creation of a reusable test data repository.
- Responsible for solving data-related issues and communicating resolutions with other solution domains.
- Actively and consistently support all efforts to simplify and enhance the Clinical Trial Prediction use cases.
- Apply knowledge in analytic and statistical algorithms to help customers explore methods to improve their business.
- Contribute toward analytical research projects through all stages including concept formulation, determination of appropriate statistical methodology, data manipulation, research evaluation, and final research report.
- Visualize and report data findings creatively in a variety of visual formats that appropriately provide insight to the stakeholders.
- Achieve defined project goals within customer deadlines; proactively communicate status and escalate issues as needed.
Additional Responsibilities:
- Strong understanding of the Software Development Life Cycle (SDLC) with Agile Methodologies
- Knowledge and understanding of industry-standard/best practices requirements gathering methodologies.
- Knowledge and understanding of Information Technology systems and software development.
- Experience with data modeling and test data management tools.
- Experience in data integration projects.
- Good problem-solving and decision-making skills.
- Good communication skills within the team, site, and with the customer
Knowledge, Skills and Abilities
- Technical expertise in data architecture principles and design aspects of various DBMS and reporting concepts.
- Solid understanding of key DBMS platforms like SQL Server, Azure SQL
- Results-oriented, diligent, and works with a sense of urgency. Assertive and responsible for their own work (self-directed), with a strong affinity for defining work in deliverables and a willingness to commit to deadlines.
- Experience in MDM tools like MS DQ, SAS DM Studio, Tamr, Profisee, Reltio etc.
- Experience in Report and Dashboard development
- Statistical and Machine Learning models
- Python (sklearn, numpy, pandas, gensim)
- Nice to have:
  - 1 year of ETL experience
  - Natural Language Processing
  - Neural networks and deep learning
  - Experience with the Keras, TensorFlow, spaCy, NLTK, and LightGBM Python libraries
Interaction: Frequently interacts with subordinate supervisors.
Education: Bachelor's degree, preferably in Computer Science, B.E. or another quantitative field related to the area of assignment. Professional certification related to the area of assignment may be required.
Experience: 7 years of pharmaceutical/biotech/life sciences experience, 5 years of clinical trials experience and knowledge, and excellent documentation, communication, and presentation skills, including PowerPoint.
1. Expert in deep learning and machine learning techniques,
2. Extremely good at image/video processing,
3. Have a good understanding of linear algebra, optimization techniques, statistics and pattern recognition.
Then you are the right fit for this position.