Qualification – Any engineering graduate with STRONG programming and logical reasoning skills.
**Minimum years of experience: 2 – 5 years**
Previous experience as a Data Engineer or in a similar role.
Technical expertise with data models, data mining, and segmentation techniques.
**Knowledge of programming languages (e.g. Java and Python).
Hands-on experience with SQL programming.
Hands-on experience with Python programming.
Knowledge of DBT, ADF, Snowflake, and Databricks would be an added advantage for our current project.**
Strong numerical and analytical skills.
Experience in dealing directly with customers and internal sales organizations.
Strong written and verbal communication, including technical writing skills.
Good to have: Hands-on experience in Cloud services.
Knowledge of ML
Data Warehouse builds (DB, SQL, ETL, Reporting Tools like Power BI…)
Do share your profile with gayathrirajagopalan@jmangroup.com
About JMAN group
WHAT YOU WILL DO:
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional business requirements.
● Identify, design, and implement internal process improvements: automating manual processes,
optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide
variety of data sources using Spark, Hadoop, and AWS 'big data' technologies (EC2, EMR, S3, Athena).
● Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition,
operational efficiency and other key business performance metrics.
● Work with stakeholders including the Executive, Product, Data and Design teams to assist with
data-related technical issues and support their data infrastructure needs.
● Keep our data separated and secure across national boundaries through multiple data centers and AWS
● Create data tools for analytics and data scientist team members that assist them in building and
optimizing our product into an innovative industry leader.
● Work with data and analytics experts to strive for greater functionality in our data systems.
REQUIRED SKILLS & QUALIFICATIONS:
● 5+ years of experience in a Data Engineer role.
● Advanced working SQL knowledge and experience working with relational databases, query authoring
(SQL) as well as working familiarity with a variety of databases.
● Experience building and optimizing 'big data' data pipelines, architectures and data sets.
● Experience performing root cause analysis on internal and external data and processes to answer
specific business questions and identify opportunities for improvement.
● Strong analytic skills related to working with unstructured datasets.
● Build processes supporting data transformation, data structures, metadata, dependency and workload
● A successful history of manipulating, processing and extracting value from large disconnected datasets.
● Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
● Strong project management and organizational skills.
● Experience supporting and working with cross-functional teams in a dynamic environment
● Experience with big data tools: Hadoop, Spark, Pig, Vertica, etc.
● Experience with AWS cloud services: EC2, EMR, S3, Athena
● Experience with Linux
● Experience with object-oriented/object function scripting languages: Python, Java, Shell, Scala, etc.
PREFERRED SKILLS & QUALIFICATIONS:
● Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
- Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
- Driving optimization, testing and tooling to improve quality.
- Reviewing and approving high-level and detailed designs to ensure that the solution meets business needs and aligns with the data & analytics architecture principles and roadmap.
- Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
- Following proper SDLC (Code review, sprint process).
- Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
- Building robust and scalable data infrastructure (both batch processing and real-time) to support needs from internal and external users.
- Understanding various data security standards and using data security tools to apply and adhere to the required data controls for user access on the Hadoop platform.
- Supporting and contributing to development guidelines and standards for data ingestion.
- Working with a data scientist and business analytics team to assist in data ingestion and data related technical issues.
- Designing and documenting the development & deployment flow.
- Experience in developing REST API services using one of the Scala frameworks.
- Ability to troubleshoot and optimize complex queries on the Spark platform
- Expert in building and optimizing ‘big data’ data/ML pipelines, architectures and data sets.
- Knowledge of modelling unstructured data into structured data designs.
- Experience in Big Data access and storage techniques.
- Experience in doing cost estimation based on the design and development.
- Excellent debugging skills for the technical stack mentioned above which even includes analyzing server logs and application logs.
- Highly organized, self-motivated, proactive, and ability to propose best design solutions.
- Good time management and multitasking skills to work to deadlines by working independently and as a part of a team.
Skills and requirements
- Experience analyzing complex and varied data in a commercial or academic setting.
- Desire to solve new and complex problems every day.
- Excellent ability to communicate scientific results to both technical and non-technical team members.
- A degree in a numerically focused discipline such as Maths, Physics, Chemistry, Engineering, or Biological Sciences.
- Hands-on experience with Python, PySpark, and SQL.
- Hands-on experience building end-to-end data pipelines.
- Hands-on experience with Azure Data Factory, Azure Databricks, and Data Lake is an added advantage.
- Hands-on experience building data pipelines.
- Experience with big data tools: Hadoop, Hive, Sqoop, Spark, Spark SQL.
- Experience with SQL or NoSQL databases for the purposes of data retrieval and management.
- Experience in data warehousing and business intelligence tools, techniques and technology, as well as experience in diving deep on data analysis or technical issues to come up with effective solutions.
- BS degree in math, statistics, computer science or equivalent technical field.
- Experience in data mining structured and unstructured data (SQL, ETL, data warehouse, Machine Learning etc.) in a business environment with large-scale, complex data sets.
- Proven ability to look at solutions in unconventional ways. Sees opportunities to innovate and can lead the way.
- Willing to learn and work on Data Science, ML, AI.
Kwalee is one of the world’s leading multiplatform game publishers and developers, with well over 750 million downloads worldwide for mobile hits such as Draw It, Teacher Simulator, Let’s Be Cops 3D, Traffic Cop 3D and Makeover Studio 3D. Alongside this, we also have a growing PC and Console team of incredible pedigree that is on the hunt for great new titles to join TENS!, Eternal Hope and Die by the Blade.
With a team of talented people collaborating daily between our studios in Leamington Spa, Bangalore and Beijing, or on a remote basis from Turkey, Brazil, the Philippines and many more places, we have a truly global team making games for a global audience. And it’s paying off: Kwalee games have been downloaded in every country on earth! If you think you’re a good fit for one of our remote vacancies, we want to hear from you wherever you are based.
Founded in 2011 by David Darling CBE, a key architect of the UK games industry who previously co-founded and led Codemasters for many years, our team also includes legends such as Andrew Graham (creator of Micro Machines series) and Jason Falcus (programmer of classics including NBA Jam) alongside a growing and diverse team of global gaming experts. Everyone contributes creatively to Kwalee’s success, with all employees eligible to pitch their own game ideas on Creative Wednesdays, and we’re proud to have built our success on this inclusive principle. Could your idea be the next global hit?
What’s the job?
As a Data Scientist you will help utilise masses of data generated by Kwalee players all over the world to solve complex problems using cutting edge techniques.
What you tell your friends you do
“My models optimise the performance of Kwalee games and advertising every day!”
What you will really be doing
- Building intelligent systems which generate value from the data which our players and marketing activities produce.
- Leveraging statistical modelling and machine learning techniques to perform automated decision making on a large scale.
- Developing complex, multi-faceted and highly valuable data products which fuel the growth of Kwalee and our games.
- Owning and managing data science projects from concept to deployment.
- Collaborating with key stakeholders across the company to develop new products and avenues of research.
How you will be doing this
- You’ll be part of an agile, multidisciplinary and creative team and work closely with them to ensure the best results.
- You’ll think creatively, be motivated by challenges, and constantly strive for the best.
- You’ll work with cutting-edge technology; if you need software or hardware to get the job done efficiently, you will get it. We even have a robot!
Our talented team is our signature. We have a highly creative atmosphere with more than 200 staff where you’ll have the opportunity to contribute daily to important decisions. You’ll work within an extremely experienced, passionate and diverse team, including David Darling and the creator of the Micro Machines video games.
Skills and Requirements
- A degree in a numerically focussed discipline such as Maths, Physics, Economics, Chemistry, Engineering, or Biological Sciences.
- A record of outstanding contribution to data science projects.
- Experience using Python for data analysis and visualisation.
- A good understanding of a deep learning framework such as TensorFlow.
- Experience manipulating data in SQL and/or NoSQL databases
- We want everyone involved in our games to share our success; that’s why we have a generous team profit sharing scheme from day 1 of employment
- In addition to a competitive salary we also offer private medical cover and life assurance
- Creative Wednesdays! (Design and make your own games every Wednesday)
- 20 days of paid holidays plus bank holidays
- Hybrid model available depending on the department and the role
- Relocation support available
- Great work-life balance with flexible working hours
- Quarterly team building days - work hard, play hard!
- Monthly employee awards
- Free snacks, fruit and drinks
We firmly believe in creativity and innovation and that a fundamental requirement for a successful and happy company is having the right mix of individuals. With the right people in the right environment anything and everything is possible.
Kwalee makes games to bring people, their stories, and their interests together. As an employer, we’re dedicated to making sure that everyone can thrive within our team by welcoming and supporting people of all ages, races, colours, beliefs, sexual orientations, genders and circumstances. With the inclusion of diverse voices in our teams, we bring plenty to the table that’s fresh, fun and exciting; it makes for a better environment and helps us to create better games for everyone! This is how we move forward as a company – because these voices are the difference that makes all the difference.
2-5 yrs of proven experience in ML, DL, and preferably NLP.
Preferred Educational Background - B.E/B.Tech, M.S./M.Tech, Ph.D.
**What will you work on?**
1) Problem formulation and solution design for ML/NLP applications across complex, well-defined as well as open-ended healthcare problems.
2) Cutting-edge machine learning, data mining, and statistical techniques to analyse and utilise large-scale structured and unstructured clinical data.
3) End-to-end development of company proprietary AI engines - data collection, cleaning, data modelling, model training / testing, monitoring, and deployment.
4) Research and innovate novel ML algorithms and their applications suited to the problem at hand.
**What are we looking for?**
1) Deep understanding of business objectives and the ability to formulate the problem as a data science problem.
2) Solid expertise in knowledge graphs, graph neural nets, clustering, and classification.
3) Strong understanding of data normalization techniques, SVM, random forest, and data visualization techniques.
4) Expertise in RNN, LSTM, and other neural network architectures.
5) DL frameworks: TensorFlow, PyTorch, Keras.
6) High proficiency with standard database skills (e.g., SQL, MongoDB, Graph DB), data preparation, cleaning, and wrangling/munging.
7) Comfortable with web scraping, extracting, manipulating, and analyzing complex, high-volume, high-dimensionality data from varying sources.
8) Experience with deploying ML models on cloud platforms like AWS or Azure.
9) Familiarity with version control using Git, Bitbucket, SVN, or similar.
**Why choose us?**
1) We offer competitive remuneration.
2) We give you opportunities to work on exciting, cutting-edge machine learning problems so you can contribute towards transforming the healthcare industry.
3) We offer the flexibility to choose your tools, methods, and ways to collaborate.
4) We value and believe in new ideas and encourage creative thinking.
5) We offer an open culture where you will work closely with the founding team and have the chance to influence the product design and execution.
6) And, of course, the thrill of being part of an early-stage startup, launching a product, and seeing it in the hands of the users.
Shift timings: 2 pm – 11 pm / 3 pm – 12 am
|Role & Responsibilities|
|Responsible for architecting and implementing very large-scale data intelligence solutions|
|Translate requirements for BI and Reporting to Database design and reporting design|
|Understanding & analysis of data transformation and translation requirements|
|Developing ETL pipelines in and out of the data warehouse using a combination of Python and Snowflake's SnowSQL and stored procedures|
|Writing SQL queries against Snowflake.|
|Developing scripts (Unix shell, Python, etc.) to extract, load, and transform data|
|Testing and clearly documenting implementations|
|Providing production support for data warehouse issues such as data load problems and transformation/translation problems|
- Own the product analytics of Bidgely’s end-user-facing products; measure and identify areas of improvement through data
- Liaise with Product Managers and Business Leaders to understand the product issues, priorities and hence support them through relevant product analytics
- Own the automation of product analytics through good SQL knowledge
- Develop early warning metrics for production and highlight issues and breakdowns for resolution
- Resolve client escalations and concerns regarding key business metrics
- Define and own execution
- Own the Energy Efficiency program designs, dashboard development, and monitoring of existing Energy Efficiency programs
- Deliver data-backed analysis and statistically proven solutions
- Research and implement best practices
- Mentor team of analysts
Qualifications and Education Requirements
- B.Tech from a premier institute with 5+ years analytics experience or Full-time MBA from a premier b-school with 3+ years of experience in analytics/business or product analytics
- Bachelor's degree in Business, Computer Science, Computer Information Systems, Engineering, Mathematics, or other business/analytical disciplines
Skills needed to excel
- Proven analytical and quantitative skills and an ability to use data and metrics to back up assumptions, develop business cases, and complete root cause
- Excellent understanding of retention, churn, and acquisition of user base
- Ability to employ statistics and anomaly detection techniques for data-driven
- Ability to put yourself in the shoes of the end customer and understand what
“product excellence” means
- Ability to rethink existing products and use analytics to identify new features and product improvements.
- Ability to rethink existing processes and design new processes for more effective analyses
- Strong SQL knowledge; working experience with Looker and Tableau is a great plus
- Strong commitment to quality visible in the thoroughness of analysis and techniques employed
- Strong project management and leadership skills
- Excellent communication (oral and written) and interpersonal skills and an ability to effectively communicate with both business and technical teams
- Ability to coach and mentor analysts on technical and analytical skills
- Good knowledge of statistics, basic machine learning, and AB Testing is
- Experience as a Growth hacker and/or in Product analytics is a big plus
Must Have Skills:
- Solid knowledge of DWH, ETL, and big data concepts
- Excellent SQL skills (with knowledge of SQL analytic functions)
- Working experience with any ETL tool, e.g. SSIS / Informatica
- Working experience with any Azure or AWS big data tools
- Experience implementing data jobs (batch / real-time streaming)
- Excellent written and verbal communication skills in English; self-motivated, with a strong sense of ownership and ready to learn new tools and technologies
- Experience with PySpark / Spark SQL
- AWS Data Tools (AWS Glue, AWS Athena)
- Azure Data Tools (Azure Databricks, Azure Data Factory)
- Knowledge of Azure Blob, Azure File Storage, AWS S3, Elasticsearch / Redis Search
- Knowledge of the domain/function (across pricing, promotions, and assortment)
- Implementation experience with schema and data validator frameworks (Python / Java / SQL)
- Knowledge of DQS and MDM
- Independently work on ETL / DWH / Big data Projects
- Gather and process raw data at scale.
- Design and develop data applications using selected tools and frameworks as required and requested.
- Read, extract, transform, stage and load data to selected tools and frameworks as required and requested.
- Perform tasks such as writing scripts, web scraping, calling APIs, writing SQL queries, etc.
- Work closely with the engineering team to integrate your work into our production systems.
- Process unstructured data into a form suitable for analysis.
- Analyse processed data.
- Support business decisions with ad hoc analysis as needed.
- Monitor data performance and modify infrastructure as needed.
Responsibility: Smart Resource, having excellent communication skills
- Understand current state architecture, including pain points.
- Create and document future state architectural options to address specific issues or initiatives using Machine Learning.
- Innovate and scale architectural best practices around building and operating ML workloads by collaborating with stakeholders across the organization.
- Develop CI/CD & ML pipelines that help to achieve end-to-end ML model development lifecycle from data preparation and feature engineering to model deployment and retraining.
- Provide recommendations around security, cost, performance, reliability, and operational efficiency and implement them
- Provide thought leadership around the use of industry standard tools and models (including commercially available models and tools) by leveraging experience and current industry trends.
- Collaborate with the Enterprise Architect, consulting partners and client IT team as warranted to establish and implement strategic initiatives.
- Make recommendations and assess proposals for optimization.
- Identify operational issues and recommend and implement strategies to resolve problems.
- 3+ years of experience in developing CI/CD & ML pipelines for end-to-end ML model/workloads development
- Strong knowledge in ML operations and DevOps workflows and tools such as Git, AWS CodeBuild & CodePipeline, Jenkins, AWS CloudFormation, and others
- Background in ML algorithm development, AI/ML Platforms, Deep Learning, ML Operations in the cloud environment.
- Strong programming skillset with high proficiency in Python, R, etc.
- Strong knowledge of AWS cloud and its technologies such as S3, Redshift, Athena, Glue, SageMaker etc.
- Working knowledge of databases, data warehouses, data preparation and integration tools, along with big data parallel processing layers such as Apache Spark or Hadoop
- Knowledge of pure and applied math, ML and DL frameworks, and ML techniques, such as random forest and neural networks
- Ability to collaborate with data scientists, data engineers, leaders, and other IT teams
- Ability to work with multiple projects and work streams at one time. Must be able to deliver results based upon project deadlines.
- Willing to flex daily work schedule to allow for time-zone differences for global team communications
- Strong interpersonal and communication skills
Role: Sr Data Scientist / Tech Lead – Data Science
Number of positions: 8
- Lead a team of data scientists, machine learning engineers and big data specialists
- Be the main point of contact for the customers
- Lead data mining and collection procedures
- Ensure data quality and integrity
- Interpret and analyze data problems
- Conceive, plan and prioritize data projects
- Build analytic systems and predictive models
- Test performance of data-driven products
- Visualize data and create reports
- Experiment with new models and techniques
- Align data projects with organizational goals
Requirements (please read carefully)
- Very strong in statistics fundamentals. Not all data is Big Data. The candidate should be able to derive statistical insights from very few data points if required, using traditional statistical methods.
- MSc Statistics / PhD Statistics
- Education – no bar, but preferably from a Statistics academic background (e.g. MSc Stats, MSc Econometrics, etc.), given the first point
- Strong expertise in Python (other statistical languages/tools like R, SAS, SPSS, etc. are optional, but Python is absolutely essential). A candidate who is very strong in Python but has almost no knowledge of the other statistical tools will still be considered a good fit for this role.
- Proven experience as a Data Scientist or similar role, for about 7-8 years
- Solid understanding of machine learning and AI concepts, especially with respect to choosing apt candidate algorithms for a use case, and model evaluation.
- Good expertise in writing SQL queries (should not be dependent upon anyone else for pulling in data, joining them, data wrangling etc)
- Knowledge of data management and visualization techniques, more from a data science perspective.
- Should be able to grasp business problems, ask the right questions to understand the problem in breadth and depth, design apt solutions, and explain them to the business stakeholders.
- Again, the last point above is extremely important: the candidate should be able to identify solutions that can be explained to stakeholders and, furthermore, present them in simple, direct language.