BRIEF DESCRIPTION:
At least 1 year of Python, Spark, SQL, and data engineering experience
Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, Redshift/Snowflake
Relevant Experience: Migration of legacy ETL jobs to AWS Glue using Python & Spark
ROLE SCOPE:
Reverse-engineer the existing/legacy ETL jobs
Create workflow diagrams and review the logic diagrams with Tech Leads
Write equivalent logic in Python & Spark (a minimal sketch follows this list)
Unit test the Glue jobs and certify the data loads before passing them to system testing
Follow best practices and enable appropriate audit and control mechanisms
Apply strong analytical skills to identify root causes quickly and debug issues efficiently
Take ownership of the deliverables and support the deployments
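As a minimal illustration of the migration pattern above, the sketch below follows the standard AWS Glue PySpark job skeleton; the S3 paths, column names, and filter logic are hypothetical placeholders for the real legacy logic.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Standard Glue job boilerplate: resolve arguments and initialize contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Hypothetical source path; replace with the job's real location.
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Re-implemented legacy logic (placeholder): deduplicate, keep completed orders.
cleaned = (
    orders.dropDuplicates(["order_id"])
          .filter(F.col("status") == "COMPLETED")
)

cleaned.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")
job.commit()
```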
REQUIREMENTS:
Create data pipelines for data integration into Cloud stacks, e.g. Azure Synapse
Code data processing jobs in Azure Synapse Analytics, Python, and Spark
Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.
Should be able to process .json, .parquet, and .avro files (see the snippet below)
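For the file-format requirement above, here is a minimal PySpark sketch reading each format; the paths are hypothetical, and Avro support requires the external spark-avro package.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-demo").getOrCreate()

# Hypothetical input paths for each supported format.
json_df    = spark.read.json("s3://example-bucket/landing/events.json")
parquet_df = spark.read.parquet("s3://example-bucket/landing/events.parquet")

# Avro needs the external spark-avro package on the classpath
# (e.g. --packages org.apache.spark:spark-avro_2.12:3.5.0).
avro_df = spark.read.format("avro").load("s3://example-bucket/landing/events.avro")
```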
PREFERRED BACKGROUND:
Tier 1/Tier 2 candidates from IITs/NITs/IIITs
However, relevant experience and a learning attitude take precedence
Job Responsibilities:
1. Develop and debug applications using Python.
2. Improve code quality and code coverage for existing and new programs.
3. Deploy and integrate machine learning models (a minimal serving sketch follows this list).
4. Test and validate the deployments.
5. Support the MLOps function.
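To make item 3 concrete, here is a minimal model-serving sketch. The posting does not name a stack, so FastAPI and a joblib-serialized scikit-learn model are assumptions:

```python
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Hypothetical artifact path; any joblib-serialized model would do.
model = joblib.load("model.joblib")

class Features(BaseModel):
    values: list[float]

@app.post("/predict")
def predict(features: Features):
    # Wrap the single observation in a batch of one for the model.
    prediction = model.predict([features.values])
    return {"prediction": prediction.tolist()}
```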
Technical Skills
1. Graduate in Engineering or Technology with strong academic credentials
2. 4 to 8 years of experience as a Python developer.
3. Excellent understanding of SDLC processes
4. Strong knowledge of unit testing and code-quality improvement (see the pytest sketch after this list)
5. Cloud-based deployment and integration of applications/microservices.
6. Experience with NoSQL databases, such as MongoDB, Cassandra
7. Strong applied statistics skills
8. Knowledge of creating CI/CD pipelines and touchless deployment.
9. Knowledge of APIs and data engineering techniques.
10. Experience with AWS.
11. Knowledge of machine learning and Large Language Models.
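As a small illustration of item 4, a minimal pytest example; the function under test is hypothetical:

```python
import pytest

# Hypothetical function under test.
def normalize(values: list[float]) -> list[float]:
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [v / total for v in values]

def test_normalize_sums_to_one():
    result = normalize([2.0, 2.0, 4.0])
    assert sum(result) == pytest.approx(1.0)

def test_normalize_rejects_zero_vector():
    with pytest.raises(ValueError):
        normalize([0.0, 0.0])
```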
Nice to Have
1. Exposure to financial research domain
2. Experience with JIRA, Confluence
3. Understanding of scrum and Agile methodologies
4. Experience with data visualization tools, such as Grafana, ggplot, etc.
Objective
The Data Engineer will be responsible for expanding and optimizing our data and database architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives, and will ensure the optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.
Roles and Responsibilities:
- Comfortable building and optimizing performant data pipelines, including data ingestion, data cleansing, and curation into a data warehouse, database, or any other data platform, using Dask/Spark.
- Experience in distributed computing environments and Spark/Dask architecture.
- Optimize performance for data access requirements by choosing the appropriate file formats (Avro, Parquet, ORC, etc.) and compression codecs (see the sketch after this list).
- Experience writing production-ready, tested code in Python, and participating in code reviews to maintain and improve code quality, stability, and supportability.
- Experience in designing data warehouse/data mart.
- Experience with any RDBMS, preferably SQL Server; must be able to write complex SQL queries.
- Expertise in requirement gathering, technical design and functional documents.
- Experience in Agile/Scrum practices.
- Experience in leading other developers and guiding them technically.
- Experience in deploying data pipelines using automated CI/CD approach.
- Ability to write modularized reusable code components.
- Proficient in identifying data issues and anomalies during analysis.
- Strong analytical and logical skills.
- Must be able to comfortably tackle new challenges and learn.
- Must have strong verbal and written communication skills.
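Illustrating the file-format and codec point in the list above, a short PySpark sketch writing the same DataFrame as snappy-compressed Parquet and zlib-compressed ORC (paths are placeholders; the right choice depends on the access pattern):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-choice").getOrCreate()
df = spark.read.json("s3://example-bucket/landing/events.json")

# Columnar Parquet with snappy: a common default for analytical scans.
df.write.option("compression", "snappy").parquet("s3://example-bucket/out/parquet/")

# ORC with zlib: higher compression ratio at some CPU cost.
df.write.option("compression", "zlib").orc("s3://example-bucket/out/orc/")
```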
Required skills:
- Knowledge of GCP
- Expertise in Google BigQuery
- Expertise in Airflow
- Good hands-on SQL skills
- Data warehousing concepts
- 3+ years of experience applying AI/ML/NLP/deep learning and data-driven statistical analysis & modelling solutions.
- Programming skills in Python and knowledge of statistics.
- Hands-on experience developing supervised and unsupervised machine learning algorithms (regression, decision trees/random forest, neural networks, feature selection/reduction, clustering, parameter tuning, etc.). Familiarity with reinforcement learning is highly desirable.
- Experience in the financial domain and familiarity with financial models are highly desirable.
- Experience in image processing and computer vision.
- Experience working with building data pipelines.
- Good understanding of Data preparation, Model planning, Model training, Model validation, Model deployment and performance tuning.
- Should have hands-on experience with some of these methods: Regression, Decision Trees, CART, Random Forest, Boosting, Evolutionary Programming, Neural Networks, Support Vector Machines, Ensemble Methods, Association Rules, Principal Component Analysis, Clustering, Artificial Intelligence (a short supervised-learning sketch follows this list)
- Should have experience working with large data sets in a Postgres database.
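As a small sketch of the supervised-learning methods listed above, a scikit-learn random forest fit on synthetic placeholder data (hyperparameters are illustrative, not tuned):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic placeholder data standing in for a real feature matrix.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Random forest with illustrative (not tuned) hyperparameters.
model = RandomForestClassifier(n_estimators=200, max_depth=8, random_state=42)
model.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```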
Senior Data Scientist - Job Description
The Senior Data Scientist is a creative problem solver who utilizes statistical/mathematical principles and modelling skills to uncover new insights that will significantly and meaningfully impact business decisions and actions. She/he applies data science expertise to identifying, defining, and executing state-of-the-art techniques for academic opportunities and business objectives in collaboration with other Analytics team members. The Senior Data Scientist will execute analyses & outputs spanning test design and measurement, predictive analytics, multivariate analysis, data/text mining, pattern recognition, artificial intelligence, and machine learning.
Key Responsibilities:
- Perform the full range of data science activities, including test design and measurement, predictive/advanced analytics, data mining, and analytic dashboards.
- Extract, manipulate, analyse & interpret data from various corporate data sources developing advanced analytic solutions, deriving key observations, findings, insights, and formulating actionable recommendations.
- Generate clearly understood and intuitive data science / advanced analytics outputs.
- Provide thought leadership and recommendations on business process improvements and analytic solutions to complex problems.
- Participate in best-practice sharing and communication platforms to advance the data science discipline.
- Coach and collaborate with other data scientists and data analysts.
- Present impact, insights, outcomes & recommendations to key business partners and stakeholders.
- Comply with established Service Level Agreements to ensure timely, high quality deliverables with value-add recommendations, clearly articulated key findings and observations.
Qualification:
- Bachelor's degree (B.A./B.S.) or Master's degree (M.A./M.S.) in Computer Science, Statistics, Mathematics, Machine Learning, Physics, or a similar field
- 5+ years of experience in data science in a digitally advanced industry focusing on strategic initiatives, marketing and/or operations.
- Advanced knowledge of best-in-class analytic software tools and languages: Python, SQL, R, SAS, Tableau, Excel, PowerPoint.
- Expertise in statistical methods, statistical analysis, data visualization, and data mining techniques.
- Experience in test design, Design of Experiments, A/B testing, and measurement science (see the sketch after this list)
- Strong influencing skills to drive a robust testing agenda and data-driven decision making for process improvements
- Strong critical-thinking skills to track down complex data and engineering issues, evaluate different algorithmic approaches, and analyse data to solve problems.
- Experience in partnering with IT, marketing operations & business operations to deploy predictive analytic solutions.
- Ability to translate/communicate complex analytical/statistical/mathematical concepts with non-technical audience.
- Strong written and verbal communications skills, as well as presentation skills.
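To ground the A/B testing qualification, a minimal two-proportion z-test in Python; the conversion counts are invented for illustration:

```python
from math import sqrt
from scipy.stats import norm

# Invented example counts: conversions / visitors per variant.
conv_a, n_a = 420, 10_000
conv_b, n_b = 480, 10_000

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)

# Standard error under the pooled null hypothesis of equal rates.
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se

# Two-sided p-value.
p_value = 2 * norm.sf(abs(z))
print(f"z = {z:.3f}, p = {p_value:.4f}")
```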
In 2018-19, the mobile games market in India generated over $600 million in revenues. With close to 450 people in its Mumbai and Bangalore offices, Games24x7 is India’s largest mobile games business today and is very well positioned to become the 800-pound gorilla of what will be a $2 billion market by 2022. While Games24x7 continues to invest aggressively in its India centric mobile games, it is also diversifying its business by investing in international gaming and other tech opportunities.
Summary of Role
Position/Role Description :
The candidate will be part of a team managing databases (MySQL, MongoDB, Cassandra) and will be involved in designing, configuring and maintaining databases.
Job Responsibilities:
• Complete involvement in database requirements, starting from the design phase of every project.
• Deploying required database assets to production (DDL, DML).
• Good understanding of MySQL replication (master-slave, master-master, GTID-based); see the sketch after this list.
• Understanding of MySQL partitioning.
• A good understanding of MySQL logs and configuration.
• Knowledge of ways to schedule backups and restoration.
• Good understanding of MySQL versions and their features.
• Good understanding of InnoDB-Engine.
• Exploring ways to optimize the current environment and also lay a good platform for new projects.
• Able to understand and resolve any database-related production outages.
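A small sketch of the replication-monitoring side of the role, assuming the PyMySQL driver (not mandated by the posting) and placeholder connection details:

```python
import pymysql

# Placeholder connection details for a replica host.
conn = pymysql.connect(host="replica.example.internal",
                       user="monitor", password="secret")

try:
    with conn.cursor(pymysql.cursors.DictCursor) as cursor:
        # SHOW REPLICA STATUS on MySQL 8.0.22+; older versions use SHOW SLAVE STATUS.
        cursor.execute("SHOW REPLICA STATUS")
        status = cursor.fetchone()
        if status is None:
            print("this server is not a replica")
        else:
            print("IO running:", status["Replica_IO_Running"])
            print("SQL running:", status["Replica_SQL_Running"])
            print("lag (s):", status["Seconds_Behind_Source"])
finally:
    conn.close()
```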
Job Requirements:
• BE/B.Tech from a reputed institute
• Experience in Python scripting.
• Experience in shell scripting.
• General understanding of system hardware.
• Experience in MySQL is a must.
• Experience in MongoDB, Cassandra, or graph databases is preferred.
• Experience with Percona MySQL tools.
• 6 - 8 years of experience.
Job Location: Bengaluru
Job description
Role: Lead Architect (Spark, Scala, Big Data/Hadoop, Java)
Primary Location: India - Pune, Hyderabad
Experience: 7-12 years
Management Level: 7
Joining Time: Immediate Joiners are preferred
- Attend requirements gathering workshops, estimation discussions, design meetings and status review meetings
- Experience in solution design and solution architecture for data engineering, building and implementing Big Data projects on-premises and in the cloud.
- Align the architecture with business requirements and stabilize the developed solution
- Ability to build prototypes to demonstrate the technical feasibility of your vision
- Professional experience facilitating and leading solution design, architecture and delivery planning activities for data intensive and high throughput platforms and applications
- Able to benchmark systems, analyse system bottlenecks, and propose solutions to eliminate them
- Able to help programmers and project managers in the design, planning and governance of implementing projects of any kind.
- Develop, construct, test and maintain architectures and run Sprints for development and rollout of functionalities
- Data analysis and code development experience, ideally in Big Data technologies: Spark, Hive, Hadoop, Java, Python, PySpark (see the sketch after this list)
- Execute projects of various types, i.e. design, development, implementation, and migration of functional analytics models/business logic across architecture approaches
- Work closely with Business Analysts to understand the core business problems and deliver efficient IT solutions of the product
- Deploy sophisticated analytics programs using any cloud platform.
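A minimal PySpark-with-Hive sketch in the spirit of the stack above; the database and table names are invented:

```python
from pyspark.sql import SparkSession

# Hive support lets Spark read tables registered in the Hive metastore.
spark = (
    SparkSession.builder
    .appName("hive-demo")
    .enableHiveSupport()
    .getOrCreate()
)

# Invented table name; any Hive-managed table would work the same way.
daily_totals = spark.sql("""
    SELECT order_date, SUM(amount) AS total_amount
    FROM sales.orders
    GROUP BY order_date
""")

daily_totals.show(10)
```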
Perks and Benefits We Provide!
- Working with Highly Technical and Passionate, mission-driven people
- Subsidized Meals & Snacks
- Flexible Schedule
- Approachable leadership
- Access to various learning tools and programs
- Pet Friendly
- Certification Reimbursement Policy
- Check out more about us on our website below!
www.datametica.com
- Total experience of 7-10 years; should be interested in teaching and research
- 3+ years’ experience in data engineering which includes data ingestion, preparation, provisioning, automated testing, and quality checks.
- 3+ years of hands-on experience in Big Data cloud platforms like AWS and GCP, data lakes, and data warehouses
- 3+ years with Big Data and analytics technologies. Experience in SQL and in writing code on the Spark engine using Python, Scala, or Java. Experience in Spark and Scala
- Experience in designing, building, and maintaining ETL systems
- Experience in data pipeline and workflow management tools like Airflow (a minimal DAG sketch follows this list)
- Application development background, along with knowledge of analytics libraries, open-source Natural Language Processing, and statistical and big data computing libraries
- Familiarity with visualization and reporting tools like Tableau and Kibana.
- Should be good at storytelling in technology
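For the workflow-management item, a minimal Airflow DAG sketch; it assumes Airflow 2.4+ for the schedule argument, and the task callable is a placeholder:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder task logic standing in for a real ingestion step.
def ingest():
    print("ingesting source data...")

with DAG(
    dag_id="example_ingestion",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
```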
Qualification: B.Tech/BE/M.Sc/MBA/B.Sc. Certifications in Big Data technologies and cloud platforms like AWS, Azure, and GCP are preferred
Primary Skills: Big Data + Python + Spark + Hive + Cloud Computing
Secondary Skills: NoSQL+ SQL + ETL + Scala + Tableau
Selection Process: 1 Hackathon, 1 Technical round and 1 HR round
Benefit: Free-of-cost training on Data Science from top-notch professors
Who we look for
- Strong technical expertise and building ability
- Ability to envisage how your technical knowledge could be applied outside of academia - with a focus on impact/disruption
- Ability to explain and articulate complex ideas simply
- Ability to digest difficult questions/information
Job Description
The role requires experience in AWS as well as programming experience in Python and Spark
Roles & Responsibilities
You Will:
- Translate functional requirements into technical design
- Interact with clients and internal stakeholders to understand the data and platform requirements in detail and determine core cloud services needed to fulfil the technical design
- Design, Develop and Deliver data integration interfaces in AWS
- Design, Develop and Deliver data provisioning interfaces to fulfil consumption needs
- Deliver data models on a Cloud platform, e.g. AWS Redshift or SQL
- Design, Develop and Deliver data integration interfaces at scale using Python / Spark (see the sketch after this list)
- Automate core activities to minimize the delivery lead times and improve the overall quality
- Optimize platform cost by selecting right platform services and architecting the solution in a cost-effective manner
- Manage code and deploy using DevOps and CI/CD processes
- Deploy logging and monitoring across the different integration points for critical alerts
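A minimal Python/Spark data-integration sketch along the lines above, reading raw CSV from S3 and writing partitioned Parquet back; bucket names and columns are invented:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3-integration").getOrCreate()

# Invented source bucket; assumes S3 credentials are configured for the cluster.
raw = spark.read.option("header", True).csv("s3://example-raw/transactions/")

# Light standardization before provisioning downstream.
curated = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("txn_date", F.to_date("txn_date"))
)

# Partitioned Parquet for efficient downstream consumption.
(curated.write
        .mode("overwrite")
        .partitionBy("txn_date")
        .parquet("s3://example-curated/transactions/"))
```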
You Have:
- Minimum 5 years of software development experience
- Bachelor's and/or Master's degree in Computer Science
- Strong Consulting skills in data management including data governance, data quality, security, data integration, processing and provisioning
- Delivered data management projects on AWS
- Translated complex analytical requirements into technical design including data models, ETLs and Dashboards / Reports
- Experience deploying dashboards and self-service analytics solutions on both relational and non-relational databases
- Experience with different computing paradigms in databases such as In-Memory, Distributed, Massively Parallel Processing
- Successfully delivered large scale data management initiatives covering Plan, Design, Build and Deploy phases leveraging different delivery methodologies including Agile
- Strong knowledge of continuous integration, static code analysis and test-driven development
- Experience in delivering projects in a highly collaborative delivery model with teams at onsite and offshore
- Must have excellent analytical and problem-solving skills
- Delivered change management initiatives focused on driving data platforms adoption across the enterprise
- Strong verbal and written communications skills are a must, as well as the ability to work effectively across internal and external organizations