Data Engineering Lead
What is the role?
You will be responsible for building and maintaining highly scalable data infrastructure for our cloud-hosted SAAS product. You will work closely with the Product Managers and Technical team to define and implement data pipelines for customer-facing and internal reports.
Key Responsibilities
- Design and develop resilient data pipelines.
- Write efficient queries to fetch data from the report database.
- Work closely with application backend engineers on data requirements for their stories.
- Designing and developing report APIs for the front end to consume.
- Focus on building highly available, fault-tolerant report systems.
- Constantly improve the architecture of the application by clearing the technical backlog.
- Adopt a culture of learning and development to constantly keep pace with and adopt new technolgies.
What are we looking for?
An enthusiastic individual with the following skills. Please do not hesitate to apply if you do not match all of it. We are open to promising candidates who are passionate about their work and are team players.
- Education - BE/MCA or equivalent
- Overall 8+ years of experience
- Expert level understanding of database concepts and BI.
- Well verse in databases such as MySQL, MongoDB and hands on experience in creating data models.
- Must have designed and implemented low latency data warehouse systems.
- Must have strong understanding of Kafka and related systems.
- Experience in clickhouse database preferred.
- Must have good knowledge of APIs and should be able to build interfaces for frontend engineers.
- Should be innovative and communicative in approach
- Will be responsible for functional/technical track of a project
Whom will you work with?
You will work with a top-notch tech team, working closely with the CTO and product team.
What can you look for?
A wholesome opportunity in a fast-paced environment that will enable you to juggle between concepts, yet maintain the quality on content, interact and share your ideas and have loads of learning while at work. Work with a team of highly talented young professionals and enjoy the benefits.
We are
A fast-growing SaaS commerce company based in Bangalore with offices in Delhi, Mumbai, SF, Dubai, Singapore and Dublin. We have three products in our portfolio: Plum, Empuls and Compass. Works with over 1000 global clients. We help our clients in engaging and motivating their employees, sales teams, channel partners or consumers for better business results.
About A fast-growing SaaS commerce company permanent WFH & Office
Similar jobs
We are looking out for a technically driven "ML OPS Engineer" for one of our premium client
COMPANY DESCRIPTION:
Key Skills
• Excellent hands-on expert knowledge of cloud platform infrastructure and administration
(Azure/AWS/GCP) with strong knowledge of cloud services integration, and cloud security
• Expertise setting up CI/CD processes, building and maintaining secure DevOps pipelines with at
least 2 major DevOps stacks (e.g., Azure DevOps, Gitlab, Argo)
• Experience with modern development methods and tooling: Containers (e.g., docker) and
container orchestration (K8s), CI/CD tools (e.g., Circle CI, Jenkins, GitHub actions, Azure
DevOps), version control (Git, GitHub, GitLab), orchestration/DAGs tools (e.g., Argo, Airflow,
Kubeflow)
• Hands-on coding skills Python 3 (e.g., API including automated testing frameworks and libraries
(e.g., pytest) and Infrastructure as Code (e.g., Terraform) and Kubernetes artifacts (e.g.,
deployments, operators, helm charts)
• Experience setting up at least one contemporary MLOps tooling (e.g., experiment tracking,
model governance, packaging, deployment, feature store)
• Practical knowledge delivering and maintaining production software such as APIs and cloud
infrastructure
• Knowledge of SQL (intermediate level or more preferred) and familiarity working with at least
one common RDBMS (MySQL, Postgres, SQL Server, Oracle)
- Sound knowledge of Mongo as a primary skill
- . Should have hands on experience of MySQL as a secondary skill will be enough
- . Experience with replication , sharding and scaling.
- . Design, install, maintain highly available systems (includes monitoring, security, backup, and performance tuning)
- . Implement secure database and server installations (privilege access methodology / role based access)
- . Help application team in query writing, performance tuning & other D2D issues
- • Deploy automation techniques for d2d operations
- . Must possess good analytical and problem solving skills
- . Must be willing to work flexible hours as needed
- . Scripting experience a plus
- . Ability to work independently and as a member of a team
- . good verbal and written communication skills
- Data Engineer
Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting language - Python & pyspark
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Data ingestion from different data sources which exposes data using different technologies, such as: RDBMS, REST HTTP API, flat files, Streams, and Time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality check to make sure right data enters the platform and verifying the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge one of language: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with
- Data mining/programming tools (e.g. SAS, SQL, R, Python)
- Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
- Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence other related tools
Cloudera Data Warehouse Hive team looking for a passionate senior developer to join our growing engineering team. This group is targeting the biggest enterprises wanting to utilize Cloudera’s services in a private and public cloud environment. Our product is built on open source technologies like Hive, Impala, Hadoop, Kudu, Spark and so many more providing unlimited learning opportunities.A Day in the LifeOver the past 10+ years, Cloudera has experienced tremendous growth making us the leading contributor to Big Data platforms and ecosystems and a leading provider for enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.
You will manage product development for our CDP components, develop engineering tools and scalable services to enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.Opportunity:Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the Open Source world. The candidate will responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects upstream and downstream. If you are curious about the project and code quality you can check the project and the code at the following link. You can start the development before you join. This is one of the beauties of the OSS world.Apache Hive
Responsibilities:
•Build robust and scalable data infrastructure software
•Design and create services and system architecture for your projects
•Improve code quality through writing unit tests, automation, and code reviews
•The candidate would write Java code and/or build several services in the Cloudera Data Warehouse.
•Worked with a team of engineers who reviewed each other's code/designs and held each other to an extremely high bar for the quality of code/designs
•The candidate has to understand the basics of Kubernetes.
•Build out the production and test infrastructure.
•Develop automation frameworks to reproduce issues and prevent regressions.
•Work closely with other developers providing services to our system.
•Help to analyze and to understand how customers use the product and improve it where necessary.
Qualifications:
•Deep familiarity with Java programming language.
•Hands-on experience with distributed systems.
•Knowledge of database concepts, RDBMS internals.
•Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus.
•Has experience working in a distributed team.
•Has 3+ years of experience in software development.
Job Title: Oracle PL/SQL Developer
Qualification: (B.E./B.Tech/ Masters in Computer or IT)
Years of Experience: 3 – 7 Years
No. of Open Positions – 3
Job Location: Jaipur
- Proven hands-on Database Development experience
- Develop, design, test and implement complex database programs
- Strong experience with oracle functions, procedures, triggers, packages & performance tuning,
- Ensure that database programs are in compliance with V3 standards.
- Hands-on development using Oracle PL/SQL.
- Performance tune SQL's, application programs and instances.
- Evaluation of new and upcoming technologies.
- Providing technical assistance, problem resolution and troubleshooting support.
- Designing, building, and automating the MongoDB Architecture for open-source MongoDB.
- Good understanding of DB schema design, performance, tuning and capacity planning.
- The ideal candidate has worked with modern open-source MongoDB platforms cloud deployment models and test-driven development in a fast-paced agile environment.
- In depth understanding of data management e g permissions recovery security and monitoring Operational experience with MongoDB.
- Data Modelling Operational experience with Indexes.
- Good understanding of MongoDB replica set Op log and journals.
- Provide advice and support to other development resources interacting with MongoDB.
- Troubleshoot any problems that may come up with the database environments.
- Skilled in performance tuning and optimization using native monitoring and troubleshooting tools.
- Provide guidance in the creation and modification of standards and procedures.
- Experience working with cloud database services a plus.
- Experience working in an Agile Scrum environment.
- Experience working in Aggregation in MongoDB.
- Strong communication documentation skills and technology awareness.
In 2018-19, the mobile games market in India generated over $600 million in revenues. With close to 450 people in its Mumbai and Bangalore offices, Games24x7 is India’s largest mobile games business today and is very well positioned to become the 800-pound gorilla of what will be a $2 billion market by 2022. While Games24x7 continues to invest aggressively in its India centric mobile games, it is also diversifying its business by investing in international gaming and other tech opportunities.
Summary of Role
Position/Role Description :
The candidate will be part of a team managing databases (MySQL, MongoDB, Cassandra) and will be involved in designing, configuring and maintaining databases.
Job Responsibilities:
• Complete involvement in the database requirement starting from the design phase for every project.
• Deploying required database assets on production (DDL, DML)
• Good understanding of MySQL Replication (Master-slave, Master-Master, GTID-based)
• Understanding of MySQL partitioning.
• A better understanding of MySQL logs and Configuration.
• Ways to schedule backup and restoration.
• Good understanding of MySQL versions and their features.
• Good understanding of InnoDB-Engine.
• Exploring ways to optimize the current environment and also lay a good platform for new projects.
• Able to understand and resolve any database related production outages.
Job Requirements:
• BE/B.Tech from a reputed institute
• Experience in python scripting.
• Experience in shell scripting.
• General understanding of system hardware.
• Experience in MySQL is a must.
• Experience in MongoDB, Cassandra, Graph db will be preferred.
• Experience with Pecona MySQL tools.
• 6 - 8 years of experience.
Job Location: Bengaluru
- Key responsibility is to design and develop a data pipeline including the architecture, prototyping, and development of data extraction, transformation/processing, cleansing/standardizing, and loading in Data Warehouse at real-time/near the real-time frequency. Source data can be structured, semi-structured, and/or unstructured format.
- Provide technical expertise to design efficient data ingestion solutions to consolidate data from RDBMS, APIs, Messaging queues, weblogs, images, audios, documents, etc of Enterprise Applications, SAAS applications, external 3rd party sites or APIs, etc through ETL/ELT, API integrations, Change Data Capture, Robotic Process Automation, Custom Python/Java Coding, etc
- Development of complex data transformation using Talend (BigData edition), Python/Java transformation in Talend, SQL/Python/Java UDXs, AWS S3, etc to load in OLAP Data Warehouse in Structured/Semi-structured form
- Development of data model and creating transformation logic to populate models for faster data consumption with simple SQL.
- Implementing automated Audit & Quality assurance checks in Data Pipeline
- Document & maintain data lineage to enable data governance
- Coordination with BIU, IT, and other stakeholders to provide best-in-class data pipeline solutions, exposing data via APIs, loading in down streams, No-SQL Databases, etc
Requirements
- Programming experience using Python / Java, to create functions / UDX
- Extensive technical experience with SQL on RDBMS (Oracle/MySQL/Postgresql etc) including code optimization techniques
- Strong ETL/ELT skillset using Talend BigData Edition. Experience in Talend CDC & MDM functionality will be an advantage.
- Experience & expertise in implementing complex data pipelines, including semi-structured & unstructured data processing
- Expertise to design efficient data ingestion solutions to consolidate data from RDBMS, APIs, Messaging queues, weblogs, images, audios, documents, etc of Enterprise Applications, SAAS applications, external 3rd party sites or APIs, etc through ETL/ELT, API integrations, Change Data Capture, Robotic Process Automation, Custom Python/Java Coding, etc
- Good understanding & working experience in OLAP Data Warehousing solutions (Redshift, Synapse, Snowflake, Teradata, Vertica, etc) and cloud-native Data Lake (S3, ADLS, BigQuery, etc) solutions
- Familiarity with AWS tool stack for Storage & Processing. Able to recommend the right tools/solutions available to address a technical problem
- Good knowledge of database performance and tuning, troubleshooting, query optimization, and tuning
- Good analytical skills with the ability to synthesize data to design and deliver meaningful information
- Good knowledge of Design, Development & Performance tuning of 3NF/Flat/Hybrid Data Model
- Know-how on any No-SQL DB (DynamoDB, MongoDB, CosmosDB, etc) will be an advantage.
- Ability to understand business functionality, processes, and flows
- Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently
Functional knowledge
- Data Governance & Quality Assurance
- Distributed computing
- Linux
- Data structures and algorithm
- Unstructured Data Processing
Glance – An InMobi Group Company:
Glance is an AI-first Screen Zero content discovery platform, and it’s scaled massively in the last few months to one of the largest platforms in India. Glance is a lock-screen first mobile content platform set up within InMobi. The average mobile phone user unlocks their phone >150 times a day. Glance aims to be there, providing visually rich, easy to consume content to entertain and inform mobile users - one unlock at a time. Glance is live on more than 80 millions of mobile phones in India already, and we are only getting started on this journey! We are now into phase 2 of the Glance story - we are going global!
Roposo is part of the Glance family. It is a short video entertainment platform. All the videos created here are user generated (via upload or Roposo creation tools in camera) and there are many communities creating these videos on various themes we call channels. Around 4 million videos are created every month on Roposo and power Roposo channels, some of the channels are - HaHa TV (for comedy videos), News, Beats (for singing/ dance performances) along with a For You (personalized for a user) and Your Feed (for videos of people a user follows).
What’s the Glance family like?
Consistently featured among the “Great Places to Work” in India since 2017, our culture is our true north, enabling us to think big, solve complex challenges and grow with new opportunities. Glanciers are passionate and driven, creative and fun-loving, take ownership and are results-focused. We invite you to free yourself, dream big and chase your passion.
What can we promise?
We offer an opportunity to have an immediate impact on the company and our products. The work that you shall do will be mission critical for Glance and will be critical for optimizing tech operations, working with highly capable and ambitious peer groups. At Glance, you get food for your body, soul, and mind with daily meals, gym, and yoga classes, cutting-edge training and tools, cocktails at drink cart Thursdays and fun at work on Funky Fridays. We even promise to let you bring your kids and pets to work.
What you will be doing?
Glance is looking for a Data Scientist who will design and develop processes and systems to analyze high volume, diverse "big data" sources using advanced mathematical, statistical, querying, and reporting methods. Will use machine learning techniques and statistical analysis to predict outcomes and behaviors. Interacts with business partners to identify questions for data analysis and experiments. Identifies meaningful insights from large data and metadata sources; interprets and communicates insights and or prepares output from analysis and experiments to business partners.
You will be working with Product leadership, taking high-level objectives and developing solutions that fulfil these requirements. Stakeholder management across Eng, Product and Business teams will be required.
Basic Qualifications:
- Five+ years experience working in a Data Science role
- Extensive experience developing and deploying ML models in real world environments
- Bachelor's degree in Computer Science, Mathematics, Statistics, or other analytical fields
- Exceptional familiarity with Python, Java, Spark or other open-source software with data science libraries
- Experience in advanced math and statistics
- Excellent familiarity with command line linux environment
- Able to understand various data structures and common methods in data transformation
- Experience deploying machine learning models and measuring their impact
- Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks.
Preferred Qualifications
- Experience developing recommendation systems
- Experience developing and deploying deep learning models
- Bachelor’s or Master's Degree or PhD that included coursework in statistics, machine learning or data analysis
- Five+ years experience working with Hadoop, a NoSQL Database or other big data infrastructure
- Experience with being actively engaged in data science or other research-oriented position
- You would be comfortable collaborating with cross-functional teams.
- Active personal GitHub account.