About the Role
We are looking for a Data Engineer to help us scale our existing data infrastructure and, in parallel, build the next-generation data platform for analytics at scale, machine learning infrastructure, and data validation systems. In this role, you will be responsible for communicating effectively with data consumers to fine-tune data platform systems (existing or new), taking ownership of and delivering high-performing systems and data pipelines, and helping the team scale them to handle ever-growing traffic. This is a growing team, which creates many opportunities to work directly with the product management, development, sales, and support teams. Everybody on the team is passionate about their work, and we're looking for similarly motivated "get stuff done" people to join us!

Roles & Responsibilities
- Engineer data pipelines (batch and real-time) that aid in the creation of data-driven products for our platform
- Design, develop, and maintain a robust and scalable data warehouse and data lake
- Work closely with product managers and data scientists to bring the various datasets together and cater to our business intelligence and analytics use cases
- Design and develop solutions using data science techniques ranging from statistics and algorithms to machine learning
- Perform hands-on DevOps work to keep the data platform secure and reliable

Skills Required
- Bachelor's degree in Computer Science, Information Systems, or a related engineering discipline
- 6+ years' experience with ETL, data mining, data modeling, and working with large-scale datasets
- 6+ years' experience with an object-oriented programming language such as Python, Scala, or Java
- Extremely proficient in writing performant SQL against large data volumes
- Experience with MapReduce, Spark, Kafka, Presto, and the surrounding ecosystem
- Experience building automated analytical systems utilizing large data sets
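To make the batch side of "engineer data pipelines" concrete, here is a minimal extract-transform-load sketch. The source records, table names, and SQLite warehouse stand-in are all invented for illustration; a pipeline at the scale this role describes would use Spark or Kafka rather than in-process Python.

```python
import sqlite3

def extract():
    # Stand-in for reading from an upstream source (files, a queue, an API).
    return [
        {"user_id": 1, "amount": "10.50", "status": "ok"},
        {"user_id": 2, "amount": "3.25", "status": "error"},
        {"user_id": 1, "amount": "7.00", "status": "ok"},
    ]

def transform(rows):
    # Drop bad records and cast string amounts to floats before loading.
    return [(r["user_id"], float(r["amount"]))
            for r in rows if r["status"] == "ok"]

def load(conn, rows):
    # Hypothetical fact table in a SQLite database standing in for the warehouse.
    conn.execute("CREATE TABLE IF NOT EXISTS fact_sales (user_id INT, amount REAL)")
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
total = conn.execute("SELECT SUM(amount) FROM fact_sales").fetchone()[0]
print(total)  # 17.5
```

The same extract/transform/load separation carries over directly to Spark jobs, where each stage becomes a DataFrame transformation.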
- Experience designing, scaling, and optimizing cloud-based data warehouses (like AWS Redshift) and data lakes
- Familiarity with AWS technologies preferred
Qualification – B.Tech/M.Tech/MCA (IT/Computer Science)
Years of Exp – 6-9
Key Result Areas
· Create and maintain optimal data pipelines
· Assemble large, complex data sets that meet functional and non-functional business requirements
· Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
· Keep our data separated and secure
· Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader
· Build analytics tools that utilize the data pipeline to provide actionable insights into key business performance metrics
· Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs
· Work with data and analytics experts to strive for greater functionality in our data systems

Knowledge, Skills and Experience
Core Skills: We are looking for a candidate with 5+ years of experience in a Data Engineer role, with experience using the following software/tools:
· Experience developing Big Data applications using Spark, Hive, Sqoop, Kafka, and MapReduce
· Experience with stream-processing systems: Spark Streaming, Storm, etc.
· Experience with object-oriented/functional scripting languages: Python, Scala, etc.
· Experience designing and building dimensional data models to improve the accessibility, efficiency, and quality of data
· Proficient in writing advanced SQL, with expertise in SQL performance tuning
· Experience with data science and machine learning tools and technologies is a plus
· Experience with relational SQL and NoSQL databases, including Postgres and Cassandra
· Experience with Azure cloud services is a plus
· Financial services knowledge is a plus
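The "dimensional data models" requirement above refers to star schemas: a central fact table joined to dimension tables. A toy sketch, with invented table and column names and SQLite standing in for the warehouse:

```python
import sqlite3

# Minimal star schema: one fact table keyed to one dimension table.
# All names here are illustrative, not from any real system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INT PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales  (product_id INT, qty INT, revenue REAL);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO fact_sales  VALUES (1, 2, 20.0), (2, 1, 60.0), (1, 1, 10.0);
""")

# The typical BI query shape: aggregate facts, grouped by a dimension attribute.
rows = conn.execute("""
    SELECT d.category, SUM(f.revenue) AS revenue
    FROM fact_sales f JOIN dim_product d USING (product_id)
    GROUP BY d.category ORDER BY d.category
""").fetchall()
print(rows)  # [('books', 30.0), ('games', 60.0)]
```

Keeping descriptive attributes in dimensions and measures in narrow fact tables is what makes such schemas accessible and efficient for analysts.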
Main responsibilities:
+ Management of a growing technical team
+ Continued technical architecture design based on the product roadmap
+ Annual performance reviews
+ Work with DevOps to design and implement the product infrastructure

Strategic:
+ Testing strategy
+ Security policy
+ Performance and performance-testing policy
+ Logging policy

Experience:
+ 9-15 years of experience, including managing teams of developers
+ Technical and architectural expertise, having evolved a growing code base, technology stack, and architecture over many years
+ Have delivered distributed cloud applications
+ Understand the value of high-quality code and can effectively manage technical debt
+ Stakeholder management
+ Work experience in consumer-focused early-stage (Series A, B) startups is a big plus

Other innate skills:
+ Great motivator of people, able to lead by example
+ Understand how to get the most out of people
+ Deliver products to tight deadlines while keeping a focus on high-quality code
+ Up-to-date knowledge of technical applications
Do NOT apply if you:
- Want to be a Power BI, Qlik, or Tableau-only developer
- Are a machine learning aspirant
- Are a data scientist
- Want to write Python scripts
- Want to do AI
- Want to do 'BIG' data
- Want to do Hadoop
- Are a fresh graduate

Apply if you:
- Write SQL for complicated analytical queries
- Understand the client's existing business problem and can map their needs to the schema they have
- Can neatly disassemble a problem into components and solve it using SQL
- Have worked on existing BI products
- Have an analytical thought process

You will develop solutions with our exciting new BI product for our clients. You should be very experienced and comfortable writing SQL against very complicated schemas to help answer business questions.
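"SQL for complicated analytical queries" usually means window functions rather than plain aggregates. A small sketch against an invented orders table, run through SQLite for self-containment:

```python
import sqlite3

# Illustrative analytical query: a per-customer running total using a
# window function. The schema and data are made up for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, order_day INT, amount REAL);
INSERT INTO orders VALUES
  ('alice', 1, 10.0), ('alice', 2, 5.0), ('bob', 1, 7.0);
""")

rows = conn.execute("""
    SELECT customer, order_day,
           SUM(amount) OVER (PARTITION BY customer ORDER BY order_day)
             AS running_total
    FROM orders ORDER BY customer, order_day
""").fetchall()
print(rows)
# [('alice', 1, 10.0), ('alice', 2, 15.0), ('bob', 1, 7.0)]
```

The `PARTITION BY ... ORDER BY` clause is the building block for running totals, rankings, and period-over-period comparisons that answer most business questions directly in SQL.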
The candidate will be responsible for all aspects of data acquisition, data transformation, and analytics scheduling and operationalization to drive high-visibility, cross-division outcomes. Expected deliverables include developing Big Data ELT jobs using a mix of technologies, stitching together complex and seemingly unrelated data sets for mass consumption, and automating and scaling analytics into GRAND's Data Lake.

Key Responsibilities:
- Create a GRAND Data Lake and Warehouse which pools the data from GRAND's different regions and stores in GCC
- Ensure source data quality measurement, enrichment, and reporting of data quality
- Manage all ETL and data model update routines
- Integrate new data sources into the DWH
- Manage the DWH cloud (AWS/Azure/Google) and infrastructure

Skills Needed:
- Very strong in SQL. Demonstrated experience with RDBMSs; Unix shell scripting preferred (e.g., SQL, Postgres, MongoDB, etc.)
- Experience with UNIX and comfortable working with the shell (bash or Korn shell preferred)
- Good understanding of data warehousing concepts. Big data systems: Hadoop, NoSQL, HBase, HDFS, MapReduce
- Align with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments
- Work with data delivery teams to set up new Hadoop users. This includes setting up Linux users and setting up and testing HDFS, Hive, Pig, and MapReduce access for the new users
- Cluster maintenance, as well as creation and removal of nodes, using tools like Ganglia, Nagios, Cloudera Manager Enterprise, and others
- Performance tuning of Hadoop clusters and Hadoop MapReduce routines
- Screen Hadoop cluster job performance and capacity planning
- Monitor Hadoop cluster connectivity and security
- File system management and monitoring
- HDFS support and maintenance
- Collaborate with application teams to install operating system and Hadoop updates, patches, and version upgrades when required
- Define, develop, document, and maintain Hive-based ETL mappings and scripts
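The MapReduce model named throughout the responsibilities above can be sketched in-process: a map phase emits key/value pairs, a sort step groups equal keys (the role Hadoop's shuffle plays), and a reduce phase aggregates each group. This word-count example is a pedagogical stand-in, not Hadoop code:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word, as a Hadoop mapper would.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # The sort stands in for the shuffle, which brings equal keys together.
    pairs = sorted(pairs)
    # Reduce: sum the counts within each key's group.
    return {key: sum(v for _, v in group)
            for key, group in groupby(pairs, key=itemgetter(0))}

counts = reduce_phase(map_phase(["big data big", "data lake"]))
print(counts)  # {'big': 2, 'data': 2, 'lake': 1}
```

On a real cluster the mappers and reducers run in parallel over HDFS splits, but the contract between the two phases is exactly this one.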