JOB DESCRIPTION
We are looking for a Data Engineer with a solid background in scalable systems to work with our engineering team to improve and optimize our platform. You will have significant input into the team’s architectural approach and execution. We are looking for a hands-on programmer who enjoys designing and optimizing data pipelines for large-scale data. This is NOT a "data scientist" role, so please don't apply if you're looking for that.
RESPONSIBILITIES:
1. Build, maintain, and test performant, scalable data pipelines
2. Work with data scientists and application developers to implement scalable pipelines for data ingestion, processing, machine learning, and visualization
3. Build interfaces for ingestion across various data stores
MUST-HAVE:
1. A track record of building and deploying data pipelines as part of work or side projects
2. Ability to work with an RDBMS such as MySQL or Postgres
3. Ability to deploy over cloud infrastructure, at least AWS
4. Demonstrated ability and hunger to learn
GOOD-TO-HAVE:
1. Computer Science degree
2. Expertise in at least one of: Python, Java, Scala
3. Expertise and experience in deploying solutions based on Spark and Kafka
4. Knowledge of container systems like Docker or Kubernetes
5. Experience with NoSQL / graph databases
6. Knowledge of Machine Learning
Kindly apply only if you are skilled in building data pipelines.
Data Architect who will lead a team of 5 members. Required skills: Spark, Scala, Hadoop
About UpGrad: UpGrad is an online education platform building the careers of tomorrow by offering the most industry-relevant programs in an immersive learning experience. Our mission is to create a new digital-first learning experience that delivers tangible career impact to individuals at scale. UpGrad currently offers programs in Data Science, Big Data, Product Management, Digital Marketing, Entrepreneurship and Management. UpGrad was rated one of the top 10 most innovative companies in India for 2017 - https://www.fastcompany.com/most-innovative-companies/2017/sectors/india . UpGrad is co-founded by 3 IIT-Delhi and Parthenon alumni, and the 4th co-founder is serial entrepreneur Ronnie Screwvala. UpGrad has a committed capital of 100Cr and, in its first year of operations, built the largest revenue-generating online program in India (PG Diploma in Data Science) and the online program with the largest enrolment in India (Start-up India learning program).
Position: Senior Data Engineer
Position Type: Full Time
Location: Mumbai
We are looking for an experienced Data Engineer for product and business analytics who will design and build mission-critical data pipelines in a SQL environment.
As a Senior Data Engineer, you will:
- Engineer data pipelines (batch and real-time) that aid in the creation of data-driven products for our platform
- Design, develop and maintain a robust and scalable data warehouse
- Work closely alongside product managers and data scientists to bring the various datasets together and cater to our business intelligence and analytics use cases
- Design and develop solutions using data science techniques ranging from statistics and algorithms to machine learning
- Perform hands-on DevOps work to keep the data platform secure and reliable
Basic Qualifications:
- Bachelor's degree in Computer Science, Information Systems, or a related engineering discipline
- 4+ years' experience with ETL, Data Mining, Data Modeling, and working with large-scale datasets
- 1+ years' experience with an object-oriented programming language such as Python, C++, Java, etc.
- Extremely proficient in writing performant SQL against large data volumes
- Experience with map-reduce concepts
- Experience in building automated analytical systems utilizing large data sets
- Familiarity with AWS technologies preferred
Looking for Database Architects to handle DB modelling, reporting, and ETL activities
The candidate will be responsible for all aspects of data acquisition, data transformation, and analytics scheduling and operationalization to drive high-visibility, cross-division outcomes. Expected deliverables include the development of Big Data ELT jobs using a mix of technologies, stitching together complex and seemingly unrelated data sets for mass consumption, and automating and scaling analytics into GRAND's Data Lake.
Key Responsibilities:
- Create a GRAND Data Lake and Warehouse which pools all the data from different regions and stores of GRAND in GCC
- Ensure source data quality measurement, enrichment, and reporting of data quality
- Manage all ETL and data model update routines
- Integrate new data sources into the DWH
- Manage the DWH cloud (AWS/Azure/Google) and infrastructure
Skills Needed:
- Very strong SQL skills; demonstrated experience with RDBMS and NoSQL databases (e.g., Postgres, MongoDB); Unix shell scripting preferred
- Experience with UNIX and comfortable working with the shell (bash or Korn shell preferred)
- Good understanding of data warehousing concepts and big data systems: Hadoop, NoSQL, HBase, HDFS, MapReduce
- Align with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments
- Work with data delivery teams to set up new Hadoop users; this includes setting up Linux users and setting up and testing HDFS, Hive, Pig, and MapReduce access for the new users
- Cluster maintenance as well as creation and removal of nodes using tools like Ganglia, Nagios, Cloudera Manager Enterprise, and other tools
- Performance tuning of Hadoop clusters and Hadoop MapReduce routines
- Screen Hadoop cluster job performance and handle capacity planning
- Monitor Hadoop cluster connectivity and security
- File system management and monitoring
- HDFS support and maintenance
- Collaborate with application teams to install operating system and Hadoop updates, patches, and version upgrades when required
- Define, develop, document, and maintain Hive-based ETL mappings and scripts