Responsibilities:
- Build a real-time and batch analytics platform for analytics and machine learning.
- Design, propose and develop solutions keeping the growing scale and business requirements in mind.
- As an integral part of the Data Engineering team, be involved in the entire development lifecycle, from conceptualisation to architecture to coding to unit testing.
- Help us design the data model for our data warehouse and other data engineering solutions.

Requirements:
- Deep understanding of real-time as well as batch-processing big data solutions (Spark, Storm, Kafka, KSQL, Flink, MapReduce, YARN, Hive, HDFS, Pig, etc.).
- Extensive experience developing applications that work with NoSQL stores (e.g., Elasticsearch, HBase, Cassandra, MongoDB).
- Strong understanding of data and fair data modelling experience.
- Proven programming experience in Java or Scala.
- Experience in gathering and processing raw data at scale, including writing scripts, web scraping, calling APIs, writing SQL queries, etc.
- Experience with cloud-based data stores such as Redshift and BigQuery is an advantage.
- Previous experience in a high-growth tech startup would be an advantage.

Responsibilities:
You will interact directly with colleagues across all responsibility areas and with the Director of Engineering. The successful candidate for this position:
- Designs and implements well-architected and scalable solutions
- Collaborates with various teams in releasing high-quality software
- Performs code reviews and contributes to healthy coding conventions
- Assists in integration with customer systems
- Provides timely responses to internal technical questions
- Demonstrates leadership skills in navigating through tense periods and keeping calm

Our Culture:
- Integrity and motivation are more important than skill and experience
- Cross-company team building and collaboration
- A highly talented and passionate group of individuals from diverse backgrounds

Ideal Candidate:
The ideal candidate is a senior engineer with substantial development experience and high standards for code quality and maintainability.

Basic Qualifications:
- 4-year degree in Computer Science or Computer Engineering

Preferred Qualifications:
- 5+ years of development experience
- Experience in Java or Scala
- Experience with all parts of the SDLC, including CI/CD and testing methodologies
- Experience in working with NoSQL technologies and message queue management
- Self-motivated and able to work with minimum guidance
- Experience in a startup or rapid-growth product or project
- Comfortable with modern version control and agile development

Bonus Points:
- Experience in working with micro-services, containers or big data technologies
- Working knowledge of cloud technologies like GCE and AWS
- Writes blog posts and has a strong record on StackOverflow and similar sites

Job Requirements:
- Installation, configuration and administration of Big Data components (including Hadoop/Spark) for batch and real-time analytics and data hubs
- Capable of processing large sets of structured, semi-structured and unstructured data
- Able to assess business rules, collaborate with stakeholders and perform source-to-target data mapping, design and review
- Familiar with data architecture for data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing
- Optional: visual communicator, able to convert and present data in easily comprehensible visualizations using tools like D3.js and Tableau
- Enjoys being challenged and solving complex problems on a daily basis
- Proficient in executing efficient and robust ETL workflows
- Able to work in teams and collaborate with others to clarify requirements
- Able to tune Hadoop solutions to improve performance and end-user experience
- Strong co-ordination and project management skills to handle complex projects
- Engineering background

JOB DESCRIPTION:
We are looking for a Data Engineer with a solid background in scalable systems to work with our engineering team to improve and optimize our platform. You will have significant input into the team’s architectural approach and execution. We are looking for a hands-on programmer who enjoys designing and optimizing data pipelines for large-scale data. This is NOT a "data scientist" role, so please don't apply if you're looking for that.

RESPONSIBILITIES:
1. Build, maintain and test performant, scalable data pipelines
2. Work with data scientists and application developers to implement scalable pipelines for data ingest, processing, machine learning and visualization
3. Build interfaces for ingest across various data stores

MUST-HAVE:
1. A track record of building and deploying data pipelines as a part of work or side projects
2. Ability to work with an RDBMS (MySQL or Postgres)
3. Ability to deploy over cloud infrastructure, at least AWS
4. Demonstrated ability and hunger to learn

GOOD-TO-HAVE:
1. Computer Science degree
2. Expertise in at least one of: Python, Java, Scala
3. Expertise and experience in deploying solutions based on Spark and Kafka
4. Knowledge of container systems like Docker or Kubernetes
5. Experience with NoSQL / graph databases
6. Knowledge of Machine Learning

Kindly apply only if you are skilled in building data pipelines.