Recko Inc. is looking for data engineers to join our kick-ass engineering team. We are looking for smart, dynamic individuals to connect all the pieces of the data ecosystem.
What are we looking for:
3+ years of development experience in at least one of MySQL, Oracle, PostgreSQL or MSSQL and experience in working with Big Data technologies like Big Data frameworks/platforms/data stores like Hadoop, HDFS, Spark, Oozie, Hue, EMR, Scala, Hive, Glue, Kerberos etc.
Strong experience setting up data warehouses, data modeling, data wrangling and dataflow architecture on the cloud
2+ experience with public cloud services such as AWS, Azure, or GCP and languages like Java/ Python etc
2+ years of development experience in Amazon Redshift, Google Bigquery or Azure data warehouse platforms preferred
Knowledge of statistical analysis tools like R, SAS etc
Familiarity with any data visualization software
A growth mindset and passionate about building things from the ground up and most importantly, you should be fun to work with
As a data engineer at Recko, you will:
Create and maintain optimal data pipeline architecture,
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.
Recko was founded in 2017 to organise the world’s transactional information and provide intelligent applications to finance and product teams to make sense of the vast amount of data available. With the proliferation of digital transactions over the past two decades, Enterprises, Banks and Financial institutions are finding it difficult to keep a track on the money flowing across their systems. With the Recko Platform, businesses can build, integrate and adapt innovative and complex financial use cases within the organization and across external payment ecosystems with agility, confidence and at scale. . Today, customer-obsessed brands such as Deliveroo, Meesho, Grofers, Dunzo, Acommerce, etc use Recko so their finance teams can optimize resources with automation and prioritize growth over repetitive and time-consuming tasks around day-to-day operations.
Recko is a Series A funded startup, backed by marquee investors like Vertex Ventures, Prime Venture Partners and Locus Ventures. Traditionally enterprise software is always built around functionality. We believe software is an extension of one’s capability, and it should be delightful and fun to use.
Working at Recko:
We believe that great companies are built by amazing people. At Recko, We are a group of young Engineers, Product Managers, Analysts and Business folks who are on a mission to bring consumer tech DNA to enterprise fintech applications. The current team at Recko is 60+ members strong with stellar experience across fintech, e-commerce, digital domains at companies like Flipkart, PhonePe, Ola Money, Belong, Razorpay, Grofers, Jio, Oracle etc. We are growing aggressively across verticals.
- Experience with Cloud native Data tools/Services such as AWS Athena, AWS Glue, Redshift Spectrum, AWS EMR, AWS Aurora, Big Query, Big Table, S3, etc.
- Strong programming skills in at least one of the following languages: Java, Scala, C++.
- Familiarity with a scripting language like Python as well as Unix/Linux shells.
- Comfortable with multiple AWS components including RDS, AWS Lambda, AWS Glue, AWS Athena, EMR. Equivalent tools in the GCP stack will also suffice.
- Strong analytical skills and advanced SQL knowledge, indexing, query optimization techniques.
- Experience implementing software around data processing, metadata management, and ETL pipeline tools like Airflow.
Experience with the following software/tools is highly desired:
- Apache Spark, Kafka, Hive, etc.
- SQL and NoSQL databases like MySQL, Postgres, DynamoDB.
- Workflow management tools like Airflow.
- AWS cloud services: RDS, AWS Lambda, AWS Glue, AWS Athena, EMR.
- Familiarity with Spark programming paradigms (batch and stream-processing).
- RESTful API services.
Mid / Senior Big Data Engineer
Role: Big Data EngineerNumber of open positions: 5Location: PuneAt Clairvoyant, we're building a thriving big data practice to help enterprises enable and accelerate the adoption of Big data and cloud services. In the big data space, we lead and serve as innovators, troubleshooters, and enablers. Big data practice at Clairvoyant, focuses on solving our customer's business problems by delivering products designed with best in class engineering practices and a commitment to keep the total cost of ownership to a minimum.
- 4-10 years of experience in software development.
- At least 2 years of relevant work experience on large scale Data applications.
- Strong coding experience in Java is mandatory
- Good aptitude, strong problem solving abilities, and analytical skills, ability to take ownership as appropriate
- Should be able to do coding, debugging, performance tuning and deploying the apps to Prod.
- Should have good working experience on
- o Hadoop ecosystem (HDFS, Hive, Yarn, File formats like Avro/Parquet)
- o Kafka
- o J2EE Frameworks (Spring/Hibernate/REST)
- o Spark Streaming or any other streaming technology.
- Strong coding experience in Java is mandatory
- Ability to work on the sprint stories to completion along with Unit test case coverage.
- Experience working in Agile Methodology
- Excellent communication and coordination skills
- Knowledgeable (and preferred hands on) - UNIX environments, different continuous integration tools.
- Must be able to integrate quickly into the team and work independently towards team goals
- Take the complete responsibility of the sprint stories' execution
- Be accountable for the delivery of the tasks in the defined timelines with good quality.
- Follow the processes for project execution and delivery.
- Follow agile methodology
- Work with the team lead closely and contribute to the smooth delivery of the project.
- Understand/define the architecture and discuss the pros-cons of the same with the team
- Involve in the brainstorming sessions and suggest improvements in the architecture/design.
- Work with other team leads to get the architecture/design reviewed.
- Work with the clients and counter-parts (in US) of the project.
- Keep all the stakeholders updated about the project/task status/risks/issues if there are any.
Experience: 4 to 9 years
Keywords: java, scala, spark, software development, hadoop, hive
- Hands-on experience in any Cloud Platform
- Microsoft Azure Experience
- 5+ years of experience building real-time and distributed system architecture, from whiteboard to production
- Strong programming skills in Python, Scala and SQL.
- Versatility. Experience across the entire spectrum of data engineering, including:
- Data stores (e.g., AWS RDS, AWS Athena, AWS Aurora, AWS Redshift)
- Data pipeline and workflow orchestration tools (e.g., Azkaban, Airflow)
- Data processing technologies (e.g., Spark, Pentaho)
- Deployment and monitoring large database clusters in public cloud platforms (e.g., Docker, Terraform, Datadog)
- Creating ETL or ELT pipelines that transform and process petabytes of structured and unstructured data in real-time
- Industry experience building and productionizing innovative end-to-end Machine Learning systems is a plus.
We are looking for talented and driven Data Engineers at various levels to work with customers and data scientists to build the data warehouse, analytical dashboards and ML capabilities as per customer needs.
Required Qualifications :
- 3-5 years of experience of developing and managing streaming and batch data pipelines
- Experience in Big Data, data architecture, data modeling, data warehousing, data wrangling, data integration, data testing and application performance tuning
- Experience with data engineering tools and platforms such as Kafka, Spark, Databricks, Flink, Storm, Druid and Hadoop
- Strong with hands-on programming and scripting for Big Data ecosystem (Python, Scala, Spark, etc)
- Experience building batch and streaming ETL data pipelines using workflow management tools like Airflow, Luigi, NiFi, Talend, etc
- Familiarity with cloud-based platforms like AWS, Azure or GCP
- Experience with cloud data warehouses like Redshift and Snowflake
- Proficient in writing complex SQL queries.
- Experience working with structured and semi-structured data formats like CSV, JSON and XML
- Desire to learn about, explore and invent new tools for solving real-world problems using data
Desired Qualifications :
- Cloud computing experience, Amazon Web Services (AWS)
- Prior experience in Data Warehousing concepts, multi-dimensional data models
- Full command of Analytics concepts including Dimension, KPI, Reports & Dashboards
- Research and develop statistical learning models for data analysis
- Collaborate with product management and engineering departments to understand company needs and devise possible solutions
- Keep up-to-date with latest technology trends
- Communicate results and ideas to key decision makers
- Implement new statistical or other mathematical methodologies as needed for specific models or analysis
- Optimize joint development efforts through appropriate database use and project design
- Masters or PhD in Computer Science, Electrical Engineering, Statistics, Applied Math or equivalent fields with strong mathematical background
- Excellent understanding of machine learning techniques and algorithms, including clustering, anomaly detection, optimization, neural network etc
- 3+ years experiences building data science-driven solutions including data collection, feature selection, model training, post-deployment validation
- Strong hands-on coding skills (preferably in Python) processing large-scale data set and developing machine learning models
- Familiar with one or more machine learning or statistical modeling tools such as Numpy, ScikitLearn, MLlib, Tensorflow
- Good team worker with excellent communication skills written, verbal and presentation
- Experience with AWS, S3, Flink, Spark, Kafka, Elastic Search
- Knowledge and experience with NLP technology
- Previous work in a start-up environment
Good Python developers / Data Engineers / Devops engineers
Work loc: Chennai. / Remote support
ears of Exp: 3-6+ Years
Skills: Scala, Python, Hive, Airflow, Spark
Languages: Java, Python, Shell Scripting
GCP: BigTable, DataProc, BigQuery, GCS, Pubsub
AWS: Athena, Glue, EMR, S3, Redshift
MongoDB, MySQL, Kafka
Platforms: Cloudera / Hortonworks
AdTech domain experience is a plus.
Job Type - Full Time
• 5+ years’ experience developing and maintaining modern ingestion pipeline using
technologies like Spark, Apache Nifi etc).
• 2+ years’ experience with Healthcare Payors (focusing on Membership, Enrollment, Eligibility,
• Claims, Clinical)
• Hands on experience on AWS Cloud and its Native components like S3, Athena, Redshift &
• Jupyter Notebooks
• Strong in Spark Scala & Python pipelines (ETL & Streaming)
• Strong experience in metadata management tools like AWS Glue
• String experience in coding with languages like Java, Python
• Worked on designing ETL & streaming pipelines in Spark Scala / Python
• Good experience in Requirements gathering, Design & Development
• Working with cross-functional teams to meet strategic goals.
• Experience in high volume data environments
• Critical thinking and excellent verbal and written communication skills
• Strong problem-solving and analytical abilities, should be able to work and delivery
• Good-to-have AWS Developer certified, Scala coding experience, Postman-API and Apache
Airflow or similar schedulers experience
• Nice-to-have experience in healthcare messaging standards like HL7, CCDA, EDI, 834, 835, 837
• Good communication skills
As a Data Warehouse Engineer in our team, you should have a proven ability to deliver high-quality work on time and with minimal supervision.
Develops or modifies procedures to solve complex database design problems, including performance, scalability, security and integration issues for various clients (on-site and off-site).
Design, develop, test, and support the data warehouse solution.
Adapt best practices and industry standards, ensuring top quality deliverable''s and playing an integral role in cross-functional system integration.
Design and implement formal data warehouse testing strategies and plans including unit testing, functional testing, integration testing, performance testing, and validation testing.
Evaluate all existing hardware's and software's according to required standards and ability to configure the hardware clusters as per the scale of data.
Data integration using enterprise development tool-sets (e.g. ETL, MDM, Quality, CDC, Data Masking, Quality).
Maintain and develop all logical and physical data models for enterprise data warehouse (EDW).
Contributes to the long-term vision of the enterprise data warehouse (EDW) by delivering Agile solutions.
Interact with end users/clients and translate business language into technical requirements.
Acts independently to expose and resolve problems.
Participate in data warehouse health monitoring and performance optimizations as well as quality documentation.
Job Requirements :
2+ years experience working in software development & data warehouse development for enterprise analytics.
2+ years of working with Python with major experience in Red-shift as a must and exposure to other warehousing tools.
Deep expertise in data warehousing, dimensional modeling and the ability to bring best practices with regard to data management, ETL, API integrations, and data governance.
Experience working with data retrieval and manipulation tools for various data sources like Relational (MySQL, PostgreSQL, Oracle), Cloud-based storage.
Experience with analytic and reporting tools (Tableau, Power BI, SSRS, SSAS). Experience in AWS cloud stack (S3, Glue, Red-shift, Lake Formation).
Experience in various DevOps practices helping the client to deploy and scale the systems as per requirement.
Strong verbal and written communication skills with other developers and business clients.
Knowledge of Logistics and/or Transportation Domain is a plus.
Ability to handle/ingest very huge data sets (both real-time data and batched data) in an efficient manner.