1. ROLE AND RESPONSIBILITIES
1.1. Implement next-generation intelligent data platform solutions that help build high-performance distributed systems.
1.2. Proactively diagnose problems and envisage the long-term life of the product, focusing on reusable, extensible components.
1.3. Ensure agile delivery processes.
1.4. Work collaboratively with stakeholders, including product and engineering teams.
1.5. Establish best practices in the engineering team.
2. PRIMARY SKILL REQUIRED
2.1. 2-6 years of core software product development experience.
2.2. Experience working on data-intensive projects across a variety of technology stacks and programming languages (Java, Python, Scala).
2.3. Experience building the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources, enabling other teams to run pipelines, jobs, reports, etc.
2.4. Experience with open-source stacks.
2.5. Experience working with RDBMS and NoSQL databases.
2.6. Knowledge of enterprise data lakes, data analytics, reporting, in-memory data handling, etc.
2.7. A core computer science academic background.
2.8. Aspiration to continue pursuing a career in the technical stream.
3. OPTIONAL SKILLS
3.1. Understanding of Big Data technologies and machine learning / deep learning.
3.2. Understanding of a diverse set of databases like MongoDB, Cassandra, Redshift, Postgres, etc.
3.3. Understanding of cloud platforms: AWS, Azure, GCP, etc.
3.4. Experience in BFSI domain is a plus.
4. PREFERRED SKILLS
4.1. A startup mentality: comfort with ambiguity, and a willingness to test, learn, and improve rapidly.
Similar jobs
Proficiency in Linux.
Must have SQL knowledge and experience working with relational databases and query authoring, as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena.
Must have experience with Python/Scala.
Must have experience with Big Data technologies like Apache Spark.
Must have experience with Apache Airflow.
Experience with data pipeline and ETL tools like AWS Glue.
Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
About Climate Connect Digital
Our team is inspired to change the world by making energy greener and more affordable. Established in 2011 in London, UK, we are now headquartered in Gurgaon, India. From unassuming beginnings, we have become a leading energy-AI software player at the vanguard of accelerating the global energy transition.
In 2020, we were acquired by ReNew Power, India’s largest renewables developer. Following ReNew’s NASDAQ listing in summer 2021, Climate Connect Digital was newly formed as a fully independent subsidiary, with backing from ReNew for an ambitious and visionary new strategy and rapid growth.
Our mission has technology at its core and involves unlocking value by leveraging information technology and AI/ML across the energy ecosystem. However, the world is still in the early stages of using such technologies in the energy sector to create massive value for all stakeholders.
How do you fit in
As we start into our first strong growth phase, we are looking for a Data Engineer to build the data infrastructure to support business and product growth.
You are someone who can see projects through from beginning to end, coach others, and self-manage. We’re looking for an eager individual who can guide our data stack using AWS services with technical knowledge, communication skills, and real-world experience.
The data flowing through our platform directly contributes to decision-making by algorithms & all levels of leadership alike. If you’re passionate about building tools that enhance productivity, improve green energy, reduce waste, and improve work-life harmony for a large and rapidly growing finance user base, come join us!
Job Responsibilities:
- Iterate, build, and implement our data model, data warehousing, and data integration architecture using AWS & GCP services
- Build solutions that ingest data from source and partner systems into our data infrastructure, where the data is transformed, intelligently curated and made available for consumption by downstream operational and analytical processes
- Integrate data from source systems using common ETL tools or programming languages (e.g. Python, Scala, Node.js etc.)
- Develop tailor-made strategies, concepts and solutions for the efficient handling of our growing amounts of data
- Work iteratively with our data scientist to build up fact tables (e.g. container ship movements), dimension tables (e.g. weather data), ETL processes, and build the data catalog
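A minimal PySpark sketch of the kind of ingestion and fact/dimension-table build described above; the S3 paths, column names, and schemas are illustrative assumptions, not the actual data model:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("fact-dim-build").getOrCreate()

# Ingest raw source/partner data (hypothetical paths and schemas).
movements = spark.read.json("s3://example-bucket/raw/ship_movements/")
weather = spark.read.parquet("s3://example-bucket/raw/weather/")

# Dimension table: one row per (location, date) of weather observations.
dim_weather = (
    weather
    .withColumn("obs_date", F.to_date("observed_at"))
    .groupBy("location_id", "obs_date")
    .agg(
        F.avg("wind_speed").alias("avg_wind_speed"),
        F.avg("temperature").alias("avg_temperature"),
    )
)

# Fact table: ship movements enriched with the weather dimension.
fact_movements = (
    movements
    .withColumn("movement_date", F.to_date("event_time"))
    .join(
        dim_weather,
        (F.col("port_id") == F.col("location_id"))
        & (F.col("movement_date") == F.col("obs_date")),
        "left",
    )
)

# Persist curated tables for downstream analytical consumption.
dim_weather.write.mode("overwrite").parquet("s3://example-bucket/curated/dim_weather/")
fact_movements.write.mode("overwrite").parquet("s3://example-bucket/curated/fact_ship_movements/")
```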
Job Requirements:
- 4+ years of experience designing, building, and maintaining data architecture and warehousing using GCP/AWS services
- Authoritative in ETL optimization; designing, coding, and tuning big data processes using Apache Spark, R, Python, C#, and/or similar technologies
- Experience managing GCP/AWS resources using Terraform
- Experience in data engineering and infrastructure work for analytical and machine learning processes
- Experience with ETL tooling; experience migrating ETL code from one technology to another is a benefit
- Experience with data visualization / dashboarding tools for QA/QC of data processes
- Independent self-starter who thrives in a fast-paced environment
What’s in it for you
We offer competitive salaries based on prevailing market rates. In addition to your introductory package, you can expect to receive the following benefits:
- Remote-first work culture
- Flexible working hours and leave policy
- Learning and development opportunities
- Medical insurance / term insurance and gratuity benefits over and above the salary
At Climate Connect, you get a rare opportunity to join an established company at the early stages of a significant and well-backed global growth push.
We are building a remote-first organisation, and that approach is ingrained in the team ethos. We understand its importance for the success of any next-generation technology company. The team includes passionate and self-driven people with unconventional backgrounds, and we’re seeking a similar spirit with the right potential.
What it’s like to work with us
When you join us, you become part of a strong network and an accomplished legacy from leading technology and business schools worldwide, such as the Indian Institute of Technology, Oxford University, the University of Cambridge, University College London, and many more.
We don’t believe in constrained traditional hierarchies and instead work in flexible teams with the freedom to achieve successful business outcomes. We want more people who can thrive in a fast-paced, collaborative environment. Our comprehensive support system comprises a global network of advisors and experts, providing unparalleled opportunities for learning and growth.
*Apply only if you are serving your notice period.
HIRING: SQL Developers with a maximum notice period of 20 days
Job ID: TNS2023DB01
Who Should apply?
- Only for Serious job seekers who are ready to work in night shift
- Technically Strong Candidates who are willing to take up challenging roles and want to raise their Career graph
- No DBAs & BI Developers, please
Why Think n Solutions Software?
- Exposure to the latest technology
- Opportunity to work on different platforms
- Rapid Career Growth
- Friendly Knowledge-Sharing Environment
Criteria:
- BE/MTech/MCA/MSc
- 2 years of hands-on experience in MS SQL / NoSQL
- Immediate joiners preferred / maximum notice period of 15 to 20 days
- Candidates will be selected based on logical/technical and scenario-based testing
- Work time - 10:00 pm to 6:00 am
Note: Candidates who have attended the interview process with TnS in the last 6 months will not be eligible.
Job Description:
- Technical Skills Desired:
- Experience in MS SQL Server and one of these relational DBs: PostgreSQL / AWS Aurora DB / MySQL, or any NoSQL DB (MongoDB / DynamoDB / DocumentDB), in an application development environment, and eagerness to switch
- Design database tables, views, indexes
- Write functions and procedures for Middle Tier Development Team
- Work with front-end developers to complete the database modules end to end (hands-on experience parsing JSON and XML in stored procedures would be an added advantage)
- Query Optimization for performance improvement
- Design & develop SSIS Packages or any other Transformation tools for ETL
- Functional Skills Desired:
- Experience in the banking / insurance / retail domain would be a plus
- Interaction with clients would be a plus
- Good to Have Skills:
- Knowledge in a Cloud Platform (AWS / Azure)
- Knowledge on version control system (SVN / Git)
- Exposure to Quality and Process Management
- Knowledge in Agile Methodology
- Soft skills: (additional)
- Team building (attitude to train, work along, and mentor juniors)
- Communication skills (all kinds)
- Quality consciousness
- Analytical acumen applied to all business requirements
- Out-of-the-box thinking for business solutions
Experience Range | 2 Years - 10 Years |
Function | Information Technology |
Desired Skills |
Must Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL (see the sketch after this table)
• Good experience in SQL databases; able to write queries of fair complexity
• Excellent experience in Big Data programming for data transformations and aggregations
• Good at ELT architecture: business-rules processing and data extraction from the data lake into data streams for business consumption
• Good customer communication
• Good analytical skills
Education Type | Engineering |
Degree / Diploma | Bachelor of Engineering, Bachelor of Computer Applications, Any Engineering |
Specialization / Subject | Any Specialisation |
Job Type | Full Time |
Job ID | 000018 |
Department | Software Development |
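A small sketch of the PySpark DataFrame functions and Spark SQL usage named in the Must Have Skills above; the orders dataset, path, and column names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spark-sql-demo").getOrCreate()

# Hypothetical orders extract from the data lake.
orders = spark.read.parquet("s3://example-lake/orders/")

# DataFrame core functions: filter, derive a column, aggregate.
daily_revenue = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("order_date")
    .agg(
        F.sum("amount").alias("revenue"),
        F.countDistinct("customer_id").alias("customers"),
    )
)

# The same business rule expressed in Spark SQL via a temp view.
orders.createOrReplaceTempView("orders")
daily_revenue_sql = spark.sql("""
    SELECT to_date(created_at)         AS order_date,
           SUM(amount)                 AS revenue,
           COUNT(DISTINCT customer_id) AS customers
    FROM orders
    WHERE status = 'COMPLETED'
    GROUP BY to_date(created_at)
""")

daily_revenue.show(5)
daily_revenue_sql.show(5)
```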
Job Description:
We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large volumes of data daily, built a data warehouse to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.
Responsibilities:
- Develop, test, and implement data solutions based on functional / non-functional business requirements.
- You would be required to code in Scala and PySpark daily, on cloud as well as on-prem infrastructure
- Build data models to store the data in the most optimized manner
- Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Implement the ETL process and optimal data pipeline architecture
- Monitor performance and advise on any necessary infrastructure changes.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions
- Must be able to write quality code and build secure, highly available systems.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Review peers' code and pipelines before deploying to production, checking for optimization issues and code standards
Skill Sets:
- Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience working with batch-processing / real-time systems using various open-source technologies like NoSQL stores, Spark, Pig, Hive, and Apache Airflow
- Implemented complex projects dealing with considerable data sizes (petabyte scale)
- Optimization techniques (performance, scalability, monitoring, etc.)
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.,
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Creation of DAGs for data engineering (see the Airflow sketch after this list)
- Expert at Python / Scala programming, especially for data engineering / ETL purposes
- Work with the application development team to implement database schema, data strategies, build data flows
- Should have expertise across databases; responsible for gathering and analyzing data requirements and data flows to other systems
- Perform database modeling and SQL scripting
- 8+ years of experience in Database modeling and SQL scripting
- Experience in OLTP and OLAP database modeling
- Experience in Data modeling for Relational (PostgreSQL) and Non-relational Databases
- Strong scripting experience with P-SQL (views, stored procedures), Python, and Spark
- Develop notebooks and jobs using Python
- Big Data stack: Spark, Hadoop, Sqoop, Pig, Hive, HBase, Flume, Kafka, Storm
- Should have a good understanding of Azure Cosmos DB and Azure DW
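A minimal Apache Airflow sketch of the DAG creation mentioned in the list above; the DAG id, task names, callables, and schedule are illustrative assumptions:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull a day's worth of data from a source system.
    print("extracting data for", context["ds"])


def load(**context):
    # Placeholder: write the transformed data to the warehouse.
    print("loading data for", context["ds"])


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Encode the dependency: load only runs after a successful extract.
    extract_task >> load_task
```

The `>>` operator is what expresses the DAG edge, so adding further transform or data-quality tasks is a matter of chaining more operators between extract and load.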
We empower healthcare payers, providers, and members to quickly process medical data to make informed decisions and reduce health care costs. You will be focusing on research, development, strategy, operations, people management, and being a thought leader for team members based out of India. You should have professional healthcare experience using both structured and unstructured data to build applications. These applications include but are not limited to machine learning, artificial intelligence, optical character recognition, natural language processing, and integrating processes into the overall AI pipeline to mine healthcare and medical information with high recall and other relevant metrics. The results will be used dually for real-time operational processes with both automated and human-based decision making, as well as contribute to reducing healthcare administrative costs. We work with all major cloud and big data vendors' offerings, including Azure, AWS, Google, IBM, etc., to achieve our goals in healthcare and support.
The Director, Data Science will have the opportunity to build a team and shape team culture and operating norms as a result of the fast-paced nature of a new, high-growth organization.
• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps and product delivery
• Experience building stakeholder trust and confidence in deployed models, especially via attention to algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering for 24x7x365 product availability and scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• Provide mentoring and career development guidance to data scientists and machine learning engineers
• Meet regularly with project team members on their individual needs related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership
The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience including but not limited to usage of claims data
• Delivered multiple data science and machine learning projects over 8+ years, with values exceeding $10 million, and has worked on platforms covering more than 10 million member lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real time streaming applications, DevOps, and product delivery
• Knows how to solve and launch real artificial intelligence and data science problems and products, along with managing and coordinating business process change and IT / cloud operations, and meeting production-level code standards
• Ownership of key workflows in the data science life cycle, such as data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models, especially via attention to algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering for 24x7x365 product availability and scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• 3+ years of experience directly managing five (5) or more senior-level data scientists and machine learning engineers with advanced degrees, and directly making staffing decisions
• Very strong understanding of mathematical concepts including but not limited to linear algebra, advanced calculus, partial differential equations, and statistics, including Bayesian approaches, at master’s degree level and above
• 6+ years of programming experience in C++, Java, or Scala and in data science programming languages like Python and R, including a strong understanding of concepts like data structures, algorithms, compression techniques, high-performance computing, distributed computing, and various computer architectures
• Very strong understanding of and experience with traditional data science approaches like sampling techniques, feature engineering, classification, regression, SVMs, trees, and model evaluation, with several projects over 3+ years
• Very strong understanding of and experience in natural language processing, reasoning and understanding, information retrieval, text mining, and search, with 3+ years of hands-on experience
• Experience developing and deploying several products in production, with experience in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the following cloud vendors: AWS, Azure, and Google
• Three plus (3+) years of hands-on experience with the MapR / Cloudera / Databricks Big Data platform, with Spark, Hive, Kafka, etc.
• Three plus (3+) years of experience with high-performance computing like Dask, CUDA distributed GPU, TPU, etc.
• Presented at major conferences and/or published materials
High-Level Scope of Work:
- Work with the AI / Analytics team to prioritize identified machine learning use cases for development and rollout
- Meet with stakeholders to understand current retail / marketing requirements and how the AI/ML solution will address and automate the decision process
- Develop AI/ML programs using the Dataiku solution and Python or open-source technologies, with a focus on delivering high-quality, accurate ML prediction models
- Gather additional and external data sources to support the AI/ML models as needed
- Support the ML models and fine-tune them to ensure consistently high accuracy
- Example use cases: customer segmentation, product recommendation, price optimization, retail customer personalization offers, next-best location for business establishment, CCTV computer vision, and NLP and voice recognition solutions
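A small Python sketch of how a customer-segmentation use case like those above might be prototyped outside Dataiku; the input file, feature columns, and cluster count are illustrative assumptions:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer features exported from the retail data warehouse.
customers = pd.read_csv("customer_features.csv")
features = customers[["recency_days", "frequency", "monetary_value"]]

# Scale features so no single dimension dominates the distance metric.
scaled = StandardScaler().fit_transform(features)

# Segment customers into a handful of clusters (k chosen for illustration).
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
customers["segment"] = kmeans.fit_predict(scaled)

# Summarize each segment for business stakeholders.
print(customers.groupby("segment")[["recency_days", "frequency", "monetary_value"]].mean())
```

Scaling before clustering keeps the distance metric from being dominated by the largest-valued feature, which is why the StandardScaler step precedes KMeans.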
Required technology expertise:
- Deep knowledge and understanding of machine learning algorithms (supervised / unsupervised learning, deep learning models)
- Hands-on experience of at least 5+ years with the Python and R statistical programming languages
- Strong database development knowledge using SQL and PL/SQL
- Must have experience using commercial data science solutions, particularly Dataiku (Alteryx, SAS, Azure ML, Google ML, or Oracle ML is a plus)
- Strong hands-on experience with Big Data solution architecture and optimization for AI/ML workloads
- Hands-on experience with data analytics and BI tools, particularly Oracle OBIEE and Power BI
- Have implemented and developed at least 3 successful AI/ML projects with tangible business outcomes in a retail-focused industry
- Have at least 5+ years of experience in the retail industry and customer-focused business
- Ability to communicate with business owners and stakeholders to understand their current issues and provide machine learning solutions accordingly
Qualifications
- Bachelor's or Master's degree in Data Science, Artificial Intelligence, or Computer Science
- Certified as a Data Scientist or Machine Learning expert