Fix issues with plugins for our Python-based ETL pipelines
Help with the automation of standard workflows
Deliver Python microservices for provisioning and managing cloud infrastructure
Take responsibility for any refactoring of code
Effectively manage the challenges of handling large volumes of data while working to tight deadlines
Manage expectations with internal stakeholders and context-switch in a fast-paced environment
Thrive in an environment that uses AWS and Elasticsearch extensively
Keep abreast of technology and contribute to the engineering strategy
Champion best development practices and provide mentorship to others
First and foremost, you are a Python developer experienced with the Python data stack
You love and care about data
Your code is an artistic manifesto reflecting how elegant you are in what you do
You feel sparks of joy when a new abstraction or pattern arises from your code
You support the DRY (Don’t Repeat Yourself) and KISS (Keep It Short and Simple) principles
You are a continuous learner
You have a natural willingness to automate tasks
You have critical thinking and an eye for detail
Proven ability to work to tight deadlines
Sharp analytical and problem-solving skills
Strong sense of ownership and accountability for your work and delivery
Excellent written and oral communication skills
Mature collaboration and mentoring abilities
We are keen to know your digital footprint (community talks, blog posts, certifications, courses you have taken or are keen to take, your personal projects, and any contributions to open-source communities)
Delivering complex software, ideally in a FinTech setting
Experience with CI/CD tools such as Jenkins, CircleCI
Experience with version control (Git / Mercurial / Subversion)
Our clients can aggregate, search, surveil and report on trade, communications and market data. SteelEye also enables customers to gain powerful insights from their data, helping them to trade with greater efficiency and profitability. The company has a highly experienced management team and a strong board, who have decades of technology and management experience and have worked in senior positions at many leading international financial businesses.
We are a vibrant, fun and exciting group of people who share a passion for technology and data. If you have what it takes to become a part of the SteelEye family, you have come to the right place. This is where you will find information about our people, culture and our current job opportunities.
Principal Accountabilities:
1. Good at communicating and at converting business requirements into functional requirements
2. Develop data-driven insights and machine learning models to identify and extract facts from sales, supply chain and operational data
3. Sound knowledge of and experience with statistical and data mining techniques: Regression, Random Forest, Boosting Trees, Time Series Forecasting, etc.
5. Experience applying state-of-the-art (SOTA) deep learning techniques to NLP problems.
6. End-to-end data collection, model development and testing, and integration into production environments.
7. Build and prototype analysis pipelines iteratively to provide insights at scale.
8. Experience in querying different data sources
9. Partner with developers and business teams on business-oriented decisions
10. Looking for someone who dares to move forward even when the path is not clear and is creative in overcoming challenges in the data.
- Research and develop statistical learning models for data analysis
- Collaborate with product management and engineering departments to understand company needs and devise possible solutions
- Keep up to date with the latest technology trends
- Communicate results and ideas to key decision makers
- Implement new statistical or other mathematical methodologies as needed for specific models or analysis
- Optimize joint development efforts through appropriate database use and project design
- Master's or PhD in Computer Science, Electrical Engineering, Statistics, Applied Math or an equivalent field with a strong mathematical background
- Excellent understanding of machine learning techniques and algorithms, including clustering, anomaly detection, optimization, neural networks, etc.
- 3+ years of experience building data science-driven solutions, including data collection, feature selection, model training and post-deployment validation
- Strong hands-on coding skills (preferably in Python) for processing large-scale data sets and developing machine learning models
- Familiar with one or more machine learning or statistical modeling tools such as NumPy, scikit-learn, MLlib, TensorFlow
- Good team player with excellent written, verbal and presentation skills
- Experience with AWS, S3, Flink, Spark, Kafka, Elasticsearch
- Knowledge and experience with NLP technology
- Previous work in a start-up environment
1) Understand the business objectives, formulate hypotheses and collect the relevant data using SQL/R/Python. Analyse bureau, customer and lending performance data on a periodic basis to generate insights. Present complex information and data in an uncomplicated, easy-to-understand way to drive action.
2) Independently build and refit robust models for achieving game-changing growth while managing risk.
3) Identify and implement new analytical/modelling techniques to improve model performance across the customer lifecycle (acquisitions, management, fraud, collections, etc.).
4) Help define the data infrastructure strategy for the Indian subsidiary.
a. Monitor data quality and quantity.
b. Define a strategy for acquisition, storage, retention, and retrieval of data elements, e.g. identify new data types and collaborate with technology teams to capture them.
c. Build a culture of strong automation and monitoring.
d. Stay connected to analytics industry trends (data, techniques, technology, etc.) and leverage them to continuously evolve data science standards at Credit Saison.
Required Skills & Qualifications:
1) 3+ years working in data science domains with experience in building risk models. Fintech/Financial analysis experience is required.
2) Expert-level proficiency in analytical tools and languages such as SQL, Python, R/SAS, VBA, etc.
3) Experience with building models using common modelling techniques (Logistic and linear regressions, decision trees, etc.)
4) Strong familiarity with Tableau/Power BI/Qlik Sense or other data visualization tools
5) Tier 1 college graduate (IIT/IIM/NIT/BITS preferred).
6) Demonstrated autonomy, thought leadership, and learning agility.
- Create, design and develop data models
- Prepare plans for all ETL (Extract/Transform/Load) procedures and architectures
- Validate results and create business reports
- Monitor and tune data loads and queries
- Develop and prepare a schedule for a new data warehouse
- Analyze large databases and recommend appropriate optimizations
- Administer all requirements and design various functional specifications for data
- Provide support to the software development life cycle
- Prepare various code designs and ensure their efficient implementation
- Evaluate all code and ensure the quality of all project deliverables
- Monitor data warehouse work and provide subject matter expertise
- Hands-on experience with BI practices, data structures, data modeling and SQL
- Minimum 1 year of experience in PySpark
2. Assemble large, complex data sets that meet business requirements
3. Identify, design, and implement internal process improvements
4. Optimize data delivery and re-design infrastructure for greater scalability
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies
6. Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
7. Work with internal and external stakeholders to assist with data-related technical issues and support data infrastructure needs
8. Create data tools for analytics and data scientist team members
1. Working knowledge of ETL on any cloud (Azure / AWS / GCP)
2. Proficient in Python (Programming / Scripting)
3. Good understanding of data warehousing concepts on any of these platforms (Snowflake / AWS Redshift / Azure Synapse Analytics / Google BigQuery / Hive)
4. In-depth understanding of principles of database structure
5. Good understanding of any of the ETL technologies (Informatica PowerCenter / AWS Glue / Data Factory / SSIS / Spark / Matillion / Talend / Azure)
6. Proficient in SQL (query solving)
7. Knowledge of change management / version control (VSS / DevOps / TFS / GitHub / Bitbucket / CI/CD with Jenkins)
We are actively seeking a Senior Data Engineer experienced in building data pipelines and integrations from third-party data sources by writing custom automated ETL jobs using Python. The role will work in partnership with other members of the Business Analytics team to support the development and implementation of new and existing data warehouse solutions for our clients. This includes designing database import/export processes used to generate client data warehouse deliverables.
- 2+ years of experience as an ETL developer with strong data architecture knowledge around data warehousing concepts, SQL development and optimization, and operational support models.
- Experience using Python to automate ETL/data processing jobs.
- Design and develop ETL and data processing solutions using data integration tools, Python scripts, and AWS / Azure / on-premises environments.
- Experience / Willingness to learn AWS Glue / AWS Data Pipeline / Azure Data Factory for Data Integration.
- Develop and create transformation queries, views, and stored procedures for ETL processes, and process automation.
- Document data mappings, data dictionaries, processes, programs, and solutions as per established standards for data governance.
- Work with the data analytics team to assess and troubleshoot potential data quality issues at key intake points, for example by validating control totals at intake and again after transformation, and transparently build lessons learned into future data quality assessments.
- Solid experience with data modeling, business logic, and RESTful APIs.
- Solid experience in the Linux environment.
- Experience with NoSQL / PostgreSQL preferred
- Experience working with databases such as MySQL, NoSQL, and Postgres, and enterprise-level connectivity experience (such as connecting over TLS and through proxies).
- Experience with NGINX and SSL.
- Performance tune data processes and SQL queries, and recommend and implement data process optimization and query tuning techniques.
Indium Software is a niche technology solutions company with deep expertise in Digital, QA and Gaming. Indium helps customers in their Digital Transformation journey through a gamut of solutions that enhance business value.
With over 1,000 associates globally, Indium operates through offices in the US, UK and India.
Visit www.indiumsoftware.com to know more.
Job Title: Analytics Data Engineer
What will you do:
The Data Engineer must be an expert in SQL development and will support the Data and Analytics team in database design, data flow and analysis activities. The Data Engineer also plays a key role in the development and deployment of innovative big data platforms for advanced analytics and data processing. The Data Engineer defines and builds the data pipelines that will enable faster, better, data-informed decision-making within the business.
Extensive experience with SQL and a strong ability to process and analyse complex data
The candidate should also be able to design, build, and maintain the business’s ETL pipeline and data warehouse. The candidate will also demonstrate expertise in data modelling and query performance tuning on SQL Server.
Analytics experience, especially funnel analysis, and hands-on work with analytical tools such as Mixpanel, Amplitude, ThoughtSpot, Google Analytics and similar tools.
Should work on tools and frameworks required for building efficient and scalable data pipelines
Excellent at communicating and articulating ideas, with an ability to influence others and to drive continuously towards a better solution.
Experience working with Python, Hive queries, Spark, PySpark, Spark SQL and Presto
- Relate Metrics to product
- Programmatic Thinking
- Edge cases
- Good Communication
- Product functionality understanding
Perks & Benefits:
A dynamic, creative and intelligent team that will make you love being at work.
An autonomous, hands-on role in which to make an impact; you will be joining at an exciting time of growth!
Flexible work hours and an attractive pay package and perks
An inclusive work environment that lets you work in the way that works best for you!
along with metrics to track their progress
Managing available resources such as hardware, data, and personnel so that deadlines
Analysing the ML algorithms that could be used to solve a given problem and ranking them by their success probability
Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world
Verifying data quality, and/or ensuring it via data cleaning
Supervising the data acquisition process if more data is needed
Defining validation strategies
Defining the pre-processing or feature engineering to be done on a given dataset
Defining data augmentation pipelines
Training models and tuning their hyperparameters
Analysing the errors of the model and designing strategies to overcome them
Deploying models to production