About A Fintech startup in Dubai
Similar jobs
Looking for freelance?
We are seeking a freelance Data Engineer with 7+ years of experience
Skills Required: Deep knowledge in any cloud (AWS, Azure , Google cloud), Data bricks, Data lakes, Data Ware housing Python/Scala , SQL, BI, and other analytics systems
What we are looking for
We are seeking an experienced Senior Data Engineer with experience in architecture, design, and development of highly scalable data integration and data engineering processes
- The Senior Consultant must have a strong understanding and experience with data & analytics solution architecture, including data warehousing, data lakes, ETL/ELT workload patterns, and related BI & analytics systems
- Strong in scripting languages like Python, Scala
- 5+ years of hands-on experience with one or more of these data integration/ETL tools.
- Experience building on-prem data warehousing solutions.
- Experience with designing and developing ETLs, Data Marts, Star Schema
- Designing a data warehouse solution using Synapse or Azure SQL DB
- Experience building pipelines using Synapse or Azure Data Factory to ingest data from various sources
- Understanding of integration run times available in Azure.
- Advanced working SQL knowledge and experience working with relational databases, and queries. authoring (SQL) as well as working familiarity with a variety of database
- A Data and MLOps Engineering lead that has a good understanding of modern Data engineering frameworks with a focus on Microsoft Azure and Azure Machine Learning and its development lifecycle and DevOps.
- Aims to solve the problems encountered when turning Data into meaningful solutions using transformations and data science code into production Machine Learning systems. Some of these challenges include:
- ML orchestration - how can I automate my ML workflows across multiple environments
- Scalability - how can I take advantage of the huge computational power available in the cloud?
- Serving - how can I make my ML models available to make predictions reliably when needed?
- Monitoring - how can I effectively monitor my ML system in production to ensure reliability? Not just system metrics, but also get insight into how my models are performing over time
- Reuse – how can I profess reuse of artefacts built and establish templates and patterns?
The MLOps team works closely with ML Engineering and DevOps teams. Rather than focus just on individual use cases, the focus would be to specialise in building the platforms and tools that can help adoption of MLOps across the organisation and develop best practices and ways of working to develop a state of the art MLOps capability.
A good understanding of AI/Machine Learning and software engineering best practices such as Cloud Engineering, Infrastructure-as-Code, and CI/CD.
Have excellent communication and consulting skills, while delivering innovative AI solutions on Azure.
Responsibilities will include:
- Building state-of-the-art MLOps platforms and tooling to help adoption of MLOps across organization
- Designing cloud ML architectures and provide a roadmap for flexible patterns
- Optimizing solutions for performance and scalability
- Leading and driving the evolving best practices for MLOps
- Helping to showcase expertise and leadership in this field
Tech stack
These are some of the tools and technologies that we use day to day. Key to success will be attitude and aptitude with a vision to build the next big thing in AI/ML field.
- Python - including poetry for dependency management, pytest for automated testing and fastapi for building APIs
- Microsoft Azure Platform - primarily focused on Databricks, Azure ML
- Containers
- CI/CD – Azure DevOps
- Strong programming skills in Python
- Solid understanding of cloud concepts
- Demonstrable interest in Machine Learning
- Understanding of IaC and CI/CD concepts
- Strong communication and presentation skills.
Remuneration: Best in the industry
Connect: https://www.linkedin.com/in/shweta-gupta-a361511
Job Responsibilities:
- Identify valuable data sources and automate collection processes
- Undertake preprocessing of structured and unstructured data.
- Analyze large amounts of information to discover trends and patterns
- Helping develop reports and analysis.
- Present information using data visualization techniques.
- Assessing tests and implementing new or upgraded software and assisting with strategic decisions on new systems.
- Evaluating changes and updates to source production systems.
- Develop, implement, and maintain leading-edge analytic systems, taking complicated problems and building simple frameworks
- Providing technical expertise in data storage structures, data mining, and data cleansing.
- Propose solutions and strategies to business challenges
Desired Skills and Experience:
- At least 1 year of experience in Data Analysis
- Complete understanding of Operations Research, Data Modelling, ML, and AI concepts.
- Knowledge of Python is mandatory, familiarity with MySQL, SQL, Scala, Java or C++ is an asset
- Experience using visualization tools (e.g. Jupyter Notebook) and data frameworks (e.g. Hadoop)
- Analytical mind and business acumen
- Strong math skills (e.g. statistics, algebra)
- Problem-solving aptitude
- Excellent communication and presentation skills.
- Bachelor’s / Master's Degree in Computer Science, Engineering, Data Science or other quantitative or relevant field is preferred
Work closely with Product Managers to drive product improvements through data driven decisions.
Conduct analysis to determine new project pilot settings, new features, user behaviour, and in-app behaviour.
Present insights and recommendations to leadership using high quality visualizations and concise messaging.
Own the implementation of data collection and tracking, and co-ordinate with engineering and product team.
Create and maintain dashboards for product and business teams.
Requirements
1+ years’ experience in analytics. Experience as Product analyst will be added advantage.
Technical skills: SQL, Advanced Excel
Good to have: R/Python, Dashboarding experience
Ability to translate structured and unstructured problems into analytical framework
Excellent analytical skills
Good communication & interpersonal skills
Ability to work in a fast-paced start-up environment, learn on the job and get things done.
BRIEF DESCRIPTION:
At-least 1 year of Python, Spark, SQL, data engineering experience
Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, RedShift/Snowflake
Relevant Experience: Legacy ETL job Migration to AWS Glue / Python & Spark combination
ROLE SCOPE:
Reverse engineer the existing/legacy ETL jobs
Create the workflow diagrams and review the logic diagrams with Tech Leads
Write equivalent logic in Python & Spark
Unit test the Glue jobs and certify the data loads before passing to system testing
Follow the best practices, enable appropriate audit & control mechanism
Analytically skillful, identify the root causes quickly and efficiently debug issues
Take ownership of the deliverables and support the deployments
REQUIREMENTS:
Create data pipelines for data integration into Cloud stacks eg. Azure Synapse
Code data processing jobs in Azure Synapse Analytics, Python, and Spark
Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.
Should be able to process .json, .parquet and .avro files
PREFERRED BACKGROUND:
Tier1/2 candidates from IIT/NIT/IIITs
However, relevant experience, learning attitude takes precedence
- Measure the sales effectiveness efforts using data science/app/digital nudges.
- Should be able to work on the clickstream data
- Should be well versed and willing to work hands-on various Machine Learning techniques
Skills
- Ability to lead a team of 5-6 members.
- Ability to work with large data sets and present conclusions to key stakeholders.
- Develop a clear understanding of the client’s business issue to inform the best approach to the problem.
- Root-cause analysis
- Define data requirements for creating a model and understand the business problem
- Clean, aggregate, analyze, interpret data and carry out quality analysis of it
- Set up data for predictive/prescriptive analysis
- Development of AI/ML models or statistical/econometric models.
- Working along with the team members
- Looking for insight and creating a presentation to demonstrate these insights
- Supporting development and maintenance of proprietary marketing techniques and other knowledge development projects.
• Drive the data engineering implementation
• Strong experience in building data pipelines
• AWS stack experience is must
• Deliver Conceptual, Logical and Physical data models for the implementation
teams.
• SQL stronghold is must. Advanced SQL working knowledge and experience
working with a variety of relational databases, SQL query authoring
• AWS Cloud data pipeline experience is must. Data pipelines and data centric
applications using distributed storage platforms like S3 and distributed processing
platforms like Spark, Airflow, Kafka
• Working knowledge of AWS technologies such as S3, EC2, EMR, RDS, Lambda,
Elasticsearch
• Ability to use a major programming (e.g. Python /Java) to process data for
modelling.
Required Experience: 5 - 7 Years
Skills : ADF, Azure, SSIS, python
Job Description
Azure Data Engineer with hands on SSIS migrations and ADF expertise.
Roles & Responsibilities
•Overall, 6+ years’ experience in Cloud Data Engineering, with hands on experience in ADF (Azure Data Factory) is required.
Hands on experience with SSIS to ADF migration is preferred.
SQL Server Integration Services (SSIS) workloads to SSIS in ADF. ( Must have done at least one migration)
Hands on experience implementing Azure Data Factory frameworks, scheduling, and performance tuning.
Hands on experience in migrating SSIS solutions to ADF
Hands on experience in ADF coding side.
Hands on experience with MPP Database architecture
Hands on experience in python
Role and Responsibilities
- Execute data mining projects, training and deploying models over a typical duration of 2 -12 months.
- The ideal candidate should be able to innovate, analyze the customer requirement, develop a solution in the time box of the project plan, execute and deploy the solution.
- Integrate the data mining projects embedded data mining applications in the FogHorn platform (on Docker or Android).
Core Qualifications
Candidates must meet ALL of the following qualifications:
- Have analyzed, trained and deployed at least three data mining models in the past. If the candidate did not directly deploy their own models, they will have worked with others who have put their models into production. The models should have been validated as robust over at least an initial time period.
- Three years of industry work experience, developing data mining models which were deployed and used.
- Programming experience in Python is core using data mining related libraries like Scikit-Learn. Other relevant Python mining libraries include NumPy, SciPy and Pandas.
- Data mining algorithm experience in at least 3 algorithms across: prediction (statistical regression, neural nets, deep learning, decision trees, SVM, ensembles), clustering (k-means, DBSCAN or other) or Bayesian networks
Bonus Qualifications
Any of the following extra qualifications will make a candidate more competitive:
- Soft Skills
- Sets expectations, develops project plans and meets expectations.
- Experience adapting technical dialogue to the right level for the audience (i.e. executives) or specific jargon for a given vertical market and job function.
- Technical skills
- Commonly, candidates have a MS or Ph.D. in Computer Science, Math, Statistics or an engineering technical discipline. BS candidates with experience are considered.
- Have managed past models in production over their full life cycle until model replacement is needed. Have developed automated model refreshing on newer data. Have developed frameworks for model automation as a prototype for product.
- Training or experience in Deep Learning, such as TensorFlow, Keras, convolutional neural networks (CNN) or Long Short Term Memory (LSTM) neural network architectures. If you don’t have deep learning experience, we will train you on the job.
- Shrinking deep learning models, optimizing to speed up execution time of scoring or inference.
- OpenCV or other image processing tools or libraries
- Cloud computing: Google Cloud, Amazon AWS or Microsoft Azure. We have integration with Google Cloud and are working on other integrations.
- Decision trees like XGBoost or Random Forests is helpful.
- Complex Event Processing (CEP) or other streaming data as a data source for data mining analysis
- Time series algorithms from ARIMA to LSTM to Digital Signal Processing (DSP).
- Bayesian Networks (BN), a.k.a. Bayesian Belief Networks (BBN) or Graphical Belief Networks (GBN)
- Experience with PMML is of interest (see www.DMG.org).
- Vertical experience in Industrial Internet of Things (IoT) applications:
- Energy: Oil and Gas, Wind Turbines
- Manufacturing: Motors, chemical processes, tools, automotive
- Smart Cities: Elevators, cameras on population or cars, power grid
- Transportation: Cars, truck fleets, trains
About FogHorn Systems
FogHorn is a leading developer of “edge intelligence” software for industrial and commercial IoT application solutions. FogHorn’s Lightning software platform brings the power of advanced analytics and machine learning to the on-premise edge environment enabling a new class of applications for advanced monitoring and diagnostics, machine performance optimization, proactive maintenance and operational intelligence use cases. FogHorn’s technology is ideally suited for OEMs, systems integrators and end customers in manufacturing, power and water, oil and gas, renewable energy, mining, transportation, healthcare, retail, as well as Smart Grid, Smart City, Smart Building and connected vehicle applications.
Press: https://www.foghorn.io/press-room/">https://www.foghorn.io/press-room/
Awards: https://www.foghorn.io/awards-and-recognition/">https://www.foghorn.io/awards-and-recognition/
- 2019 Edge Computing Company of the Year – Compass Intelligence
- 2019 Internet of Things 50: 10 Coolest Industrial IoT Companies – CRN
- 2018 IoT Planforms Leadership Award & Edge Computing Excellence – IoT Evolution World Magazine
- 2018 10 Hot IoT Startups to Watch – Network World. (Gartner estimated 20 billion connected things in use worldwide by 2020)
- 2018 Winner in Artificial Intelligence and Machine Learning – Globe Awards
- 2018 Ten Edge Computing Vendors to Watch – ZDNet & 451 Research
- 2018 The 10 Most Innovative AI Solution Providers – Insights Success
- 2018 The AI 100 – CB Insights
- 2017 Cool Vendor in IoT Edge Computing – Gartner
- 2017 20 Most Promising AI Service Providers – CIO Review
Our Series A round was for $15 million. Our Series B round was for $30 million October 2017. Investors include: Saudi Aramco Energy Ventures, Intel Capital, GE, Dell, Bosch, Honeywell and The Hive.
About the Data Science Solutions team
In 2018, our Data Science Solutions team grew from 4 to 9. We are growing again from 11. We work on revenue generating projects for clients, such as predictive maintenance, time to failure, manufacturing defects. About half of our projects have been related to vision recognition or deep learning. We are not only working on consulting projects but developing vertical solution applications that run on our Lightning platform, with embedded data mining.
Our data scientists like our team because:
- We care about “best practices”
- Have a direct impact on the company’s revenue
- Give or receive mentoring as part of the collaborative process
- Questions and challenging the status quo with data is safe
- Intellectual curiosity balanced with humility
- Present papers or projects in our “Thought Leadership” meeting series, to support continuous learning