Looking for freelance work?
We are seeking a freelance Data Engineer with 7+ years of experience.
Skills Required: Deep knowledge of at least one cloud platform (AWS, Azure, Google Cloud), Databricks, data lakes, data warehousing, Python/Scala, SQL, BI, and other analytics systems
What we are looking for
We are seeking an experienced Senior Data Engineer to architect, design, and develop highly scalable data integration and data engineering processes
- The Senior Consultant must have a strong understanding and experience with data & analytics solution architecture, including data warehousing, data lakes, ETL/ELT workload patterns, and related BI & analytics systems
- Strong in scripting languages like Python and Scala
- 5+ years of hands-on experience with one or more data integration/ETL tools.
- Experience building on-prem data warehousing solutions.
- Experience designing and developing ETL processes, data marts, and star schemas (a minimal sketch follows this list)
- Experience designing a data warehouse solution using Synapse or Azure SQL DB
- Experience building pipelines using Synapse or Azure Data Factory to ingest data from various sources
- Understanding of the integration runtimes available in Azure.
- Advanced working knowledge of SQL, including query authoring, with experience working with relational databases and working familiarity with a variety of database technologies
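As a flavor of the ETL and star-schema work described above, here is a minimal sketch using Python's built-in sqlite3; the tables, columns, and sample rows are illustrative assumptions, not part of the posting.

```python
# Minimal ETL sketch: load a dimension and a fact table of a star schema.
import sqlite3

source_rows = [
    # (order_id, customer_name, amount) -- made-up source data
    (1, "Acme Corp", 120.50),
    (2, "Globex", 75.00),
    (3, "Acme Corp", 310.25),
]

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE)")
cur.execute("CREATE TABLE fact_orders (order_id INTEGER, customer_key INTEGER, amount REAL)")

for order_id, name, amount in source_rows:
    # Upsert the dimension row, then resolve its surrogate key for the fact row.
    cur.execute("INSERT OR IGNORE INTO dim_customer (name) VALUES (?)", (name,))
    customer_key = cur.execute(
        "SELECT customer_key FROM dim_customer WHERE name = ?", (name,)
    ).fetchone()[0]
    cur.execute("INSERT INTO fact_orders VALUES (?, ?, ?)", (order_id, customer_key, amount))

conn.commit()
print(cur.execute(
    "SELECT c.name, SUM(f.amount) FROM fact_orders f "
    "JOIN dim_customer c USING (customer_key) GROUP BY c.name"
).fetchall())
```

The same separation of dimension upsert and fact load carries over to Synapse or Azure SQL DB; only the connection and SQL dialect change.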
Redica Systems is a data analytics platform built to help life sciences companies improve their quality and stay on top of evolving regulations. Our proprietary processes transform one of the industry’s most complete data sets, aggregated from hundreds of health agencies and unique Freedom of Information Act (FOIA) sourcing, into meaningful answers and insights that reduce regulatory and compliance risk. Founded in 2010, Redica Systems serves over 200 customers in the Pharma, BioPharma, MedTech, and Food and Cosmetics industries, including 19 of the top 20 Pharma companies and 9 of the 10 top MedTech companies. Redica Systems’ headquarters are in Pleasanton, CA, but we are a geographically distributed company. More information is available at redica.com.
We’re looking for an experienced Senior Data Engineer II to join our team as we continue to develop the first-of-its-kind quality and regulatory intelligence (QRI) platform for the life science industry. The ideal candidate will come with experience leading/mentoring a team of developers and maintaining a high bar of quality while remaining hands-on in the code.
- Full understanding of the technical architecture and the different sub-systems
- Able to work as a lead in an Agile Scrum environment, with a keen focus on delivering sustainable, high-performance, scalable, and easily maintainable enterprise solutions
- Helps prioritize technical issues with engineering managers
- Proactively guides technical decisions in a domain of expertise
- Recommend and validate different ways to improve data reliability, efficiency, and quality
- Identify optimal approaches for resolving data quality or consistency issues
- Ensure successful system delivery to the production environment and assist the operations and support team in resolving production issues, as necessary
- Lead the acquisition of data from a variety of sources, intelligent change monitoring, data mapping, transformations, and analysis
- Develop, test, and maintain architectures for data stores, databases, processing systems, and microservices
- Integrate various sub-systems or components to deliver end-to-end solutions
- Integrate the data pipeline with NLP/ML services (a minimal sketch follows)
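One way a pipeline step might call out to an NLP service, sketched with the requests library; the endpoint URL and response shape are hypothetical, not Redica's actual API.

```python
# Sketch of enriching pipeline records via an NLP microservice.
import requests

NLP_ENDPOINT = "https://nlp.internal.example.com/v1/entities"  # hypothetical

def enrich_with_entities(records):
    """Attach named entities to each record by calling the NLP service."""
    enriched = []
    for record in records:
        resp = requests.post(NLP_ENDPOINT, json={"text": record["text"]}, timeout=10)
        resp.raise_for_status()
        record["entities"] = resp.json().get("entities", [])
        enriched.append(record)
    return enriched
```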
- 5+ years of senior or lead developer experience with an emphasis on technical mentorship, code/system architecture, and quality output
- Extensive experience designing and building data pipelines, data APIs, and ETL/ELT processes
- Extensive experience with data modelling and data warehouse concepts
- Deep, hands-on experience in Python
- Hands-on experience setting up, configuring, and maintaining SQL and NoSQL databases (MySQL/MariaDB, PostgreSQL, MongoDB, Snowflake)
- Computer Science, Computer Engineering, or similar technical degree
- Experience with the data engineering stack within AWS is a major plus (S3, Lake Formation, Lambda, Fargate, Kinesis Data Streams/Data Firehose, DynamoDB, Neptune DB)
- Experience with event-driven data architectures (see the sketch after this list)
- Experience with the ELK stack is a major plus (Elasticsearch, Logstash, Kibana)
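For the event-driven architectures mentioned above, a small illustrative sketch using boto3 and Kinesis (one of the AWS services this posting lists); the stream name and event shape are assumptions for the example.

```python
# Sketch of publishing pipeline events to a Kinesis data stream.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

def publish_event(event: dict, stream_name: str = "document-updates") -> None:
    """Emit one event; downstream consumers (e.g. a Lambda) react to it."""
    kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event["source_id"]),  # groups related events on one shard
    )
```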
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional and non-functional requirements.
● Build and optimize ‘big data’ pipelines, architectures, and data sets.
● Maintain, organize & automate data processes for various use cases.
● Identify trends, do follow-up analysis, and prepare visualizations.
● Create daily, weekly, and monthly reports of product KPIs.
● Create informative, actionable, and repeatable reporting that highlights relevant business trends and opportunities for improvement (a small rollup sketch follows).
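As an illustration of the KPI reporting bullets above, a small pandas rollup; the column names and sample data are made up for the example.

```python
# Weekly KPI rollup sketch: distinct active users and total revenue per week.
import pandas as pd

events = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-09"]),
    "user_id": [1, 2, 1],
    "revenue": [10.0, 25.0, 5.0],
})

weekly = events.resample("W", on="date").agg({"user_id": "nunique", "revenue": "sum"})
weekly.columns = ["active_users", "revenue"]
print(weekly)
```

The same frame can feed a daily or monthly report by swapping the resample rule ("D", "MS").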
Required Skills And Experience:
● 2-5 years of work experience in data analytics, including analyzing large data sets.
● BTech in Mathematics/Computer Science
● Strong analytical, quantitative and data interpretation skills.
● Hands-on experience with Python, Apache Spark, Hadoop, NoSQL databases (MongoDB preferred), and Linux is a must (a Spark sketch follows this list).
● Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Experience with Google Cloud Data Analytics Products such as BigQuery, Dataflow, Dataproc etc. (or similar cloud-based platforms).
● Experience working within a Linux computing environment and using command-line tools, including knowledge of shell/Python scripting for automating common tasks.
● Previous experience working at startups and/or in fast-paced environments.
● Previous experience as a data engineer or in a similar role.
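To give a flavor of the Spark experience asked for above, a minimal PySpark batch sketch; the input path, column names, and output location are illustrative assumptions.

```python
# Daily unique-user rollup over raw JSON events with PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-rollup").getOrCreate()

events = spark.read.json("s3://bucket/events/")  # hypothetical source path
daily = (
    events.withColumn("day", F.to_date("timestamp"))  # assumes a 'timestamp' column
    .groupBy("day", "event_type")
    .agg(F.countDistinct("user_id").alias("unique_users"))
)
daily.write.mode("overwrite").parquet("s3://bucket/kpis/daily/")  # hypothetical sink
```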
Job Title – Data Scientist (Forecasting)
Anicca Data is seeking a Data Scientist (Forecasting) who is motivated to apply their skill set to solve complex and challenging problems. The role will center on applying deep learning models to real-world applications, and the candidate should have experience in training and testing deep learning architectures. This candidate is expected to work on existing codebases or write optimized new code at Anicca Data. The ideal addition to our team is self-motivated, highly organized, and a team player who thrives in a fast-paced environment, with the ability to learn quickly and work independently.
Job Location: Remote (for time being) and Bangalore, India (post-COVID crisis)
- 3+ years of experience in a Data Scientist role
- Bachelor's/Master's degree in Computer Science, Engineering, Statistics, Mathematics, or a similar quantitative discipline; a Ph.D. will add merit to the application process
- Experience with large data sets, big data, and analytics
- Exposure to statistical modeling, forecasting, and machine learning, with deep theoretical and practical knowledge of deep learning, machine learning, statistics, probability, and time series forecasting
- Training Machine Learning (ML) algorithms in areas of forecasting and prediction
- Experience in developing and deploying machine learning solutions in a cloud environment (AWS, Azure, Google Cloud) for production systems
- Research and enhance existing in-house, open-source models, integrate innovative techniques, or create new algorithms to solve complex business problems
- Experience in translating business needs into problem statements, prototypes, and minimum viable products
- Experience managing complex projects including scoping, requirements gathering, resource estimations, sprint planning, and management of internal and external communication and resources
- Write C++ and Python code, using TensorFlow and PyTorch, to build and enhance the platform used for training ML models
- Worked on forecasting projects – both classical and ML models
- Experience training time series forecasting methods such as Moving Average (MA) and Autoregressive Integrated Moving Average (ARIMA), as well as Neural Network (NN) models such as feed-forward and nonlinear autoregressive networks (see the sketch after this list)
- Strong background in forecasting accuracy drivers
- Experience in Advanced Analytics techniques such as regression, classification, and clustering
- Ability to explain complex topics in simple terms, ability to explain use cases and tell stories
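As a concrete taste of the classical forecasting methods listed above, a minimal statsmodels ARIMA sketch on a synthetic series; the (1, 1, 1) order is an arbitrary starting point for the example, not a recommendation.

```python
# Classical baseline forecast with ARIMA on a synthetic random-walk series.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = pd.date_range("2023-01-01", periods=100, freq="D")
series = pd.Series(np.cumsum(np.random.default_rng(0).normal(size=100)), index=rng)

model = ARIMA(series, order=(1, 1, 1))  # (p, d, q): AR lags, differencing, MA lags
fitted = model.fit()
print(fitted.forecast(steps=14))  # 14-day-ahead point forecast
```

In practice the order would be chosen from ACF/PACF diagnostics or information criteria, and the baseline compared against the NN models the posting names.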
Experience: minimum 10 years
Power BI, Tableau, QlikView
Solution Architect/Technology Lead – Data Analytics
Looking for a Business Intelligence Lead (BI Lead) with hands-on experience in BI tools (Tableau, SAP Business Objects, financial and accounting modules, Power BI), SAP integration, and database knowledge including one or more of Azure Synapse/Data Factory, SQL Server, Oracle, and cloud-based databases such as Snowflake. Good knowledge of AI/ML and Python is also expected.
- You will be expected to work closely with our business users. The development will be performed using an Agile methodology based on Scrum (time boxing, daily scrum meetings, retrospectives, etc.) and XP (continuous integration, refactoring, unit testing, etc.) best practices. Candidates must therefore be able to work collaboratively, demonstrate good ownership and leadership, and work well in teams.
Responsibilities:
- Design, develop, and support multiple/hybrid data sources and data visualization frameworks using Power BI, Tableau, SAP Business Objects, etc., together with ETL tools and Python scripting
- Implement DevOps techniques and practices like Continuous Integration, Continuous Deployment, Test Automation, Build Automation, and Test-Driven Development to enable the rapid delivery of working code, utilizing tools like Git (a minimal test sketch appears at the end of this listing)
Primary Skills:
- 10+ years working as a hands-on developer in Information Technology across Database, ETL and BI (SAP Business Objects, integration with SAP Financial and Accounting modules, Tableau, Power BI) & prior team management experience
- Tableau/Power BI integration with SAP and knowledge of SAP modules related to finance is a must
- 3+ years of hands-on development experience in Data Warehousing and Data Processing
- 3+ years of Database development experience with a solid understanding of core database concepts and relational database design, SQL, Performance tuning
- 3+ years of hands-on development experience with Tableau
- 3+ years of Power BI experience, including parameterized reports and publishing them to the Power BI Service
- Excellent understanding and practical experience delivering under an Agile methodology
- Ability to work with business users to provide technical support
- Ability to get involved in all stages of the project lifecycle, including analysis, design, development, and testing
Good-to-Have Skills:
- Experience with other visualization and reporting tools like SAP Business Objects.
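For the Test-Driven Development practice named in the responsibilities above, a minimal pytest-style sketch; the transformation function and field names are hypothetical.

```python
# A transformation and its test, written side by side (run with pytest).
def normalize_amount(record: dict) -> dict:
    """Convert a string amount like '1,234.50' into a float."""
    record["amount"] = float(record["amount"].replace(",", ""))
    return record

def test_normalize_amount():
    assert normalize_amount({"amount": "1,234.50"})["amount"] == 1234.50
```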
- 5+ years of industry experience administering (including setting up, managing, and monitoring) data processing pipelines, both streaming and batch, using frameworks such as Kafka Streams and PySpark, and streaming databases like Druid or equivalents like Hive
- Strong industry expertise with containerization technologies, including Kubernetes (EKS/AKS) and Kubeflow
- Experience with cloud platform services such as AWS, Azure, or GCP, especially EKS and Managed Kafka
- 5+ years of industry experience in Python
- Experience with popular modern web frameworks such as Spring Boot, Play Framework, or Django
- Experience with scripting languages; Python experience is highly desirable. Experience in API development using Swagger
- Implementing automated testing platforms and unit tests
- Proficient understanding of code versioning tools, such as Git
- Familiarity with continuous integration tools such as Jenkins
- Architect, design, and implement large-scale data processing pipelines using Kafka Streams, PySpark, Fluentd, and Druid (a streaming sketch follows this list)
- Create custom Operators for Kubernetes, Kubeflow
- Develop data ingestion processes and ETLs
- Assist in DevOps operations
- Design and Implement APIs
- Identify performance bottlenecks and bugs, and devise solutions to these problems
- Help maintain code quality, organization, and documentation
- Communicate with stakeholders regarding various aspects of the solution.
- Mentor team members on best practices
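A minimal sketch of the kind of Kafka-to-PySpark ingest named in the responsibilities above, using Spark Structured Streaming; it assumes the spark-sql-kafka connector package is available, and the broker address, topic, and sink paths are placeholders.

```python
# Kafka -> Spark Structured Streaming -> Parquet ingest sketch.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
)

# Kafka delivers bytes; cast the message value to a string payload.
parsed = raw.select(F.col("value").cast("string").alias("payload"))

query = (
    parsed.writeStream.format("parquet")
    .option("path", "s3://bucket/raw-events/")                    # placeholder sink
    .option("checkpointLocation", "s3://bucket/checkpoints/raw/")  # placeholder
    .start()
)
query.awaitTermination()
```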
About the Role:
Freight Tiger is growing exponentially, and technology is at the centre of it. Our Engineers love solving complex industry problems by building modular and scalable solutions using cutting-edge technology. Your peers will be an exceptional group of Software Engineers, Quality Assurance Engineers, DevOps Engineers, and Infrastructure and Solution Architects.
This role is responsible for developing data pipelines and data engineering components to support strategic initiatives and ongoing business processes. This role works with leads, analysts, and data scientists to understand requirements, develop technical solutions, and ensure the reliability and performance of the data engineering solutions.
This role provides an opportunity to directly impact business outcomes for sales, underwriting, claims and operations functions across multiple use cases by providing them data for their analytical modelling needs.
- Create and maintain a data pipeline.
- Build and deploy ETL infrastructure for optimal data delivery.
- Work with various product, design and executive teams to troubleshoot data-related issues.
- Create tools for data analysts and scientists to help them build and optimise the product.
- Implement systems and processes for data access controls and guarantees.
- Distil knowledge from experts in the field outside the org and use it to optimise internal data systems.
- Should have 5+ years of relevant experience.
- Strong analytical skills.
- Degree in Computer Science, Statistics, Informatics, or Information Systems.
- Strong project management and organisational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- SQL guru with hands-on experience on various databases.
- NoSQL databases like Cassandra and MongoDB.
- Experience with Snowflake and Redshift.
- Experience with tools like Airflow and Hevo (a minimal DAG sketch follows this list).
- Experience with Hadoop, Spark, Kafka, and Flink.
- Programming experience in Python, Java, and Scala.
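For the Airflow experience mentioned above, a minimal DAG sketch (assuming Airflow 2.x); the DAG id, schedule, and task bodies are illustrative stubs.

```python
# Two-step extract -> load DAG, scheduled daily.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from the source")  # stub

def load():
    print("write refined data to the warehouse")  # stub

with DAG(
    dag_id="daily_ingest",            # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                # Airflow 2.4+ spelling of schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task         # extract runs before load
```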
- The key responsibility is to design, develop, and maintain efficient data models for the organization, ensuring optimal query performance for the consumption layer.
- Develop, deploy, and maintain a repository of UDXs written in Java/Python.
- Develop optimal data model designs, analyze complex distributed data deployments, and make recommendations to optimize performance based on data consumption patterns, performance expectations, the queries executed on the tables/databases, etc.
- Perform periodic database health checks and maintenance
- Design collections in a NoSQL database for efficient performance
- Document & maintain a data dictionary from various sources to enable data governance
- Coordinate with business teams, IT, and other stakeholders to provide best-in-class data pipeline solutions, exposing data via APIs, loading into downstream systems, NoSQL databases, etc.
- Implement data governance processes and ensure data security
- Extensive working experience in Designing & Implementing Data models in OLAP Data Warehousing solutions (Redshift, Synapse, Snowflake, Teradata, Vertica, etc).
- Programming experience using Python / Java.
- Working knowledge in developing & deploying User-defined Functions (UDXs) using Java / Python.
- Strong understanding & extensive working experience in OLAP Data Warehousing (Redshift, Synapse, Snowflake, Teradata, Vertica, etc) architecture and cloud-native Data Lake (S3, ADLS, BigQuery, etc) Architecture.
- Strong knowledge in Design, Development & Performance tuning of 3NF/Flat/Hybrid Data Model.
- Extensive technical experience in SQL including code optimization techniques.
- Strong knowledge of database performance tuning and troubleshooting.
- Knowledge of collection design in any NoSQL DB (DynamoDB, MongoDB, CosmosDB, etc.), along with implementation of best practices (a DynamoDB sketch follows this listing).
- Ability to understand business functionality, processes, and flows.
- Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently.
- Any OLAP DWH DBA experience and user management will be an added advantage.
- Knowledge of financial-industry data models such as FSLDM, the IBM Financial Data Model, etc. will be an added advantage.
- Experience with Snowflake will be an added advantage.
- Working experience in BFSI/NBFC and a data understanding of loan/mortgage data will be an added advantage.
- Data Governance & Quality Assurance
- Modern OLAP Database Architecture & Design
- Data structures, algorithm & data modeling techniques
- NoSQL database architecture
- Data Security
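To illustrate the NoSQL collection-design skill called out above, a small boto3/DynamoDB sketch that keys a table for one assumed access pattern (all loans for a customer, ordered by origination date); the table and attribute names are hypothetical.

```python
# Collection design in DynamoDB: pick keys to match the main query pattern.
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

dynamodb.create_table(
    TableName="loans",  # hypothetical
    AttributeDefinitions=[
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "originated_at", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "customer_id", "KeyType": "HASH"},     # partition key
        {"AttributeName": "originated_at", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",
)
```

With this key schema, "all loans for a customer, newest first" is a single Query with ScanIndexForward=False rather than a table scan.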
Experience: 2 to 5 years
- Sound understanding of Google Cloud Platform
- Should have worked on BigQuery, Workflows, or Composer (a BigQuery sketch follows this list)
- Experience of migrating to GCP and integration projects on large-scale environments
- ETL technical design, development and support
- Good SQL skills and Unix scripting
- Programming experience with Python, Java, or Spark would be desirable, but not essential
- Good communication skills.
- Experience with SOA and services-based data solutions would be advantageous
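As a flavor of the BigQuery work mentioned above, a minimal sketch with the google-cloud-bigquery client; the project, dataset, and table names are placeholders.

```python
# Run an aggregate query against BigQuery and print the rows.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

query = """
    SELECT event_type, COUNT(*) AS n
    FROM `my-project.analytics.events`  -- placeholder table
    GROUP BY event_type
"""
for row in client.query(query).result():
    print(row.event_type, row.n)
```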
- Experience with relational SQL & NoSQL databases including MySQL & MongoDB.
- Familiar with the basic principles of distributed computing and data modeling.
- Experience with distributed data pipeline frameworks like Celery, Apache Airflow, etc.
- Experience with NLP and NER models is a bonus.
- Experience building reusable code and libraries for future use.
- Experience building REST APIs (a minimal sketch follows).
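For the REST API experience in the list above, a minimal sketch using FastAPI (one common Python choice; the posting does not prescribe a framework). The resource and fields are hypothetical.

```python
# Minimal read-only REST endpoint.
from fastapi import FastAPI

app = FastAPI()

@app.get("/datasets/{dataset_id}")
def get_dataset(dataset_id: str) -> dict:
    """Return metadata for one dataset (stubbed response)."""
    return {"id": dataset_id, "status": "ready"}
```

Saved as main.py, this runs under any ASGI server, e.g. `uvicorn main:app`.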
Preference for candidates working in tech product companies
- Previous experience working in large-scale data engineering
- 4+ years of experience working in data engineering and/or backend technologies with cloud experience (any) is mandatory.
- Previous experience architecting and designing backends for large-scale data processing.
- Familiarity and experience with different technologies related to data engineering – various database technologies, Hadoop, Spark, Storm, Hive, etc.
- Hands-on, with the ability to contribute to key portions of the data engineering backend.
- Self-inspired and motivated to drive for exceptional results.
- Familiarity and experience working with the different stages of data engineering – data acquisition, data refining, large-scale data processing, and efficient data storage for business analysis (see the skeleton at the end of this list).
- Familiarity and experience working with different DB technologies and how to scale them.
- End-to-end responsibility for data engineering architecture, design, development, and implementation.
- Build data engineering workflows for large-scale data processing.
- Discover opportunities in data acquisition.
- Bring industry best practices for data engineering workflow.
- Develop data set processes for data modelling, mining and production.
- Take additional tech responsibilities for driving an initiative to completion
- Recommend ways to improve data reliability, efficiency and quality
- Goes out of their way to reduce complexity.
- Humble and outgoing - engineering cheerleaders.
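To make the acquisition, refining, and storage stages named above concrete, a plain-Python skeleton; every source, rule, and sink here is a stand-in.

```python
# Skeleton of an acquisition -> refining -> storage workflow.
def acquire() -> list[dict]:
    """Pull raw records from a source system (stubbed data)."""
    return [{"id": 1, "value": " 42 "}, {"id": 2, "value": None}]

def refine(raw: list[dict]) -> list[dict]:
    """Drop incomplete records and normalise fields."""
    return [
        {"id": r["id"], "value": int(r["value"].strip())}
        for r in raw
        if r.get("value") is not None
    ]

def store(rows: list[dict]) -> None:
    """Persist refined rows for analysis (stubbed to stdout)."""
    for row in rows:
        print(row)

store(refine(acquire()))
```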