Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.
You’ll spend time on the following:
- You will partner with teammates to create complex data processing pipelines in order to solve our clients’ most ambitious challenges
- You will collaborate with Data Scientists in order to design scalable implementations of their models
- You will pair to write clean and iterative code based on TDD
- Leverage various continuous delivery practices to deploy data pipelines
- Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available
- Develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions
- Create data models and speak to the tradeoffs of different modeling approaches
Here’s what we’re looking for:
- You have a good understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop
- You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, NoSQL databases (Hbase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
- Hands-on experience with MapR, Cloudera, Hortonworks, and/or cloud-based (AWS EMR, Azure HDInsight, Qubole, etc.) Hadoop distributions
- You are comfortable taking data-driven approaches and applying data security strategy to solve business problems
- Working with data excites you: you can build and operate data pipelines, and maintain data storage, all within distributed systems
- Strong communication and client-facing skills with the ability to work in a consulting environment
About Thoughtworks
Founded in 1993, we’ve grown from a small team in Chicago to a leading software consultancy of more than 8000 Thoughtworkers in 17 countries. Our cross-functional teams of strategists, developers, data engineers, and designers bring over two decades of global experience to every partnership.
Thoughtworks invented the concept of distributed agile and we know how to harness the power of global teams to deliver software excellence at scale. Today we help our clients to create their own path to digital fluency and to build organizational resilience to navigate the future.
Our job is to foster a vibrant community where people have the freedom to make an extraordinary impact on the world through technology.
As a Thoughtworker, you are free to seek out the most ambitious challenges. Free to change career paths. Free to use technology as a tool for social change. Free to be yourself.
Similar jobs
We are seeking a dedicated Machine Learning Engineer to join our growing company.
You will collaborate with software engineers and product managers to create efficient artificial intelligence algorithms. As an ML Engineer, we hope you can put your passion for AI engineering towards solving amazing problems through AI.
Roles and Responsibility
- Develop Machine Learning (ML) models using various neural network architectures and implement the model using Python.
- Understand the problem by interacting with domain experts and design/implement various training algorithms and feature detectors.
- Train models using various datasets and optimize the inference architecture for performance.
- Continuously work to improve the recall, accuracy, and precision metrics for ML models.
- Design and implement event-driven pipelines using Kafka, Python, Keras, PyTorch, and TensorFlow.
- Perform data clean-up and guide the labelling team to create labelled datasets.
- Work with different engineers to implement inference graphs, infographics and automated report/alert generation.
- Debug, build, test, and release complete software products under a SaaS model.
Bonus points for -
- Experience developing and consuming REST APIs.
- Knowledge of developing Dockerized, microservices-based architectures to ensure scalability.
Job Qualifications and Skill Sets
- 1-2 years of relevant experience.
- Proven experience as a software developer with knowledge about software development lifecycle (SDLC), from design to implementation.
- Knowledge of scripting languages (e.g. Python)
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and software stacks (e.g., TensorRT, TVM)
- Experience with model optimization techniques like pruning, quantization, NAS, etc.
- Experience with ML accelerators and hardware architecture, e.g., GPUs, TPUs, NNAs, MLAs
- Experience with modern parallel programming: GPU programming (CUDA, OpenCL), SIMD (AVX, NEON/SVE), multi-process and multi-threaded designs.
- Familiarity with HW vendors' deep learning stacks (e.g., cuDNN, cuBLAS, AMD MIOpen, TensorRT, OpenVino, ARM Compute Library, etc)
- Experience with version control systems such as Git and offerings such as GitHub, BitBucket etc.
Bonus points for -
- Familiarity with databases (e.g. MySQL, MongoDB, Cassandra), web servers (e.g. Apache, NGINX), UI/UX design.
- Exposure to edge/mobile-based ML is a plus.
● Proficient in Python and using packages like NLTK, Numpy, Pandas
● Should have worked with deep learning frameworks (such as TensorFlow, Keras, PyTorch, etc.)
● Hands-on experience in Natural Language Processing, Sequence, and RNN Based models
● Mathematical intuition of ML and DL algorithms
● Should be able to perform thorough model evaluation by creating hypotheses on the basis of statistical analyses
● Should be comfortable in going through open-source code and reading research papers.
● Should be curious or thoughtful enough to answer the “WHYs” pertaining to the most cherished observations, thumb rules, and ideas across the data science community.
About the Role:
As a Speech Engineer you will be working on development of on-device multilingual speech recognition systems.
- Apart from ASR you will be working on solving speech focused research problems like speech enhancement, voice analysis and synthesis etc.
- You will be responsible for building the complete speech recognition pipeline, from data preparation to deployment on edge devices.
- Reading, implementing and improving baselines reported in leading research papers will be another key area of your daily life at Saarthi.
Requirements:
- 2-3 years of hands-on experience in speech recognition-based projects
- Proven experience as a Speech engineer or similar role
- Should have experience of deployment on edge devices
- Hands-on experience with open-source tools such as Kaldi and PyTorch-Kaldi, and with any of the end-to-end ASR tools such as ESPnet, EESEN, or DeepSpeech PyTorch
- Proven experience in training and deploying deep learning models at scale
- Strong programming experience in Python, C/C++, etc.
- Working experience with PyTorch and TensorFlow
- Experience contributing to research communities including publications at conferences and/or journals
- Strong communication skills
- Strong analytical and problem-solving skills
Job Description: Data Scientist
At Propellor.ai, we derive insights that allow our clients to make scientific decisions. We believe in demanding more from the fields of Mathematics, Computer Science, and Business Logic. Combine these and we show our clients a 360-degree view of their business. In this role, the Data Scientist will be expected to work on Procurement problems along with a team based across the globe.
We are a Remote-First Company.
Read more about us here: https://www.propellor.ai/consulting
What will help you be successful in this role
- Articulate
- High Energy
- Passion to learn
- High sense of ownership
- Ability to work in a fast-paced and deadline-driven environment
- Loves technology
- Highly skilled at Data Interpretation
- Problem solver
- Ability to narrate the story to the business stakeholders
- Generate insights and the ability to turn them into actions and decisions
Skills to work in a challenging, complex project environment
- Need you to be naturally curious and have a passion for understanding consumer behavior
- A high level of motivation, passion, and high sense of ownership
- Excellent communication skills needed to manage an incredibly diverse slate of work, clients, and team personalities
- Flexibility to work on multiple projects in a deadline-driven, fast-paced environment
- Ability to work in ambiguity and manage the chaos
Key Responsibilities
- Analyze data to unlock insights: Ability to identify relevant insights and actions from data. Use regression, cluster analysis, time series, etc. to explore relationships and trends in response to stakeholder questions and business challenges.
- Bring in experience for AI and ML: Bring in Industry experience and apply the same to build efficient and optimal Machine Learning solutions.
- Exploratory Data Analysis (EDA) and Generate Insights: Analyse internal and external datasets using analytical techniques, tools, and visualization methods. Ensure pre-processing/cleansing of data and evaluate data points across the enterprise landscape and/or external data points that can be leveraged in machine learning models to generate insights.
- DS and ML Model Identification and Training: Identify, test, and train machine learning models that need to be leveraged for business use cases. Evaluate models based on interpretability, performance, and accuracy as required. Experiment and identify features from datasets that will help influence model outputs. Determine which models need to be deployed and which data points need to be fed into them, and aid in the deployment and maintenance of models.
Technical Skills
We are looking for an enthusiastic individual with the following skills. Please do not hesitate to apply if you do not match all of them. We are open to promising candidates who are passionate about their work, fast learners, and team players.
- Strong experience with machine learning and AI including regression, forecasting, time series, cluster analysis, classification, Image recognition, NLP, Text Analytics and Computer Vision.
- Strong experience with advanced analytics tools for Object-oriented/object function scripting using languages such as Python, or similar.
- Strong experience with popular database programming languages including SQL.
- Strong experience in Spark/Pyspark
- Experience in working in Databricks
What company benefits do you get when you join us?
- Permanent Work from Home Opportunity
- Opportunity to work with Business Decision Makers and an internationally based team
- The work environment that offers limitless learning
- A culture free of bureaucracy and hierarchy
- A culture of being open, direct, and with mutual respect
- A fun, high-caliber team that trusts you and provides the support and mentorship to help you grow
- The opportunity to work on high-impact business problems that are already defining the future of Marketing and improving real lives
To know more about how we work: https://bit.ly/3Oy6WlE
Whom will you work with?
You will closely work with other Senior Data Scientists and Data Engineers.
Immediate to 15-day Joiners will be preferred.
As a Senior Engineer - Big Data Analytics, you will help with the architectural design and development of Healthcare Platforms, Products, Services, and Tools to deliver the vision of the Company. You will significantly contribute to engineering, technology, and platform architecture. This will be done through innovation and collaboration with engineering teams and related business functions. This is a critical, highly visible role within the company that has the potential to drive significant business impact.
The scope of this role will include strong technical contribution in the development and delivery of Big Data Analytics Cloud Platform, Products and Services in collaboration with execution and strategic partners.
Responsibilities:
- Design, develop, operate, and drive a scalable, resilient, cloud-native Big Data Analytics platform that addresses the business requirements
- Help drive technology transformation to achieve business transformation, through the creation of the Healthcare Analytics Data Cloud that will help Change establish a leadership position in healthcare data & analytics in the industry
- Help in successful implementation of Analytics as a Service
- Ensure Platforms and Services meet SLA requirements
- Be a significant contributor and partner in the development and execution of the Enterprise Technology Strategy
Qualifications:
- At least 5 years of experience in software development, including at least 2 years in software development for big data analytics and cloud
- Experience working with High Performance Distributed Computing Systems in public and private cloud environments
- Understanding of the big data open-source ecosystem and its players. Contribution to open source is a strong plus
- Experience with Spark, Spark Streaming, Hadoop, AWS/Azure, NoSQL Databases, In-Memory caches, distributed computing, Kafka, OLAP stores, etc.
- A successful track record of creating a working Big Data stack that aligned with business needs and of delivering enterprise-class products on time
- Experience with delivering and managing scale of Operating Environment
- Experience with Big Data/Micro Service based Systems, SaaS, PaaS, and Architectures
- Experience Developing Systems in Java, Python, Unix
- BSCS, BSEE or equivalent, MSCS preferred
- Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs/IoT Hub.
- Experience in migrating on-premises data warehouses to data platforms on the Azure cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Azure Synapse or Azure SQL Data Warehouse
- Spark on Azure, as available in HDInsight and Databricks
- Experience with Azure Analysis Services
- Experience in Power BI
- Experience with third-party solutions like Attunity, StreamSets, and Informatica
- Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
- Capacity Planning and Performance Tuning on Azure Stack and Spark.
PriceLabs ( chicagobusiness.com/innovators/what-if-you-could-adjust-prices-meet-demand ) is cloud-based software for vacation and short-term rentals that helps them dynamically manage prices just the way large hotels and airlines do! Our mission is to help small businesses in the travel and tourism industry by giving them access to advanced analytical systems that are often restricted to large companies.
We're looking for someone with strong analytical capabilities who wants to understand how our current architecture and algorithms work, and help us design and develop long lasting solutions to address those. Depending on the needs of the day, the role will come with a good mix of team-work, following our best practices, introducing us to industry best practices, independent thinking, and ownership of your work.
Responsibilities:
- Design, develop and enhance our pricing algorithms to enable new capabilities.
- Process, analyze, model, and visualize findings from our market level supply and demand data.
- Build and enhance internal and customer facing dashboards to better track metrics and trends that help customers use PriceLabs in a better way.
- Take ownership of product ideas and design discussions.
- Occasional travel to conferences to interact with prospective users and partners, and learn where the industry is headed.
Requirements:
- Bachelor's, Master's, or Ph.D. in Operations Research, Industrial Engineering, Statistics, Computer Science, or other quantitative/engineering fields.
- Strong understanding of analysis of algorithms, data structures and statistics.
- Solid programming experience, including the ability to quickly prototype an idea and test it out.
- Strong communication skills, including the ability and willingness to explain complicated algorithms and concepts in simple terms.
- Experience with relational databases and strong knowledge of SQL.
- Experience building data heavy analytical models in the travel industry.
- Experience in the vacation rental industry.
- Experience developing dynamic pricing models.
- Prior experience working in a fast-paced environment.
- Willingness to wear many hats.
Nactus is at the forefront of reinventing education, helping educators and the learner community at large through innovative solutions in the digital era. We are looking for an experienced AI specialist to join our revolution using deep learning and artificial intelligence. This is an excellent opportunity to take advantage of emerging trends and technologies to make a real-world difference.
Role and Responsibilities
- Manage and direct research and development (R&D) and processes to meet the needs of our AI strategy.
- Understand company and client challenges and how integrating AI capabilities can help create educational solutions.
- Analyse and explain AI and machine learning (ML) solutions while setting and maintaining high ethical standards.
Skills Required
- Knowledge of algorithms, object-oriented and functional design principles
- Demonstrated artificial intelligence, machine learning, mathematical and statistical modelling knowledge and skills.
- Well-developed programming skills – specifically in SAS or SQL and other packages with statistical and machine learning application, e.g. R, Python
- Experience with machine learning fundamentals, parallel computing and distributed systems fundamentals, or data structure fundamentals
- Experience with C, C++, or Python programming
- Experience with debugging and building AI applications.
- Analyse robustness and productivity, and draw conclusions.
- Develop a human-machine speech interface.
- Verify, evaluate, and demonstrate implemented work.
- Proven experience with ML, deep learning, TensorFlow, and Python