Lead Data Engineer
Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.
Job responsibilities
· You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems
· You will partner with teammates to create complex data processing pipelines in order to solve our clients' most ambitious challenges
· You will collaborate with Data Scientists in order to design scalable implementations of their models
· You will pair to write clean and iterative code based on TDD
· Leverage various continuous delivery practices to deploy, support and operate data pipelines
· Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available
· Develop and operate modern data architecture approaches to meet key business objectives and provide end-to-end data solutions
· Create data models and speak to the tradeoffs of different modeling approaches
· On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product
· Seamlessly incorporate data quality into your day-to-day work as well as into the delivery process
· Assure effective collaboration between Thoughtworks' and the client's teams, encouraging open communication and advocating for shared outcomes
Job qualifications Technical skills
· You are equally happy coding and leading a team to implement a solution
· You have a track record of innovation and expertise in Data Engineering
· You're passionate about craftsmanship and have applied your expertise across a range of industries and organizations
· You have a deep understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop
· You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, NoSQL databases (Hbase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
· Hands on experience in MapR, Cloudera, Hortonworks and/or cloud (AWS EMR, Azure HDInsights, Qubole etc.) based Hadoop distributions
· You are comfortable taking data-driven approaches and applying data security strategy to solve business problems
· You're genuinely excited about data infrastructure and operations with a familiarity working in cloud environments
· Working with data excites you: you have created Big data architecture, you can build and operate data pipelines, and maintain data storage, all within distributed systems
Professional skills
· Advocate your data engineering expertise to the broader tech community outside of Thoughtworks, speaking at conferences and acting as a mentor for more junior-level data engineers
· You're resilient and flexible in ambiguous situations and enjoy solving problems from technical and business perspectives
· An interest in coaching others, sharing your experience and knowledge with teammates
· You enjoy influencing others and always advocate for technical excellence while being open to change when needed
About Thoughtworks
Founded in 1993, we’ve grown from a small team in Chicago to a leading software consultancy of more than 8000 Thoughtworkers in 17 countries. Our cross-functional teams of strategists, developers, data engineers, and designers bring over two decades of global experience to every partnership.
Thoughtworks invented the concept of distributed agile and we know how to harness the power of global teams to deliver software excellence at scale. Today we help our clients to create their own path to digital fluency and to build organizational resilience to navigate the future.
Our job is to foster a vibrant community where people have the freedom to make an extraordinary impact on the world through technology.
As a Thoughtworker, you are free to seek out the most ambitious challenges. Free to change career paths. Free to use technology as a tool for social change. Free to be yourself.
Similar jobs
Data Scientist-
We are looking for an experienced Data Scientists to join our engineering team and
help us enhance our mobile application with data. In this role, we're looking for
people who are passionate about developing ML/AI in various domains that solves
enterprise problems. We are keen on hiring someone who loves working in fast paced start-up environment and looking to solve some challenging engineering
problems.
As one of the earliest members in engineering, you will have the flexibility to design
the models and architecture from ground up. As any early-stage start-up, we expect
you to be comfortable wearing various hats, and be proactive contributor in building
something truly remarkable.
Responsibilities
Researches, develops and maintains machine learning and statistical models for
business requirements
Work across the spectrum of statistical modelling including supervised,
unsupervised, & deep learning techniques to apply the right level of solution to
the right problem Coordinate with different functional teams to monitor outcomes and refine/
improve the machine learning models Implements models to uncover patterns and predictions creating business value and innovation
Identify unexplored data opportunities for the business to unlock and maximize
the potential of digital data within the organization
Develop NLP concepts and algorithms to classify and summarize structured/unstructured text data
Qualifications
3+ years of experience solving complex business problems using machine
learning.
Fluency in programming languages such as Python, NLP and Bert, is a must
Strong analytical and critical thinking skills
Experience in building production quality models using state-of-the-art technologies
Familiarity with databases like MySQL, Oracle, SQL Server, NoSQL, etc. is
desirable Ability to collaborate on projects and work independently when required.
Previous experience in Fintech/payments domain is a bonus
You should have Bachelor’s or Master’s degree in Computer Science, Statistics
or Mathematics or another quantitative field from a top tier Institute
Required Skill Set-
Project experience in any of the following - Data Management,
Database Development, Data Migration or Data Warehousing.
• Expertise in SQL, PL/SQL.
Role and Responsibilities -
• Work on a complex data management program for multi-billion dollar
customer
• Work on customer projects related to data migration, data
integration
•No Troubleshooting
• Execution of data pipelines, perform QA, project documentation for
project deliverables
• Perform data profiling, data cleansing, data analysis for migration
data
• Participate and contribute in project meeting
• Experience in data manipulation using Python preferred
• Proficient in using Excel, PowerPoint
-Perform other tasks as per project requirements.
- We are looking for : Data engineer
- Sprak
- Scala
- Hadoop
N.p - 15 days to 30 Days
Location : Bangalore / Noida
Experience in developing lambda functions with AWS Lambda
Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
Should be able to code in Python and Scala.
Snowflake experience will be a plus
Deep-Rooted.Co is on a mission to get Fresh, Clean, Community (Local farmer) produce from harvest to reach your home with a promise of quality first! Our values are rooted in trust, convenience, and dependability, with a bunch of learning & fun thrown in.
Founded out of Bangalore by Arvind, Avinash, Guru and Santosh, with the support of our Investors Accel, Omnivore & Mayfield, we raised $7.5 million in Seed, Series A and Debt funding till date from investors include ACCEL, Omnivore, Mayfield among others. Our brand Deep-Rooted.Co which was launched in August 2020 was the first of its kind as India’s Fruits & Vegetables (F&V) which is present in Bangalore & Hyderabad and on a journey of expansion to newer cities which will be managed seamlessly through Tech platform that has been designed and built to transform the Agri-Tech sector.
Deep-Rooted.Co is committed to building a diverse and inclusive workplace and is an equal-opportunity employer.
How is this possible? It’s because we work with smart people. We are looking for Engineers in Bangalore to work with thehttps://www.linkedin.com/in/gururajsrao/"> Product Leader (Founder) andhttps://www.linkedin.com/in/sriki77/"> CTO and this is a meaningful project for us and we are sure you will love the project as it touches everyday life and is fun. This will be a virtual consultation.
We want to start the conversation about the project we have for you, but before that, we want to connect with you to know what’s on your mind. Do drop a note sharing your mobile number and letting us know when we can catch up.
Purpose of the role:
* As a startup we have data distributed all across various sources like Excel, Google Sheets, Databases etc. We need swift decision making based a on a lot of data that exists as we grow. You help us bring together all this data and put it in a data model that can be used in business decision making. * Handle nuances of Excel and Google Sheets API. * Pull data in and manage it growth, freshness and correctness. * Transform data in a format that aids easy decision-making for Product, Marketing and Business Heads. * Understand the business problem, solve the same using the technology and take it to production - no hand offs - full path to production is yours.
Technical expertise:
* Good Knowledge And Experience with Programming languages - Java, SQL,Python. * Good Knowledge of Data Warehousing, Data Architecture. * Experience with Data Transformations and ETL; * Experience with API tools and more closed systems like Excel, Google Sheets etc. * Experience AWS Cloud Platform and Lambda * Experience with distributed data processing tools. * Experiences with container-based deployments on cloud.
Skills:
Java, SQL, Python, Data Build Tool, Lambda, HTTP, Rest API, Extract Transform Load.
CommerceIQ is Hiring Data Scientist (3-5 yrs)
At CommerceIQ, we are building the world’s most sophisticated E-commerce Channel Optimization software to help brands leverage Machine Learning, Analytics and Automation to grow their E-commerce business on all channels, globally.
Using CommerceIQ as a single source of truth, customers have driven 40% increase in incremental sales, 20% improvement in profitability and 32% reduction in out of stock rates on Amazon.
What You’ll Be Doing
As a Senior Data Scientist, you will work closely with Engineering/Product/Operations teams to build state-of-the-art ML based solutions for B2B SaaS products. This entails not only leveraging advanced techniques for predictions, time-series forecasting, topic modelling, optimisation but deep understanding of business and product too.
- Apply excellent problem solving skills to deconstruct and formulate solutions from first-principles
- Work on data science roadmap and build the core engine of our flagship CommerceIQ product
- Collaborate with product and engineering to design product strategy, identify key metrics to drive and support with proof of concept
- Perform rapid prototyping of experimental solutions and develop robust, sustainable and scalable production systems
- Work with large scale ecommerce data of the biggest brands on amazon
- Apply out-of-the-box, advanced algorithms to complex problems in real-time systems
- Drive productization of techniques to be made available to a wide range of customers
- You would be working with and mentoring fellow team members on the owned charter
What we are looking for -
- Bachelor’s or Masters in Computer Science or Maths/Stats from a reputed college with 4+ years of experience in solving data science problems that have driven value to customers
- Good depth and breadth in machine learning (theory and practice), optimization methods, data mining, statistics and linear algebra. Experience in NLP would be an advantage
- Hands-on programming skills and ability to write modular and scalable code in Python/R. Knowledge of SQL is required
- Familiarity with distributed computing architecture like Spark, Map-Reduce paradigm and Hadoop will be an added advantage
- Strong spoken and written communication skills, able to explain complex ideas in a simple, intuitive manner, write/maintain good technical documentation on projects
- Experience with building ML data products in an engineering organization interfacing with other teams and departments to deliver impact
- We are looking for candidates who are curious and self-starters; obsess over customer problems to deliver maximum value to them.
- Data scientist, Machine Learning, data science, data analyst
Job Type: Full-time
Experience:
- Data Scientist: 3 years (Required)
Application Question:
- Looking for product based industry experience from tier 1 /tier 2 colleges (NIT ,BIT, IIT,IIIT, BITS, Strong Profiles)
Role and Responsibilities
- Build a low latency serving layer that powers DataWeave's Dashboards, Reports, and Analytics functionality
- Build robust RESTful APIs that serve data and insights to DataWeave and other products
- Design user interaction workflows on our products and integrating them with data APIs
- Help stabilize and scale our existing systems. Help design the next generation systems.
- Scale our back end data and analytics pipeline to handle increasingly large amounts of data.
- Work closely with the Head of Products and UX designers to understand the product vision and design philosophy
- Lead/be a part of all major tech decisions. Bring in best practices. Mentor younger team members and interns.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Be a tech thought leader. Add passion and vibrance to the team. Push the envelope.
Skills and Requirements
- 8- 15 years of experience building and scaling APIs and web applications.
- Experience building and managing large scale data/analytics systems.
- Have a strong grasp of CS fundamentals and excellent problem solving abilities. Have a good understanding of software design principles and architectural best practices.
- Be passionate about writing code and have experience coding in multiple languages, including at least one scripting language, preferably Python.
- Be able to argue convincingly why feature X of language Y rocks/sucks, or why a certain design decision is right/wrong, and so on.
- Be a self-starter—someone who thrives in fast paced environments with minimal ‘management’.
- Have experience working with multiple storage and indexing technologies such as MySQL, Redis, MongoDB, Cassandra, Elastic.
- Good knowledge (including internals) of messaging systems such as Kafka and RabbitMQ.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Exposure to one or more centralized logging, monitoring, and instrumentation tools, such as Kibana, Graylog, StatsD, Datadog etc.
- Working knowledge of building websites and apps. Good understanding of integration complexities and dependencies.
- Working knowledge linux server administration as well as the AWS ecosystem is desirable.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of your projects you have hosted on GitHub.