Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.
You’ll spend time on the following:
- Partner with teammates to create complex data processing pipelines that solve our clients' most ambitious challenges
- Collaborate with Data Scientists to design scalable implementations of their models
- Pair-program to write clean, iterative code driven by TDD (see the sketch after this list)
- Leverage various continuous delivery practices to deploy data pipelines
- Advise and educate clients on choosing among the plethora of distributed storage and computing technologies available
- Develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions
- Create data models and speak to the tradeoffs of different modeling approaches
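To give a flavor of the TDD pairing style mentioned above, here is a minimal, hypothetical red/green sketch in pytest; the `dedupe_events` helper and its behavior are illustrative assumptions, not part of the role description.

```python
# A hypothetical red/green TDD step (runnable with pytest). The test is
# written first; dedupe_events is implemented just enough to make it pass.

def dedupe_events(events):
    """Keep the latest value per event id (illustrative helper)."""
    latest = {}  # id -> (ts, value)
    for e in events:
        if e["id"] not in latest or e["ts"] > latest[e["id"]][0]:
            latest[e["id"]] = (e["ts"], e["value"])
    return {k: v for k, (_, v) in latest.items()}

def test_keeps_latest_event_per_id():
    events = [
        {"id": 1, "ts": 10, "value": "old"},
        {"id": 1, "ts": 20, "value": "new"},
        {"id": 2, "ts": 5, "value": "only"},
    ]
    assert dedupe_events(events) == {1: "new", 2: "only"}
```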
Here’s what we’re looking for:
- You have a good understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop
- You have built large-scale data pipelines and data-centric applications in a production setting, using distributed storage platforms such as HDFS, S3, or NoSQL databases (HBase, Cassandra, etc.) and distributed processing platforms such as Hadoop, Spark, Hive, Oozie, or Airflow (a minimal pipeline sketch follows this list)
- Hands-on experience with MapR, Cloudera, Hortonworks, and/or cloud-based Hadoop distributions (AWS EMR, Azure HDInsight, Qubole, etc.)
- You are comfortable taking data-driven approaches and applying data security strategy to solve business problems
- Working with data excites you: you can build and operate data pipelines, and maintain data storage, all within distributed systems
- Strong communication and client-facing skills with the ability to work in a consulting environment
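For illustration only, below is a minimal PySpark batch-pipeline sketch of the kind of work described above; the S3 paths, column names, and aggregation are assumptions, not a client solution.

```python
# A minimal, hypothetical PySpark batch pipeline: read raw JSON events,
# aggregate daily counts, and write partitioned Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-events").getOrCreate()

raw = spark.read.json("s3a://example-bucket/raw/events/")  # hypothetical source

daily = (
    raw.withColumn("day", F.to_date("event_ts"))
       .groupBy("day", "event_type")
       .agg(F.count("*").alias("events"),
            F.countDistinct("user_id").alias("users"))
)

# Partitioning by day keeps downstream date-bounded reads cheap.
daily.write.mode("overwrite").partitionBy("day").parquet(
    "s3a://example-bucket/curated/daily_events/"  # hypothetical sink
)
spark.stop()
```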
About Thoughtworks
Founded in 1993, we’ve grown from a small team in Chicago to a leading software consultancy of more than 8000 Thoughtworkers in 17 countries. Our cross-functional teams of strategists, developers, data engineers, and designers bring over two decades of global experience to every partnership.
Thoughtworks invented the concept of distributed agile and we know how to harness the power of global teams to deliver software excellence at scale. Today we help our clients to create their own path to digital fluency and to build organizational resilience to navigate the future.
Our job is to foster a vibrant community where people have the freedom to make an extraordinary impact on the world through technology.
As a Thoughtworker, you are free to seek out the most ambitious challenges. Free to change career paths. Free to use technology as a tool for social change. Free to be yourself.
Similar jobs
Who Are We?
Vahak (https://www.vahak.in) is India's largest and most trusted online transport marketplace and directory for road transport businesses and individual commercial vehicle owners (trucks, trailers, containers, Hyva, LCVs), covering online truck and load booking, transport business branding, and transport business network expansion. Lorry owners can find intercity and intracity loads from all over India and connect with other businesses to find trusted transporters and the best deals in the Indian logistics services market. With the Vahak app, users can book loads and lorries in a live transport marketplace of over 7 lakh transporters and lorry owners across 10,000+ locations for daily transport requirements.
Vahak has raised $5+ million in a Pre-Series A round from RTP Global, with participation from Luxor Capital and Leo Capital. Marquee angel investors include Kunal Shah, Founder and CEO, CRED; Jitendra Gupta, Founder and CEO, Jupiter; Vidit Aatrey and Sanjeev Barnwal, Co-founders, Meesho; Mohd Farid, Co-founder, Sharechat; Amrish Rau, CEO, Pine Labs; Harsimarbir Singh, Co-founder, Pristyn Care; Rohit and Kunal Bahl, Co-founders, Snapdeal; and Ravish Naresh, Co-founder and CEO, Khatabook.
Manager Data Science:
We at Vahak are looking for an enthusiastic and passionate Manager of Data Science to join our young and diverse team. You will play a key role in the data science group, working with different teams and identifying use cases that can be solved with data science techniques.
Our goal as a group is to drive powerful, big data analytics products with scalable results. We love people who are humble and collaborative, with a hunger for excellence.
Responsibilities:
- Mine and analyze end-to-end business data and generate actionable insights. Work will involve analyzing customer transaction data, marketing campaign performance, process bottlenecks, overall business performance, etc.
- Identify data-driven opportunities to drive optimization and improvement of product development, marketing techniques, and business strategies
- Collaborate with Product and Growth teams to test and learn at an unprecedented pace and help the team achieve substantial upside in key metrics
- Actively participate in the OKR process and help the team democratize the key KPIs and metrics that drive various objectives
- Be comfortable with digital marketing campaign concepts and with marketing campaign platforms such as Google Adwords and Facebook Ads
- Design algorithms that require different advanced analytics techniques and heuristics to work together
- Create dashboards and visualizations from scratch and present data in a logical manner to all stakeholders
- Collaborate with internal teams to create actionable items based on analysis; work with the datasets to conduct complex quantitative analysis and help drive innovation for our customers
Requirements:
- Bachelor's or Master's degree in Engineering, Science, Maths, Economics, or another quantitative field. An MBA is a plus but not required
- 5+ years of proven experience working in the data science field, preferably in e-commerce, web-based, or consumer technology companies
- Thorough understanding of implementation and analysis of product and marketing metrics at scale
- Strong problem-solving skills with an emphasis on product development
- Fluency in languages used for statistical computing, such as SQL, Python, and R, as well as a deep understanding of statistical analysis, experiment design, and common pitfalls of data analysis (a minimal sketch follows this list)
- Should have worked with a relational database such as Oracle or MySQL; experience with big data systems such as BigQuery or Redshift is a definite plus
- Experience using business intelligence tools, e.g. Tableau or Power BI, would be an added advantage (not mandatory)
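As a small illustration of the SQL-plus-Python fluency listed above, here is a hedged sketch computing monthly revenue per customer from a transactions table. It uses an in-memory SQLite database so it runs self-contained; the table and columns are made up.

```python
# Hypothetical SQL + pandas analysis: monthly revenue and revenue per
# customer from a transactions table (schema and data are assumptions).
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (customer_id INT, amount REAL, txn_date TEXT);
INSERT INTO transactions VALUES
  (1, 120.0, '2024-01-15'), (1, 80.0, '2024-02-03'),
  (2, 200.0, '2024-01-20'), (3, 50.0, '2024-02-11');
""")

df = pd.read_sql_query(
    """
    SELECT strftime('%Y-%m', txn_date) AS month,
           COUNT(DISTINCT customer_id) AS customers,
           SUM(amount) AS revenue
    FROM transactions
    GROUP BY month
    ORDER BY month
    """,
    conn,
)
df["revenue_per_customer"] = df["revenue"] / df["customers"]
print(df)
```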
Designation – Deputy Manager - TS
Job Description
- Total of 8-9 years of development experience in Data Engineering. B1/BII role
- Minimum of 4-5 years in AWS data integration, with very good data modelling skills
- Should be very proficient in end-to-end AWS data solution design, covering not only strong data ingestion and integration skills (both data at rest and data in motion) but also complete DevOps knowledge
- Should have experience delivering at least 4 data warehouse or data lake solutions on AWS
- Should have very strong experience with Glue, Lambda, Data Pipeline, Step Functions, RDS, CloudFormation, etc. (a minimal sketch follows this list)
- Strong Python skills
- Should be an expert in cloud design principles, performance tuning, and cost modelling. AWS certifications are an added advantage
- Should be a team player with excellent communication skills, able to manage work independently with minimal or no supervision
- Life Science & Healthcare domain background will be a plus
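By way of illustration, here is a hedged boto3 sketch of the kind of AWS orchestration this role touches: starting a Glue job and polling its state. The job name and region are assumptions; credentials come from the standard AWS configuration.

```python
# Hypothetical Glue orchestration with boto3: start a job run and poll
# until it reaches a terminal state.
import time
import boto3

glue = boto3.client("glue", region_name="ap-south-1")  # assumed region

run = glue.start_job_run(JobName="curate-sales-daily")  # hypothetical job name
run_id = run["JobRunId"]

while True:
    state = glue.get_job_run(JobName="curate-sales-daily", RunId=run_id)
    status = state["JobRun"]["JobRunState"]
    if status in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
        print("Glue run finished with state:", status)
        break
    time.sleep(30)  # poll every 30 seconds
```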
Qualifications
BE/BTech/ME/MTech
ETL-Database Developer/Lead
Job Description
The applicant must have a minimum of 5 years of hands-on IT experience, working across the full software lifecycle in Agile mode.
Good to have experience in data modeling and/or systems architecture.
Responsibilities will include technical analysis, design, development, and enhancements.
You will participate in all/most of the following activities:
- Working with business analysts and other project leads to understand requirements.
- Modeling and implementing database schemas in DB2 UDB or other relational databases.
- Designing, developing, and maintaining data processing using Python, DB2, Greenplum, Autosys, and other technologies
Skills/Expertise Required:
Work experience developing large-volume databases (DB2/Greenplum/Oracle/Sybase).
Good experience writing stored procedures, integrating database processing, and tuning and optimizing database queries.
Strong knowledge of table partitioning, high-performance loading, and data processing (a minimal sketch follows this list).
Good to have hands-on experience working with Perl or Python.
Hands on development using Spark / KDB / Greenplum platform will be a strong plus.
Designing, developing, maintaining and supporting Data Extract, Transform and Load (ETL) software using Informatica, Shell Scripts, DB2 UDB and Autosys.
Coming up with system architecture/re-design proposals for greater efficiency and ease of maintenance and developing software to turn proposals into implementations.
Strong collaboration and communication skills
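To illustrate the partitioning and query-tuning skills listed above, here is a hedged Python sketch using psycopg2 against PostgreSQL declarative range partitioning (Greenplum is Postgres-derived, though its partition DDL differs); the table, columns, and connection string are assumptions.

```python
# Hypothetical range-partitioned table plus a quick planner check that a
# date-bounded query prunes to one partition.
import psycopg2

conn = psycopg2.connect("dbname=analytics user=etl")  # hypothetical DSN
cur = conn.cursor()

cur.execute("""
CREATE TABLE IF NOT EXISTS trades (
    trade_id   bigint,
    trade_date date NOT NULL,
    notional   numeric
) PARTITION BY RANGE (trade_date);
""")
cur.execute("""
CREATE TABLE IF NOT EXISTS trades_2024_q1 PARTITION OF trades
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
""")
conn.commit()

# Inspect the plan: the planner should scan only trades_2024_q1.
cur.execute("EXPLAIN SELECT sum(notional) FROM trades "
            "WHERE trade_date BETWEEN '2024-01-01' AND '2024-01-31';")
for (line,) in cur.fetchall():
    print(line)

cur.close()
conn.close()
```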
We are establishing infrastructure for internal and external reporting using Tableau and are looking for someone with experience building visualizations and dashboards in Tableau and using Tableau Server to deliver them to internal and external users.
Required Experience
- Implementation of interactive visualizations using Tableau Desktop
- Integration with Tableau Server and support of production dashboards and embedded reports with it
- Writing and optimization of SQL queries
- Proficient in Python, including the pandas and NumPy libraries, for data exploration and analysis (a minimal sketch follows this list)
- 3 years of experience working as a Software Engineer / Senior Software Engineer
- Bachelor's in Engineering (Electronics and Communication, Computer Science, or IT)
- Well versed in basic data structures, algorithms, and system design
- Capable of working well in a team, with very good communication skills
- Self-motivated, organized, and fun to work with
- Productive and efficient working remotely
- Test driven mindset with a knack for finding issues and problems at earlier stages of development
- Interest in learning and picking up a wide range of cutting edge technologies
- Should be curious and interested in learning some Data science related concepts and domain knowledge
- Work alongside other engineers on the team to elevate technology and consistently apply best practices
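As a small illustration of the pandas/NumPy proficiency listed under required experience, here is a hypothetical exploration pass; the file and column names are assumptions.

```python
# Hypothetical exploration pass: load a CSV, profile nulls, flag outliers.
import numpy as np
import pandas as pd

df = pd.read_csv("events.csv", parse_dates=["created_at"])  # assumed file

print(df.shape)
print(df.dtypes)
print(df.isna().mean().sort_values(ascending=False).head())  # null rates

# Flag latency outliers beyond 3 standard deviations of the log-latency
# (log1p tames the usual heavy right tail of latency data).
log_latency = np.log1p(df["latency_ms"])
z = (log_latency - log_latency.mean()) / log_latency.std()
outliers = df[np.abs(z) > 3]
print(f"{len(outliers)} outlier rows out of {len(df)}")
```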
Highly Desirable
- Data Analytics
- Experience in AWS cloud or any cloud technologies
- Experience with big data and streaming technologies such as PySpark and Kafka is a big plus
- Shell scripting
- Preferred tech stack: Python, REST APIs, microservices, Flask/FastAPI, pandas, NumPy, Linux, shell scripting, Airflow, PySpark
- Strong backend experience, having worked with microservices and REST APIs (Flask, FastAPI) and with both relational and non-relational databases (a minimal sketch follows this list)
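Matching the preferred stack above, here is a minimal, hypothetical FastAPI microservice sketch; the service name, routes, and in-memory store are made up for illustration.

```python
# app.py -- a minimal, hypothetical FastAPI microservice.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="orders-service")  # hypothetical service name

class Order(BaseModel):
    sku: str
    quantity: int

orders = {}  # in-memory store, just for the sketch

@app.post("/orders/{order_id}")
def create_order(order_id: int, order: Order):
    orders[order_id] = order
    return {"order_id": order_id, "sku": order.sku, "quantity": order.quantity}

@app.get("/orders/{order_id}")
def read_order(order_id: int):
    if order_id not in orders:
        raise HTTPException(status_code=404, detail="order not found")
    return orders[order_id]
```

Assuming the file is saved as app.py, it can be run locally with `uvicorn app:app --reload`.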
Senior Data Consultant (Talend DI)
at Pingahla
Pingahla is recruiting Business Intelligence Consultants/Senior Consultants who can help us with Information Management projects (domestic, onshore, and offshore) as developers and team leads. Candidates are expected to have 3-6 years of experience with Informatica PowerCenter, Talend DI, or Informatica Cloud and must be very proficient with Business Intelligence in general. The job is based out of our Pune office.
Responsibilities:
- Manage the customer relationship by serving as the single point of contact before, during and after engagements.
- Architect data management solutions.
- Provide technical leadership to other consultants and/or customer/partner resources.
- Design, develop, test and deploy data integration solutions in accordance with customer’s schedule.
- Supervise and mentor all intermediate and junior level team members.
- Provide regular reports to communicate status both internally and externally.
Qualifications:
A typical profile suited to this position would have the following background:
- A graduate from a reputed engineering college
- Excellent analytical skills, with the ability to grasp new concepts and learn new technologies
- A willingness to work with a small team in a fast-growing environment.
- A good knowledge of Business Intelligence concepts
Mandatory Requirements:
- Knowledge of Business Intelligence
- Good knowledge of at least one of the following data integration tools: Informatica PowerCenter, Talend DI, Informatica Cloud
- Knowledge of SQL
- Excellent English and communication skills
- Intelligent, quick to learn new technologies
- Track record of accomplishment and effectiveness with handling customers and managing complex data management needs
Senior Artificial Intelligence/Machine Learning Developer
at a firm that works with US clients. Permanent WFH.
This person MUST have:
- B.E Computer Science or equivalent
- 5 years of experience with the Django framework
- Experience building APIs (REST or GraphQL); a minimal sketch follows this list
- Strong troubleshooting and debugging skills
- React.js knowledge would be an added bonus
- Understanding of how to use a database such as Postgres (preferred choice), SQLite, MongoDB, or MySQL.
- Sound knowledge of object-oriented design and analysis.
- A strong passion for writing simple, clean and efficient code.
- Proficient understanding of code versioning tools such as Git.
- Strong communication skills.
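To illustrate the API-building requirement, here is a minimal, hypothetical Django REST Framework view; it assumes a standard Django project with djangorestframework installed, and the endpoint and stubbed scoring logic are made up.

```python
# views.py -- a hypothetical DRF endpoint. Wire it up in urls.py with
# path("predict/", views.predict).
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response

@api_view(["POST"])
def predict(request):
    """Validate input and return a stubbed model prediction."""
    features = request.data.get("features")
    if not isinstance(features, list) or not features:
        return Response({"error": "features must be a non-empty list"},
                        status=status.HTTP_400_BAD_REQUEST)
    # Stub: a real view would call the trained model here.
    score = sum(float(x) for x in features) / len(features)
    return Response({"score": score})
```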
Experience:
- Min 5 years' experience
- Startup experience is a must.
Location:
- Remote developer
Timings:
- 40 hours a week, with 4 hours a day overlapping with the client's timezone. Clients are typically in the California (PST) timezone.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses, and other incentives
- We don't believe in locking people in with long notice periods. You will stay here because you love the company. We have only a 15-day notice period.
Data Governance Engineer
at a European bank headquartered in Copenhagen, Denmark.
Roles & Responsibilities
- Designing and delivering a best-in-class, highly scalable data governance platform
- Improving processes and applying best practices
- Contribute to all scrum ceremonies, assuming the role of 'scrum master' on a rotational basis
- Development, management and operation of our infrastructure to ensure it is easy to deploy, scalable, secure and fault-tolerant
- Flexible on working hours as per business needs
About MX Player (Play Store: https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad&hl=en_IN)
MX Player is the world's #1 entertainment superapp, offering 100,000+ hours of premium OTT (over-the-top) content spanning acclaimed MX Originals, web shows, TV (live and on-demand), movies, music videos, hyper-casual games, music streaming, short-form video, and more. With more than 1 billion installs worldwide, MX Player is present on 1 out of every 2 smartphones, making it the largest entertainment app/platform in the world.
Position : Product Analyst / Business Analyst - Ad Tech
Key Responsibilities:
- Driving the collection of new data that would help build the next generation of algorithms (e.g., audience segmentation, contextual targeting)
- Understanding user behavior and performing root-cause analysis of changes in data trends to identify corrections or propose desirable enhancements in product & across different verticals
- Excellent problem solving skills and the ability to make sound judgments based on trade-offs for different solutions to complex problem constraints
- Defining and monitoring KPIs for product/content/business performance and identifying ways to improve them
- Should be a strong advocate of a data-driven approach, driving analytics decisions through user testing, data analysis, and A/B testing (a minimal sketch follows this list)
- Help in defining the analytics roadmap for the product
- Prior knowledge and experience in ad tech industry or other advertising platforms will be preferred
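As a sketch of the A/B-testing analysis mentioned above, here is a hedged two-proportion z-test using statsmodels; the conversion counts and exposure sizes are made up.

```python
# Hypothetical A/B test readout: two-proportion z-test on conversions.
from statsmodels.stats.proportion import proportions_ztest

conversions = [420, 480]    # control, variant (made-up counts)
exposures = [10000, 10000]  # users shown each arm

z_stat, p_value = proportions_ztest(conversions, exposures)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Variant differs significantly from control at the 5% level.")
else:
    print("No significant difference detected.")
```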
Tools/Skillset:
- Knowledge of Google DFP (preferred)
- SQL
- R/Python (preferred)
- Any BI tool such as Tableau or Sisense (preferred)
- Go-getter attitude
- Ability to thrive in a fast-paced, dynamic environment
- Self-starter