Software Developer - Data Engineering / Java / Golang
at Metadata Technologies, North America
We are looking for an exceptional Software Developer for our Data Engineering India team who can
contribute to building a world-class big data engineering stack that will fuel our
Analytics and Machine Learning products. This person will contribute to the architecture,
operation, and enhancement of:
- Our petabyte-scale data platform, with a key focus on finding solutions that can support
the Analytics and Machine Learning product roadmap. Every day, terabytes of ingested data
need to be processed and made available for querying and insight extraction across
various use cases.
About the Organisation:
- It provides a dynamic, fun workplace filled with passionate individuals. We are at the cutting edge of advertising technology and there is never a dull moment at work.
- We have a truly global footprint, with our headquarters in Singapore and offices in Australia, United States, Germany, United Kingdom, and India.
- You will gain work experience in a global environment. Our team speaks over 20 languages, represents more than 16 nationalities, and over 42% of our staff are multilingual.
Job Description
Position:
Software Developer, Data Engineering team
Location: Pune (initially 100% remote for the coming year due to COVID-19)
- Our bespoke Machine Learning pipelines. This also provides opportunities to
contribute to the prototyping, building, and deployment of Machine Learning models.
You:
- Have at least 4 years' experience.
- Deep technical understanding of Java or Golang.
- Production experience with Python is a big plus and an extremely valuable supporting skill for us.
- Exposure to modern big data tech: Cassandra/Scylla, Kafka, Ceph, the Hadoop stack,
Spark, Flume, Hive, Druid, etc., while understanding that certain
problems may require completely novel solutions.
- Exposure to one or more modern ML tech stacks (Spark MLlib, TensorFlow, Keras,
the GCP ML stack, AWS SageMaker) is a plus.
- Experience working in an Agile/Lean model.
- Experience supporting and troubleshooting large systems.
- Exposure to configuration management tools such as Ansible or Salt.
- Exposure to IaaS platforms such as AWS, GCP, Azure, etc.
- Good addition: experience working with large-scale data.
- Good addition: experience architecting, developing, and operating data
warehouses, big data analytics platforms, and high-velocity data pipelines.
Note: We are not looking for a Big Data Developer / Hadoop Developer.
About Us
Sahaj Software is an artisanal software engineering firm built on the values of trust, respect, curiosity, and craftsmanship, delivering purpose-built solutions that drive data-led transformation for organisations. Our emphasis is on craft: we create purpose-built solutions, leveraging Data Engineering, Platform Engineering, and Data Science with a razor-sharp focus to solve complex business and technology challenges and give customers a competitive edge.
About The Role
As a Data Engineer, you’ll feel at home if you are hands-on, grounded, opinionated and passionate about delivering comprehensive data solutions that align with modern data architecture approaches. Your work will range from building a full data platform to building data pipelines or helping with data architecture and strategy. This role is ideal for those looking to have a large impact and huge scope for growth, while still being hands-on with technology. We aim to allow growth without becoming “post-technical”.
Responsibilities
- Collaborate with Data Scientists and Engineers to deliver production-quality AI and Machine Learning systems
- Build frameworks and supporting tooling for data ingestion from a complex variety of sources
- Consult with our clients on data strategy, modernising their data infrastructure, architecture and technology
- Model their data for increased visibility and performance
- You will be given ownership of your work and are encouraged to propose alternatives and make a case for doing things differently; our clients trust us and we manage ourselves.
- You will work in short sprints to deliver working software.
- You will work with other data engineers at Sahaj to build Data Engineering capability across the organisation.
You can read more about what we do and how we think here: https://sahaj.ai/client-stories/
Skills you’ll need
- Demonstrated experience as a Senior Data Engineer in complex enterprise environments
- Deep understanding of technology fundamentals and experience with languages like Python, or functional programming languages like Scala
- Demonstrated experience in the design and development of big data applications using tech stacks like Databricks, Apache Spark, HDFS, HBase and Snowflake
- Strong skills in building data products by integrating large data sets from hundreds of internal and external sources.
- A nuanced understanding of code quality, maintainability and practices like Test Driven Development
- Ability to deliver an application end to end; having an opinion on how your code should be built, packaged and deployed using CI/CD
- Understanding of Cloud platforms, DevOps, GitOps, and Containers
What will you experience as a culture at Sahaj?
At Sahaj, the collective stands for a shared purpose where everyone owns the dreams, ideas, ideologies, successes, and failures of the organisation - a synergy rooted in the ethos of honesty, respect, trust, and equitability. At Sahaj, you will experience:
- Creativity
- Ownership
- Curiosity
- Craftsmanship
- A culture of trust, respect and transparency
- Opportunity to collaborate with some of the finest minds in the industry
- Work across multiple domains
What are the benefits of being at Sahaj?
- Unlimited leaves
- Life Insurance & Private Health insurance paid by Sahaj
- Stock options
- No hierarchy
- Open Salaries
Primary Skills
DynamoDB, Java, Kafka, Spark, Amazon Redshift, AWS Lake Formation, AWS Glue, Python
Skills:
Good work experience showing growth as a Data Engineer.
Hands-on programming experience.
Implementation experience with Kafka, Kinesis, Spark, AWS Glue, AWS Lake Formation.
Excellent knowledge of: Python, Scala/Java, Spark, AWS (Lambda, Step Functions, DynamoDB, EMR), Terraform, UI (Angular), Git, Maven.
Experience with performance optimization in batch and real-time processing applications.
Expertise in Data Governance and Data Security implementation.
Good hands-on design and programming skills for building reusable tools and products. Experience developing in AWS or similar cloud platforms. Preferred: ECS, EKS, S3, EMR, DynamoDB, Aurora, Redshift, QuickSight, or similar.
Familiarity with systems with very high transaction volumes, microservice design, or data processing pipelines (Spark).
Knowledge and hands-on experience with serverless technologies such as Lambda, MSK, MWAA, Kinesis Analytics is a plus.
Expertise in practices like Agile, peer reviews, and Continuous Integration.
Roles and responsibilities:
Determining project requirements and developing work schedules for the team.
Delegating tasks and achieving daily, weekly, and monthly goals.
Responsible for designing, building, testing, and deploying the software releases.
Salary: 25-40 LPA
Designation – Deputy Manager - TS
Job Description
- Total of 8-9 years of development experience in Data Engineering. B1/BII role.
- Minimum of 4-5 years in AWS data integration, with very good data modelling skills.
- Should be very proficient in end-to-end AWS data solution design, covering not only strong data ingestion and integration skills (both data at rest and data in motion) but also complete DevOps knowledge.
- Should have experience delivering at least 4 Data Warehouse or Data Lake solutions on AWS.
- Should have very strong experience with Glue, Lambda, Data Pipeline, Step Functions, RDS, CloudFormation, etc.
- Strong Python skills.
- Should be an expert in cloud design principles, performance tuning, and cost modelling. AWS certifications are an added advantage.
- Should be a team player with excellent communication, able to manage their work independently with minimal or no supervision.
- Life Science & Healthcare domain background is a plus.
Qualifications
BE/BTech/ME/MTech
● Able to contribute to the gathering of functional requirements, developing technical
specifications, and test case planning
● Demonstrating technical expertise, and solving challenging programming and design
problems
● 60% hands-on coding with architecture ownership of one or more products
● Ability to articulate architectural and design options, and educate development teams and
business users
● Resolve defects/bugs during QA testing, pre-production, production, and post-release
patches
● Mentor and guide team members
● Work cross-functionally with various Bidgely teams including product management, QA/QE,
various product lines, and/or business units to drive forward results
Requirements
● BS/MS in computer science or equivalent work experience
● 8-12 years’ experience designing and developing applications in Data Engineering
● Hands-on experience with big data ecosystems
● Past experience with Hadoop, HDFS, MapReduce, YARN, AWS Cloud, EMR, S3, Spark, Cassandra,
Kafka, ZooKeeper
● Expertise with any of the following object-oriented languages: Java/J2EE, Scala,
Python
● Ability to lead and mentor technical team members
● Expertise with the entire Software Development Life Cycle (SDLC)
● Excellent communication skills: Demonstrated ability to explain complex technical issues to
both technical and non-technical audiences
● Expertise in the Software design/architecture process
● Expertise with unit testing & Test-Driven Development (TDD)
● Business Acumen - strategic thinking & strategy development
● Experience on Cloud or AWS is preferable
● Have a good understanding and ability to develop software, prototypes, or proofs of
concepts (POC's) for various Data Engineering requirements.
● Experience with Agile Development, SCRUM, or Extreme Programming methodologies
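As a small illustration of the TDD practice listed above, the cycle is: write a failing test first, then just enough code to make it pass. The function and test names below are hypothetical, purely for illustration.

```python
import unittest

# Implementation written to satisfy the tests below (the "green" step).
def dedupe_events(events):
    """Remove duplicate event IDs while preserving first-seen order."""
    seen = set()
    result = []
    for event_id in events:
        if event_id not in seen:
            seen.add(event_id)
            result.append(event_id)
    return result

# In TDD these tests come first: they fail ("red") and drive the
# implementation above until they pass ("green"), then you refactor.
class TestDedupeEvents(unittest.TestCase):
    def test_preserves_first_seen_order(self):
        self.assertEqual(dedupe_events([3, 1, 3, 2, 1]), [3, 1, 2])

    def test_empty_input(self):
        self.assertEqual(dedupe_events([]), [])

if __name__ == "__main__":
    unittest.main(exit=False)
```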
- Fix issues with plugins for our Python-based ETL pipelines
- Help with automation of standard workflows
- Deliver Python microservices for provisioning and managing cloud infrastructure
- Responsible for any refactoring of code
- Effectively manage challenges associated with handling large volumes of data while working to tight deadlines
- Manage expectations with internal stakeholders and context-switch in a fast-paced environment
- Thrive in an environment that uses AWS and Elasticsearch extensively
- Keep abreast of technology and contribute to the engineering strategy
- Champion best development practices and provide mentorship to others
- First and foremost, you are a Python developer, experienced with the Python data stack
- You love and care about data
- Your code is an artistic manifesto reflecting how elegant you are in what you do
- You feel sparks of joy when a new abstraction or pattern arises from your code
- You embrace the principles of DRY (Don't Repeat Yourself) and KISS (Keep It Short and Simple)
- You are a continuous learner
- You have a natural willingness to automate tasks
- You think critically and have an eye for detail
- Excellent ability and experience working to tight deadlines
- Sharp analytical and problem-solving skills
- Strong sense of ownership and accountability for your work and delivery
- Excellent written and oral communication skills
- Mature collaboration and mentoring abilities
- We are keen to know your digital footprint (community talks, blog posts, certifications, courses you have participated in or are keen to, your personal projects, and any contributions to open-source communities)
- Experience delivering complex software, ideally in a FinTech setting
- Experience with CI/CD tools such as Jenkins, CircleCI
- Experience with code versioning (Git / Mercurial / Subversion)
- At least 4 to 7 years of relevant experience as a Big Data Engineer
- Hands-on experience in Scala or Python
- Hands-on experience with major components of the Hadoop ecosystem such as HDFS, MapReduce, Hive, Impala
- Strong programming experience building applications/platforms using Scala or Python
- Experienced in implementing Spark RDD transformations and actions to implement business analysis logic
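The transformation/action split in the last point is worth making concrete: Spark transformations (map, filter) are lazy descriptions of work, and nothing executes until an action (collect, count) forces evaluation. A plain-Python sketch of that idea using generators follows; this is an analogy to illustrate the laziness, not Spark itself.

```python
# Lazy "transformations": generators describe the pipeline but run nothing yet,
# mimicking how Spark RDD transformations only build a lineage graph.
data = range(1, 6)                       # source: 1..5
mapped = (x * x for x in data)           # map-like transformation (lazy)
filtered = (x for x in mapped if x > 5)  # filter-like transformation (lazy)

# The "action": only here is the whole pipeline actually evaluated,
# like calling collect() on an RDD.
result = list(filtered)
print(result)  # [9, 16, 25]
```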
We specialize in productizing solutions built on new technology.
Our vision is to build engineers with entrepreneurial and leadership mindsets who can create highly impactful products and solutions, using technology to deliver immense value to our clients.
We strive to bring innovation and passion to everything we do, whether services, products, or solutions.
Function : Sr. DB Developer
Location : India / Gurgaon / Tamil Nadu
>> THE INDIVIDUAL
- Have a strong background in data platform creation and management.
- Possess in-depth knowledge of Data Management, Data Modelling, Ingestion - Able to develop data models and ingestion frameworks based on client requirements and advise on system optimization.
- Hands-on experience with SQL databases (PostgreSQL) and NoSQL databases (MongoDB)
- Hands-on experience in database performance tuning
- Good to have: knowledge of database setup on cluster nodes
- Should be well versed in data security aspects and data governance frameworks
- Hands-on experience with Spark, Airflow, ELK
- Good to have: knowledge of a data cleansing tool such as Apache Griffin
- Preferably involved during project implementation, bringing a background in both business knowledge and technical requirements
- Strong analytical and problem-solving skills. Exposure to data analytics and knowledge of advanced data analytical tools will be an advantage.
- Strong written and verbal communication skills (presentation skills).
- Certifications in the above technologies are preferred.
>> Qualification
- B.Tech / B.E. / MCA / M.Tech from a reputed institute.
- More than 4 years of experience in Data Management, Data Modelling, and Ingestion. Total experience of 8-10 years.
Big Data Developer
Experience: 3 to 7 years
Job Location: Hyderabad
Notice: Immediate / within 30 days
1. Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight
2. Experience developing Lambda functions with AWS Lambda
3. Expertise with Spark/PySpark. The candidate should be hands-on with PySpark code and able to do transformations with Spark.
4. Should be able to code in Python and Scala.
5. Snowflake experience will be a plus.
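To sketch point 2 above: an AWS Lambda function in Python is a module-level handler with the standard (event, context) signature, returning a JSON-serializable value. The event shape and handler body below are hypothetical, for illustration only.

```python
import json

def lambda_handler(event, context):
    """Minimal AWS Lambda handler: echo a name from the incoming event.

    `event` is the JSON payload Lambda passes in; `context` carries
    runtime metadata and is unused here.
    """
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Local invocation for testing; this handler does not use the context object.
print(lambda_handler({"name": "data-team"}, None))
```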
Hadoop and Hive can be treated as good-to-have; a basic understanding is enough rather than a hard requirement.
What you will do:
- Bringing data to life via historical and real-time dashboards such as:
2. Transaction behavior analytics
3. User-level analytics
4. Propensity models and personalization models, e.g., what's the best product or offer to drive a sale
5. Emails / SMS / AN - analytics models getting data from platforms like Netcore + Branch + internal tables
6. Other reports that are relevant to know what is working, gaps, etc.
- Monitoring key metrics such as commission, gross margin, conversion, customer acquisition, etc.
- Using the data models and reports to draw actionable and meaningful insights, then helping drive strategy, optimization opportunities, product improvements, and more based on those insights
- Demonstrating examples where data interpretation led to improvement in core business outcomes such as better conversion, better ROI from ad spend, and product improvements
- Digging into data to identify opportunities or problems and translating them into easy-to-understand terms for all key business teams
- Working closely with various business stakeholders: Marketing, Customer Insights, Growth and Product teams
- Ensuring effective and timely delivery of reports and insights that analyze business functions and key operations and performance metrics
What you need to have:
- Minimum 2 years of data analytics and interpretation experience
- Proven examples, from current and past experience, of how data analytics shaped strategy, marketing, and product
- Strong data engineering skills using tools like SQL, Python, Tableau, Power BI, Advanced Excel, Power Query, etc.
- Strong marketing acumen to be able to translate data into marketing outcomes
- Good understanding of Google Analytics, Google AdWords, Facebook Ads, CleverTap, and other analytics tools
- Familiarity with data sources like Branch / CleverTap / Firebase, BigQuery, and internal tables (User / Click / Transaction / Events, etc.)