We are looking for a technically driven "Full-Stack Engineer" for one of our premium clients.
COMPANY DESCRIPTION:
Qualifications
• Bachelor's degree in computer science or related field; Master's degree is a plus
• 3+ years of relevant work experience
• Meaningful experience with at least two of the following technologies: Python, Scala, Java
• Strong, proven experience with distributed processing frameworks (Spark, Hadoop, EMR) and SQL is
expected
• Commercial client-facing project experience is helpful, including working in close-knit teams
• Ability to work across structured, semi-structured, and unstructured data, extracting information and
identifying linkages across disparate data sets
• Confirmed ability in clearly communicating complex solutions
• Understanding of information security principles to ensure compliant handling and management of
client data
• Experience with and interest in cloud platforms such as AWS, Azure, Google Cloud Platform, or Databricks
• Extraordinary attention to detail
Job Responsibilities
- Design, build & test ETL processes using Python & SQL for the corporate data warehouse
- Inform, influence, support, and execute our product decisions
- Maintain advertising data integrity by working closely with R&D to organize and store data in a format that provides accurate data and allows the business to quickly identify issues.
- Evaluate and prototype new technologies in the area of data processing
- Think quickly, communicate clearly and work collaboratively with product, data, engineering, QA and operations teams
- High energy level, strong team player and good work ethic
- Data analysis, understanding of business requirements and translation into logical pipelines & processes
- Identification, analysis & resolution of production & development bugs
- Support the release process including completing & reviewing documentation
- Configure data mappings & transformations to orchestrate data integration & validation
- Provide subject matter expertise
- Document solutions, tools & processes
- Create & support test plans with hands-on testing
- Peer reviews of work developed by other data engineers within the team
- Establish good working relationships & communication channels with relevant departments
Skills and Qualifications we look for
- University degree 2.1 or higher (or equivalent) in a relevant subject. Master’s degree in any data subject will be a strong advantage.
- 4-6 years of experience in data engineering.
- Strong coding ability and software development experience in Python.
- Strong hands-on experience with SQL and Data Processing.
- Experience with Google Cloud Platform (Cloud Composer, Dataflow, Cloud Functions, BigQuery, Cloud Storage, Dataproc).
- Good working experience with at least one ETL tool (Airflow preferred).
- Strong analytical and problem-solving skills.
- Good-to-have skills: PySpark, CircleCI, Terraform.
- Motivated, self-directed, able to work with ambiguity and interested in emerging technologies, agile and collaborative processes.
- Understanding & experience of agile / scrum delivery methodology
Job Description
Position: Software Developer, Data Engineering team
Location: Pune (initially 100% remote for the coming year due to Covid-19)
We are looking for an exceptional Software Developer for our Data Engineering India team who can contribute to building a world-class big data engineering stack that will be used to fuel our Analytics and Machine Learning products. This person will be contributing to the architecture, operation, and enhancement of:
- Our petabyte-scale data platform, with a key focus on finding solutions that can support the Analytics and Machine Learning product roadmap. Every day, terabytes of ingested data need to be processed and made available for querying and insights extraction for various use cases.
- Our bespoke Machine Learning pipelines. This will also provide opportunities to contribute to the prototyping, building, and deployment of Machine Learning models.
About the Organisation:
- It provides a dynamic, fun workplace filled with passionate individuals. We are at the cutting edge of advertising technology and there is never a dull moment at work.
- We have a truly global footprint, with our headquarters in Singapore and offices in Australia, United States, Germany, United Kingdom, and India.
- You will gain work experience in a global environment. Our staff come from more than 16 nationalities and speak over 20 languages, and over 42% of them are multilingual.
You:
- Have at least 4 years of experience.
- Deep technical understanding of Java or Golang.
- Production experience with Python is a big plus and an extremely valuable supporting skill for
us.
- Exposure to modern Big Data tech: Cassandra/Scylla, Kafka, Ceph, the Hadoop Stack,
Spark, Flume, Hive, Druid etc… while at the same time understanding that certain
problems may require completely novel solutions.
- Exposure to one or more modern ML tech stacks (Spark MLlib, TensorFlow, Keras,
GCP ML Stack, AWS SageMaker) is a plus.
- Experience working in an Agile/Lean model
- Experience with supporting and troubleshooting large systems
- Exposure to configuration management tools such as Ansible or Salt
- Exposure to IaaS platforms such as AWS, GCP, Azure…
- Good addition - Experience working with large-scale data
- Good addition - Experience architecting, developing, and operating data
warehouses, big data analytics platforms, and high-velocity data pipelines
**** Not looking for a Big Data Developer / Hadoop Developer
Job Title: AWS-Azure Data Engineer with Snowflake
Location: Bangalore, India
Experience: 4+ years
Budget: 15 to 20 LPA
Notice Period: Immediate joiners or less than 15 days
Job Description:
We are seeking an experienced AWS-Azure Data Engineer with expertise in Snowflake to join our team in Bangalore. As a Data Engineer, you will be responsible for designing, implementing, and maintaining data infrastructure and systems using AWS, Azure, and Snowflake. Your primary focus will be on developing scalable and efficient data pipelines, optimizing data storage and processing, and ensuring the availability and reliability of data for analysis and reporting.
Responsibilities:
- Design, develop, and maintain data pipelines on AWS and Azure to ingest, process, and transform data from various sources.
- Optimize data storage and processing using cloud-native services and technologies such as AWS S3, AWS Glue, Azure Data Lake Storage, Azure Data Factory, etc.
- Implement and manage data warehouse solutions using Snowflake, including schema design, query optimization, and performance tuning.
- Collaborate with cross-functional teams to understand data requirements and translate them into scalable and efficient technical solutions.
- Ensure data quality and integrity by implementing data validation, cleansing, and transformation processes.
- Develop and maintain ETL processes for data integration and migration between different data sources and platforms.
- Implement and enforce data governance and security practices, including access control, encryption, and compliance with regulations.
- Collaborate with data scientists and analysts to support their data needs and enable advanced analytics and machine learning initiatives.
- Monitor and troubleshoot data pipelines and systems to identify and resolve performance issues or data inconsistencies.
- Stay updated with the latest advancements in cloud technologies, data engineering best practices, and emerging trends in the industry.
Requirements:
- Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.
- Minimum of 4 years of experience as a Data Engineer, with a focus on AWS, Azure, and Snowflake.
- Strong proficiency in data modelling, ETL development, and data integration.
- Expertise in cloud platforms such as AWS and Azure, including hands-on experience with data storage and processing services.
- In-depth knowledge of Snowflake, including schema design, SQL optimization, and performance tuning.
- Experience with scripting languages such as Python or Java for data manipulation and automation tasks.
- Familiarity with data governance principles and security best practices.
- Strong problem-solving skills and ability to work independently in a fast-paced environment.
- Excellent communication and interpersonal skills to collaborate effectively with cross-functional teams and stakeholders.
- Immediate joiner or notice period less than 15 days preferred.
If you possess the required skills and are passionate about leveraging AWS, Azure, and Snowflake to build scalable data solutions, we invite you to apply. Please submit your resume and a cover letter highlighting your relevant experience and achievements in the AWS, Azure, and Snowflake domains.
Job Overview
We are looking for a Data Engineer to join our data team to solve critical, data-driven business problems. The hire will be responsible for expanding and optimizing the existing end-to-end architecture, including the data pipeline architecture. The Data Engineer will collaborate with software developers, database architects, data analysts, data scientists, and the platform team on data initiatives and will ensure that the optimal data delivery architecture is consistent throughout ongoing projects. The right candidate should have hands-on experience in developing a hybrid set of data pipelines depending on the business requirements.
Responsibilities
- Develop, construct, test and maintain existing and new data-driven architectures.
- Align architecture with business requirements and provide the solutions best suited to solving the business problems.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and Azure ‘big data’ technologies.
- Data acquisition from multiple sources across the organization.
- Use programming languages and tools efficiently to collate the data.
- Identify ways to improve data reliability, efficiency and quality
- Use data to discover tasks that can be automated.
- Deliver updates to stakeholders based on analytics.
- Set up practices on data reporting and continuous monitoring
Required Technical Skills
- Graduate in Computer Science or in similar quantitative area
- 1+ years of relevant work experience as a Data Engineer or in a similar role.
- Advanced SQL knowledge, data modelling, and experience working with relational databases and query authoring (SQL), as well as working familiarity with a variety of databases.
- Experience in developing and optimizing ETL pipelines, big data pipelines, and data-driven architectures.
- Must have strong big-data core knowledge & experience in programming using Spark - Python/Scala
- Experience with an orchestration tool such as Airflow or similar
- Experience with Azure Data Factory is good to have
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Good understanding of Git workflow, test-driven development, and CI/CD is good to have
- Good to have some understanding of Delta tables
It would be an advantage if the candidate also has experience with the following software/tools:
- Experience with big data tools: Hadoop, Spark, Hive, etc.
- Experience with relational SQL and NoSQL databases
- Experience with cloud data services
- Experience with object-oriented/object function scripting languages: Python, Scala, etc.
Technical/Core skills
- Minimum 3 years of experience with Informatica Big Data Developer (BDM) in a Hadoop environment.
- Knowledge of Informatica PowerExchange (PWX).
- Minimum 3 years of experience with big data querying tools such as Hive and Impala.
- Ability to design and develop complex mappings using Informatica Big Data Developer.
- Create and manage Informatica PowerExchange and CDC real-time implementations.
- Strong Unix skills for writing shell scripts and troubleshooting existing scripts.
- Good knowledge of big data platforms and their frameworks.
- Good to have experience with Cloudera Data Platform (CDP).
- Experience with building stream processing systems using Kafka and Spark.
- Excellent SQL knowledge
Soft skills:
- Ability to work independently
- Strong analytical and problem-solving skills
- Willingness to learn new technologies
- Regular interaction with vendors, partners and stakeholders
Preferred Education & Experience:
- Bachelor’s or master’s degree in Computer Engineering, Computer Science, Computer Applications, Mathematics, Statistics or related technical field or equivalent practical experience. Relevant experience of at least 3 years in lieu of above if from a different stream of education.
- Well-versed in, and 5+ years of hands-on, demonstrable experience with:
▪ Data Analysis & Data Modeling
▪ Database Design & Implementation
▪ Database Performance Tuning & Optimization
▪ PL/pgSQL & SQL
- 5+ years of hands-on development experience in relational databases (PostgreSQL/SQL Server/Oracle).
- 5+ years of hands-on development experience in SQL and PL/pgSQL, including stored procedures, functions, triggers, and views.
- Hands-on, demonstrable working experience in database design principles, SQL query optimization techniques, index management, integrity checks, statistics, and isolation levels.
- Hands-on, demonstrable working experience in database read and write performance tuning and optimization.
- Knowledge of and experience with Domain-Driven Design (DDD), object-oriented programming (OOP), cloud architecture, and NoSQL database concepts are an added advantage.
- Knowledge and working experience in the Oil & Gas, Financial, and Automotive domains is a plus.
- Hands-on development experience in one or more NoSQL data stores such as Cassandra, HBase, MongoDB, DynamoDB, Elasticsearch, Neo4j, etc. is a plus.
We are looking for an outstanding Big Data Engineer with experience setting up and maintaining data warehouses and data lakes for an organization. This role will collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Roles and Responsibilities:
- Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
- Develop programs in Scala and Python as part of data cleaning and processing.
- Assemble large, complex data sets that meet functional / non-functional business requirements and foster data-driven decision making across the organization.
- Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
- Implement processes and systems to validate data and monitor data quality, ensuring that production data is always accurate and available for the key stakeholders and business processes that depend on it.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Maintain operational excellence, ensuring high availability and platform stability.
- Closely collaborate with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Skills:
- Experience with Big Data pipeline, Big Data analytics, Data warehousing.
- Experience with SQL/No-SQL, schema design and dimensional data modeling.
- Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with Big Data technology stack components such as HBase, Hadoop, Hive, MapReduce.
- Experience in designing systems that process structured as well as unstructured data at large scale.
- Experience in AWS/Spark/Java/Scala/Python development.
- Strong skills in PySpark (Python and Spark). Ability to create, manage, and manipulate Spark DataFrames. Expertise in Spark query tuning and performance optimization.
- Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
- Prior exposure to streaming data sources such as Kafka.
- Knowledge of shell scripting and Python scripting.
- High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
- Experience with NoSQL databases such as Cassandra / MongoDB.
- Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
- Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
- Experience building and deploying applications on on-premise and cloud-based infrastructure.
- Good understanding of the machine learning landscape and concepts.
Qualifications and Experience:
Engineering graduates and postgraduates, preferably in Computer Science, from premier institutions, with 3-5 years of proven work experience as a Big Data Engineer or in a similar role.
Certifications:
Good to have at least one of the certifications listed here:
AZ-900 - Azure Fundamentals
DP-200, DP-201, DP-203, AZ-204 - Data Engineering
AZ-400 - DevOps Certification
If you:
1. Are an expert in deep learning and machine learning techniques,
2. Are extremely good at image/video processing, and
3. Have a good understanding of linear algebra, optimization techniques, statistics, and pattern recognition,
then you are the right fit for this position.