Azure Data Engineer
- Create and maintain optimal data pipeline architecture,
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Author data services using a variety of programming languages
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and Azure ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centres and Azure regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Work in an Agile environment with Scrum teams.
- Ensure data quality and help in achieving data governance.
Basic Qualifications
- 2+ years of experience in a Data Engineer role
- Undergraduate degree required (Graduate degree preferred) in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
- Experience using the following software/tools:
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases
- Experience with data pipeline and workflow management tools
- Experience with Azure cloud services: ADLS, ADF, ADLA, AAS
- Experience with stream-processing systems: Storm, Spark-Streaming, etc.
- Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases
- Understanding of ELT and ETL patterns and when to use each. Understanding of data models and transforming data into the models
- Experience building and optimizing ‘big data’ data pipelines, architectures, and data sets
- Strong analytic skills related to working with unstructured datasets
- Build processes supporting data transformation, data structures, metadata, dependency, and workload management
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores
- Experience supporting and working with cross-functional teams in a dynamic environment
About A Chemical & Purifier Company headquartered in the US.
Similar jobs
Mandatory Skills: Azure Data Lake Storage, Azure SQL databases, Azure Synapse, Data Bricks (Pyspark/Spark), Python, SQL, Azure Data Factory.
Good to have: Power BI, Azure IAAS services, Azure Devops, Microsoft Fabric
Ø Very strong understanding on ETL and ELT
Ø Very strong understanding on Lakehouse architecture.
Ø Very strong knowledge in Pyspark and Spark architecture.
Ø Good knowledge in Azure data lake architecture and access controls
Ø Good knowledge in Microsoft Fabric architecture
Ø Good knowledge in Azure SQL databases
Ø Good knowledge in T-SQL
Ø Good knowledge in CI /CD process using Azure devops
Ø Power BI
Analytics Job Description
We are hiring an Analytics Engineer to help drive our Business Intelligence efforts. You will
partner closely with leaders across the organization, working together to understand the how
and why of people, team and company challenges, workflows and culture. The team is
responsible for delivering data and insights that drive decision-making, execution, and
investments for our product initiatives.
You will work cross-functionally with product, marketing, sales, engineering, finance, and our
customer-facing teams enabling them with data and narratives about the customer journey.
You’ll also work closely with other data teams, such as data engineering and product analytics,
to ensure we are creating a strong data culture at Blend that enables our cross-functional partners
to be more data-informed.
Role : DataEngineer
Please find below the JD for the DataEngineer Role..
Location: Guindy,Chennai
How you’ll contribute:
• Develop objectives and metrics, ensure priorities are data-driven, and balance short-
term and long-term goals
• Develop deep analytical insights to inform and influence product roadmaps and
business decisions and help improve the consumer experience
• Work closely with GTM and supporting operations teams to author and develop core
data sets that empower analyses
• Deeply understand the business and proactively spot risks and opportunities
• Develop dashboards and define metrics that drive key business decisions
• Build and maintain scalable ETL pipelines via solutions such as Fivetran, Hightouch,
and Workato
• Design our Analytics and Business Intelligence architecture, assessing and
implementing new technologies that fitting
• Work with our engineering teams to continually make our data pipelines and tooling
more resilient
Who you are:
• Bachelor’s degree or equivalent required from an accredited institution with a
quantitative focus such as Economics, Operations Research, Statistics, Computer Science OR 1-3 Years of Experience as a Data Analyst, Data Engineer, Data Scientist
• Must have strong SQL and data modeling skills, with experience applying skills to
thoughtfully create data models in a warehouse environment.
• A proven track record of using analysis to drive key decisions and influence change
• Strong storyteller and ability to communicate effectively with managers and
executives
• Demonstrated ability to define metrics for product areas, understand the right
questions to ask and push back on stakeholders in the face of ambiguous, complex
problems, and work with diverse teams with different goals
• A passion for documentation.
• A solution-oriented growth mindset. You’ll need to be a self-starter and thrive in a
dynamic environment.
• A bias towards communication and collaboration with business and technical
stakeholders.
• Quantitative rigor and systems thinking.
• Prior startup experience is preferred, but not required.
• Interest or experience in machine learning techniques (such as clustering, decision
tree, and segmentation)
• Familiarity with a scientific computing language, such as Python, for data wrangling
and statistical analysis
• Experience with a SQL focused data transformation framework such as dbt
• Experience with a Business Intelligence Tool such as Mode/Tableau
Mandatory Skillset:
-Very Strong in SQL
-Spark OR pyspark OR Python
-Shell Scripting
JOB DESCRIPTION
Product Analyst
About Us:-
"Slack for Construction"
Early stage startup cofounded by IIT - Roorkee alumnis. A Mobile-based operating system to manage construction & architectural projects. Material, all the info is shared over whatsapp, mobile app to manage all this in one single place - almost like a slack tool for construction.Mobile app + SAAS platform - administration and management of the process, 150000 users, subscription based pricing.It helps construction project owners and contractors track on-site progress in real-time to finish projects on time and in budget. We aim to bring the speed of software development to infrastructure development.Founded by IIT Roorkee alumni and backed by industry experts, we are on a mission to help the second largest industry in India-Construction make a transition from pen and paper to digital.
About the team
As a productivity app startup, we value productivity and ownership most. That helps raise our own bar and the bar of people we hire.We follow agile and scrum approaches for product development and use best of class tools and practices. Measuring our progress on a weekly basis and iterating fast enables us to build breakthrough modules and features rapidly.If you join us, You will be constantly thrown into challenging situations. Decisions that you take, will directly impact our clients and sales. That's how we learn.
Techstack -
- Prior experience in any data driven decision making field.
- Working knowledge of querying data using SQL.
- Familiarity with customer and business data analytic tools like Segment, Mix-panel, Google Analytics, SmartLook etc.
- Data visualisation tools like Tableau, Power BI, etc.
Responsibility -
"All things data"
- Ability to synthesize complex data into actionable goals.
- Critical thinking skills to recommend original and productive ideas
- Ability to visualise user stories and create user funnels
- Perform user test sessions and market surveys to inform product development teams
- Excellent writing skills to prepare detailed product specification and analytic reports
- Help define Product strategy / Roadmaps with scalable architecture
- Interpersonal skills to work collaboratively with various stakeholders who may have competing interests
The Merck Data Engineering Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on Merck’s data management and global analytics platform (Palantir Foundry, Hadoop, AWS and other components).
The Foundry platform comprises multiple different technology stacks, which are hosted on Amazon Web Services (AWS) infrastructure or on-premise Merck’s own data centers. Developing pipelines and applications on Foundry requires:
• Proficiency in SQL / Java / Python (Python required; all 3 not necessary)
• Proficiency in PySpark for distributed computation
• Familiarity with Postgres and ElasticSearch
• Familiarity with HTML, CSS, and JavaScript and basic design/visual competency
• Familiarity with common databases (e.g. JDBC, mySQL, Microsoft SQL). Not all types required
This position will be project based and may work across multiple smaller projects or a single large project utilizing an agile project methodology.
Roles & Responsibilities:
• Develop data pipelines by ingesting various data sources – structured and un-structured – into Palantir Foundry
• Participate in end to end project lifecycle, from requirements analysis to go-live and operations of an application
• Acts as business analyst for developing requirements for Foundry pipelines
• Review code developed by other data engineers and check against platform-specific standards, cross-cutting concerns, coding and configuration standards and functional specification of the pipeline
• Document technical work in a professional and transparent way. Create high quality technical documentation
• Work out the best possible balance between technical feasibility and business requirements (the latter can be quite strict)
• Deploy applications on Foundry platform infrastructure with clearly defined checks
• Implementation of changes and bug fixes via Merck's change management framework and according to system engineering practices (additional training will be provided)
• DevOps project setup following Agile principles (e.g. Scrum)
• Besides working on projects, act as third level support for critical applications; analyze and resolve complex incidents/problems. Debug problems across a full stack of Foundry and code based on Python, Pyspark, and Java
• Work closely with business users, data scientists/analysts to design physical data models
BRIEF DESCRIPTION:
At-least 1 year of Python, Spark, SQL, data engineering experience
Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, RedShift/Snowflake
Relevant Experience: Legacy ETL job Migration to AWS Glue / Python & Spark combination
ROLE SCOPE:
Reverse engineer the existing/legacy ETL jobs
Create the workflow diagrams and review the logic diagrams with Tech Leads
Write equivalent logic in Python & Spark
Unit test the Glue jobs and certify the data loads before passing to system testing
Follow the best practices, enable appropriate audit & control mechanism
Analytically skillful, identify the root causes quickly and efficiently debug issues
Take ownership of the deliverables and support the deployments
REQUIREMENTS:
Create data pipelines for data integration into Cloud stacks eg. Azure Synapse
Code data processing jobs in Azure Synapse Analytics, Python, and Spark
Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.
Should be able to process .json, .parquet and .avro files
PREFERRED BACKGROUND:
Tier1/2 candidates from IIT/NIT/IIITs
However, relevant experience, learning attitude takes precedence
- 5+ years of experience in software development.
- At least 2 years of relevant work experience on large scale Data applications
- Good attitude, strong problem-solving abilities, analytical skills, ability to take ownership as appropriate
- Should be able to do coding, debugging, performance tuning, and deploying the apps to Prod.
- Should have good working experience Hadoop ecosystem (HDFS, Hive, Yarn, File formats like Avro/Parquet)
- Kafka
- J2EE Frameworks (Spring/Hibernate/REST)
- Spark Streaming or any other streaming technology.
- Java programming language is mandatory.
- Good to have experience with Java
- Ability to work on the sprint stories to completion along with Unit test case coverage.
- Experience working in Agile Methodology
- Excellent communication and coordination skills
- Knowledgeable (and preferred hands-on) - UNIX environments, different continuous integration tools.
- Must be able to integrate quickly into the team and work independently towards team goals
- Take the complete responsibility of the sprint stories’ execution
- Be accountable for the delivery of the tasks in the defined timelines with good quality
- Follow the processes for project execution and delivery.
- Follow agile methodology
- Work with the team lead closely and contribute to the smooth delivery of the project.
- Understand/define the architecture and discuss the pros-cons of the same with the team
- Involve in the brainstorming sessions and suggest improvements in the architecture/design.
- Work with other team leads to get the architecture/design reviewed.
- Work with the clients and counterparts (in US) of the project.
- Keep all the stakeholders updated about the project/task status/risks/issues if there are any.
Job Summary
SQL development for our Enterprise Resource Planning (ERP) Product offered to SMEs. Regular modifications , creation and validation with testing of stored procedures , views, functions on MS SQL Server.
Responsibilities and Duties
Understanding the ERP Software and use cases.
Regular Creation,modifications and testing of
- Stored Procedures
- Views
- Functions
- Nested Queries
- Table and Schema Designs
Qualifications and Skills
MS SQL
- Procedural Language
- Datatypes
- Objects
- Databases
- Schema
- Product Analytics: This is the first and most obvious role of the Product Analyst. At this capacity, the Product Analyst is responsible for the development and delivery of tangible consumer benefits through the product or service of the business.
- In addition, in this capacity, the Product Analyst is also responsible for measuring and monitoring the product or service’s performance as well as presenting product-related consumer, market, and competitive intelligence.
- Product Strategy: As a member of the Product team, the Product Analyst is responsible for the development and proposal of product strategies.
- Product Management Operations: The Product Analyst also has the obligation to respond in a timely manner to all requests and inquiries for product information or changes. He also performs the initial product analysis in order to assess the need for any requested changes as well as their potential impact.
- At this capacity, the Product Analyst also undertakes financial modeling on the products or services of the business as well as of the target markets in order to bring about an understanding of the relations between the product and the target market. This information is presented to the Marketing Manager and other stakeholders, when necessary.
- Additionally, the Product Analyst produces reports and makes recommendations to the Product Manager and Product Marketing Manager to be used as guidance in decision-making pertaining to the business’s new as well as existent products.
- Initiative: In this capacity, the Product Analyst ensures that there is a good flow of communication between the Product team and other teams. The Product Analyst ensures this by actively participating in team meetings and keeping everyone up to date.
- Pricing and Development: The Product Analyst has the responsibility to monitor the market, competitor activities, as well as any price movements and make recommendations that will be used in key decision making. In this function, the Product Analyst will normally liaise with other departments such as the credit/risk in the business in order to enhance and increase the efficiency of effecting price changes in accordance with market shifts.
- Customer/Market Intelligence: The Product Analyst has the obligation to drive consumer intelligence through the development of external and internal data sources that improve the business’s understanding of the product’s market, competitor activities, and consumer activities.
- In the performance of this role, the Product Analyst develops or adopts research tools, sources, and methods that further support and contribute to the business’s product.
- Creating, designing and developing data models
- Prepare plans for all ETL (Extract/Transformation/Load) procedures and architectures
- Validating results and creating business reports
- Monitoring and tuning data loads and queries
- Develop and prepare a schedule for a new data warehouse
- Analyze large databases and recommend appropriate optimization for the same
- Administer all requirements and design various functional specifications for data
- Provide support to the Software Development Life cycle
- Prepare various code designs and ensure efficient implementation of the same
- Evaluate all codes and ensure the quality of all project deliverables
- Monitor data warehouse work and provide subject matter expertise
- Hands-on BI practices, data structures, data modeling, SQL skills
- Minimum 1 year experience in Pyspark