Mandatory Skills: Azure Data Lake Storage, Azure SQL databases, Azure Synapse, Databricks (PySpark/Spark), Python, SQL, Azure Data Factory.
Good to have: Power BI, Azure IaaS services, Azure DevOps, Microsoft Fabric
- Very strong understanding of ETL and ELT (a minimal PySpark ELT sketch follows this list)
- Very strong understanding of lakehouse architecture
- Very strong knowledge of PySpark and Spark architecture
- Good knowledge of Azure Data Lake architecture and access controls
- Good knowledge of Microsoft Fabric architecture
- Good knowledge of Azure SQL databases
- Good knowledge of T-SQL
- Good knowledge of CI/CD processes using Azure DevOps
- Power BI
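For the ETL/ELT and lakehouse items above, the following is a minimal PySpark sketch of a typical bronze-to-silver ELT step; the storage account, container, paths, table name, and columns are hypothetical, and it assumes a Databricks or otherwise Delta-Lake-enabled Spark session.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Assumes a Databricks / Delta-Lake-enabled session; names and paths are illustrative.
spark = SparkSession.builder.appName("bronze-to-silver-elt").getOrCreate()

# Load raw JSON landed in ADLS Gen2 (hypothetical container and path).
raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/sales/2024/")

# Transform after loading (the "T" of ELT): light cleanup typical of bronze -> silver.
silver = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_date"))
       .filter(F.col("amount") > 0)
)

# Persist as a Delta table for downstream Synapse / Power BI consumption.
silver.write.format("delta").mode("overwrite").saveAsTable("sales_silver")

The same pattern applies regardless of whether the curated layer is consumed from Synapse, Power BI, or Fabric.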
Requirements
- Strong SQL coding skills including index builds, performance tuning, triggers, functions, and stored procedures.
- Good knowledge of SSIS development, T-SQL, Views, and CTEs.
- Good knowledge of SSIS, Azure Data Factory, Synapse, and Azure Data Lake Storage Gen2
- Excellent understanding of relational databases
- Proven experience manipulating large data sets e.g., millions of records
- Good knowledge of data warehousing
- Well-versed in ETL and ELT
- Good working knowledge of MS SQL Server with ability to take and restore dumps, perform admin tasks, etc.
- Good understanding and experience in software design principles, SOLID principles, and design patterns
- Good to have: experience working with Microsoft Azure SQL Server and Azure Data Warehouse environments
- Ability to communicate and work well in an agile-based environment.
- Able to work as an individual contributor and be a good team player.
- Familiarity and working experience with SDLC processes
- Possess problem-analysis and problem-solving skills
- Excellent written and verbal communication skills
Responsibilities
- Design and implement Azure BI infrastructure, ensure overall quality of delivered solution
- Develop analytical & reporting tools, promote and drive adoption of developed BI solutions
- Actively participate in BI community
- Establish and enforce technical standards and documentation
- Participate in daily scrums
- Record progress daily in assigned Azure DevOps work items
Ideal Candidates should have
- 5+ years of experience in a similar senior business intelligence development position
- To be successful in the role, you will need a high level of expertise across all facets of the Microsoft BI stack and prior experience designing and developing well-performing data warehouse solutions
- Demonstrated experience using development tools such as Azure SQL database, Azure Data Factory, Azure Data Lake, Azure Synapse, and Azure DevOps.
- Experience with development methodologies including Agile, DevOps, and CI/CD patterns
- Strong oral and written communication skills in English
- Ability and willingness to learn quickly and continuously
- Bachelor's degree in Computer Science
- Proficient in SQL Server/T-SQL programming, including the creation and optimization of complex stored procedures, UDFs, CTEs, and triggers
- Overall experience should be between 4 and 7 years
- Experience working in a data warehouse environment and a strong understanding of dimensional data modeling concepts. Experience in SQL Server, DW principles, and SSIS.
- Should have strong experience in building data transformations with SSIS, including importing data from files and moving data from source to destination.
- Creating new SSIS packages or modifying existing SSIS packages using SQL Server
- Debug and fine-tune SSIS processes to ensure accurate and efficient movement of data. Experience with ETL testing & data validation.
- 1+ years of experience with Azure services like Azure Data Factory, Data Flow, Azure Blob Storage, etc.
- 1+ years of experience developing Azure Data Factory objects - ADF pipelines, configuration, parameters, variables, and the integration runtime.
- Must be able to build Business Intelligence solutions in a collaborative, agile development environment.
- Reporting experience with Power BI or SSRS is a plus.
- Experience working on an Agile/Scrum team preferred.
- Proven strong problem-solving skills, troubleshooting, and root cause analysis.
- Excellent written and verbal communication skills.
BRIEF DESCRIPTION:
At least 1 year of Python, Spark, SQL, and data engineering experience
Primary Skillset: PySpark, Scala/Python/Spark, Azure Synapse, S3, Redshift/Snowflake
Relevant Experience: migration of legacy ETL jobs to AWS Glue using a Python & Spark combination
ROLE SCOPE:
Reverse-engineer the existing/legacy ETL jobs
Create the workflow diagrams and review the logic diagrams with Tech Leads
Write equivalent logic in Python & Spark (a minimal Glue job sketch follows this list)
Unit test the Glue jobs and certify the data loads before passing to system testing
Follow best practices and enable appropriate audit & control mechanisms
Apply strong analytical skills to identify root causes quickly and debug issues efficiently
Take ownership of the deliverables and support the deployments
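As a rough illustration of what the "equivalent logic in Python & Spark" step can look like, here is a minimal AWS Glue job sketch; the S3 paths, column names, and the filter/derivation are hypothetical placeholders for whatever the legacy ETL job actually implements.

import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Standard Glue job bootstrap; JOB_NAME is supplied by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Re-implemented legacy logic (illustrative): read the source extract,
# apply the same filter/derivation the legacy job performed, write to the target zone.
src = spark.read.parquet("s3://example-bucket/legacy-extract/orders/")
out = (
    src.filter(F.col("status") == "COMPLETE")
       .withColumn("load_date", F.current_date())
)
out.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")

job.commit()

Keeping the job in plain DataFrame code like this also makes the unit-testing and certification step easier, since the same logic can be exercised locally against sample files.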
REQUIREMENTS:
Create data pipelines for data integration into cloud stacks, e.g., Azure Synapse
Code data processing jobs in Azure Synapse Analytics, Python, and Spark
Experience in dealing with structured, semi-structured, and unstructured data in batch and real-time environments.
Should be able to process .json, .parquet, and .avro files
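A minimal PySpark sketch of reading those three file formats; the paths are placeholders, and the Avro reader assumes the external spark-avro package is available on the cluster.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("multi-format-read").getOrCreate()

# JSON and Parquet readers are built in; the paths below are illustrative.
json_df = spark.read.json("/data/landing/events/*.json")
parquet_df = spark.read.parquet("/data/landing/orders/")

# Avro requires the external spark-avro package (org.apache.spark:spark-avro)
# to be on the classpath.
avro_df = spark.read.format("avro").load("/data/landing/customers/*.avro")

# Quick structural check on each source before any downstream processing.
for name, df in [("json", json_df), ("parquet", parquet_df), ("avro", avro_df)]:
    print(name, df.count(), "rows")
    df.printSchema()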
PREFERRED BACKGROUND:
Tier-1/2 candidates from IITs/NITs/IIITs
However, relevant experience and a learning attitude take precedence
Responsibilities
- Design, plan, and control the implementation of business solution requests/demands.
- Execute best practices in design and coding, and guide the rest of the team in accordance with them.
- Gather requirements and specifications to understand the client's needs in detail and translate them into system requirements
- Drive complex technical projects from planning through execution
- Perform code review and manage technical debt
- Handle release deployments and production issues
- Coordinate stress tests, stability evaluations, and support for the concurrent processing of specific solutions
- Participate in project estimation, provide inputs for solution delivery, conduct technical risk planning, perform code reviews and unit test plan reviews
Skills
- Degree in Informatics Engineering, Computer Science, or in similar areas
- Minimum of 5 years' work experience in similar roles
- Expert knowledge in developing cloud-based applications with Java, Spring Boot, Spring REST, Spring JPA, and Spring Cloud
- Strong understanding of Azure Data Services
- Strong working knowledge of SQL Server, Azure SQL Database, NoSQL, data modeling, Azure AD, ADFS, and Identity & Access Management.
- Hands-on experience in ThingWorx platform (Application development, Mashups creation, Installation of ThingWorx and ThingWorx components)
- Strong knowledge of IoT Platform
- Development experience with microservices architecture best practices, Docker, and Kubernetes
- Experience designing/maintaining/tuning high-performance code to ensure optimal performance
- Strong knowledge of web security practices
- Experience working in Agile Development
- Knowledge of Google Cloud Platform and Kubernetes
- Good understanding of Git, source control procedures, and feature branching
- Fluent in English - written and spoken (mandatory)
Role: Talend Production Support Consultant
Brief Job Description:
- Be involved in release deployment and monitoring of the ETL pipelines.
- Work closely with the development team and business team to provide operational support.
- The candidate should have good knowledge of and hands-on experience with the tools/technologies below:
Talend (Talend Studio, TAC, TMC), SAP BODS, SQL, Hive & Azure (Azure fundamentals, ADB, ADF)
- Hands-on experience in CI/CD is an added advantage.
As discussed, please provide your LinkedIn profile URL and a valid ID proof.
Please also confirm that you will relocate to Bangalore when required.
Role Description:
- You will be part of the data delivery team and will have the opportunity to develop a deep understanding of the domain/function.
- You will design and drive the work plan for the optimization/automation and standardization of the processes incorporating best practices to achieve efficiency gains.
- You will run data engineering pipelines, link raw client data with data model, conduct data assessment, perform data quality checks, and transform data using ETL tools.
- You will perform data transformations, modeling, and validation activities, as well as configure applications to the client context. You will also develop scripts to validate, transform, and load raw data using programming languages such as Python and/or PySpark (a minimal sketch follows this description).
- In this role, you will determine database structural requirements by analyzing client operations, applications, and programming.
- You will develop cross-site relationships to enhance idea generation, and manage stakeholders.
- Lastly, you will collaborate with the team to support ongoing business processes by delivering high-quality end products on time and performing quality checks wherever required.
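As a minimal illustration of the data-quality-check and transformation work described above, the PySpark sketch below computes a few basic checks before loading a staging table; the table, columns, and checks are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

# Hypothetical raw client extract already registered as a table or view.
df = spark.table("raw_client_orders")

# Basic data-quality metrics: row count, nulls in a key column, duplicate keys.
total_rows = df.count()
null_customer_ids = df.filter(F.col("customer_id").isNull()).count()
duplicate_order_ids = (
    df.groupBy("order_id").count().filter(F.col("count") > 1).count()
)
print(f"rows={total_rows}, null_customer_ids={null_customer_ids}, duplicate_order_ids={duplicate_order_ids}")

# Simple transformation: standardize dates and amounts before loading into the data model.
clean = (
    df.dropDuplicates(["order_id"])
      .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
      .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)
clean.write.mode("overwrite").saveAsTable("stg_client_orders")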
Job Requirement:
- Bachelor’s degree in Engineering or Computer Science; Master’s degree is a plus
- 3+ years of professional work experience with a reputed analytics firm
- Expertise in handling large amounts of data through Python or PySpark
- Conduct data assessment, perform data quality checks and transform data using SQL and ETL tools
- Experience of deploying ETL / data pipelines and workflows in cloud technologies and architecture such as Azure and Amazon Web Services will be valued
- Comfort with data modelling principles (e.g. database structure, entity relationships, UID etc.) and software development principles (e.g. modularization, testing, refactoring, etc.)
- A thoughtful and comfortable communicator (verbal and written) with the ability to facilitate discussions and conduct training
- Strong problem-solving, requirement-gathering, and leadership skills
- Track record of completing projects successfully on time, within budget, and as per scope
Responsibilities
- Installing and configuring Informatica components, including high availability; managing server activations and de-activations for all environments; ensuring that all systems and procedures adhere to organizational best practices
- Day-to-day administration of the Informatica suite of services (PowerCenter, IDS, Metadata, Glossary, and Analyst).
- Informatica capacity planning and ongoing monitoring (e.g., CPU, memory, etc.) to proactively increase capacity as needed.
- Manage backup and security of Data Integration Infrastructure.
- Design, develop, and maintain all data warehouse, data marts, and ETL functions for the organization as a part of an infrastructure team.
- Consult with users, management, vendors, and technicians to assess computing needs and system requirements.
- Develop and interpret organizational goals, policies, and procedures.
- Evaluate the organization's technology use and needs and recommend improvements, such as software upgrades.
- Prepare and review operational reports or project progress reports.
- Assist in the daily operations of the Architecture Team, analyzing workflow, establishing priorities, developing standards, and setting deadlines.
- Work with vendors to manage support SLAs and influence vendor product roadmaps
- Provide leadership and guidance in technical meetings, define standards, and assist with/provide status updates
- Work with cross-functional operations teams such as systems, storage, and network to design technology stacks.
Preferred Qualifications
- Minimum of 6 years' experience in an Informatica Engineer and Developer role
- Minimum of 5+ years’ experience in an ETL environment as a developer.
- Minimum of 5+ years of experience in SQL coding and understanding of databases
- Proficiency in Python
- Proficiency in command line troubleshooting
- Proficiency in writing code in Perl/Shell scripting languages
- Understanding of Java and concepts of Object-oriented programming
- Good understanding of systems, networking, and storage
- Strong knowledge of scalability and high availability
We are looking for a Data Engineer who will be responsible for collecting, storing, processing, and analyzing huge sets of data coming from different sources.
Responsibilities
- Work with Big Data tools and frameworks to provide requested capabilities
- Identify development needs in order to improve and streamline operations
- Develop and manage BI solutions
- Implement ETL processes and data warehousing
- Monitor performance and manage infrastructure
Skills
- Proficient understanding of distributed computing principles
- Proficiency with Hadoop and Spark
- Experience building stream-processing systems using solutions such as Kafka and Spark Streaming (a minimal sketch follows this list)
- Good knowledge of data querying tools such as SQL and Hive
- Knowledge of various ETL techniques and frameworks
- Experience with Python/Java/Scala (at least one)
- Experience with cloud services such as AWS or GCP
- Experience with NoSQL databases such as DynamoDB or MongoDB will be an advantage
- Excellent written and verbal communication skills
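For the stream-processing item above, here is a minimal Spark Structured Streaming sketch that reads from Kafka; the broker address, topic name, and sink are hypothetical, and it assumes the spark-sql-kafka connector package is available on the cluster.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Assumes the spark-sql-kafka-0-10 connector is on the classpath.
spark = SparkSession.builder.appName("kafka-stream-demo").getOrCreate()

# Read a stream of events from a hypothetical Kafka topic.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "clickstream")
         .option("startingOffsets", "latest")
         .load()
)

# Kafka delivers key/value as binary; cast the value to string and count events per minute.
counts = (
    events.selectExpr("CAST(value AS STRING) AS value", "timestamp")
          .groupBy(F.window("timestamp", "1 minute"))
          .count()
)

# Write the running counts to the console (swap for a real sink such as Delta or a database).
query = (
    counts.writeStream.outputMode("complete")
          .format("console")
          .start()
)
query.awaitTermination()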