• Problem Solving: Resolving production issues, including service P1-4 incidents; handling problems
related to introducing new technology; and resolving major issues in the platform and/or service.
• Software Development Concepts: Understands and is experienced with the use of a wide range of
programming concepts and is also aware of and has applied a range of algorithms.
• Commercial & Risk Awareness: Able to understand & evaluate both obvious and subtle commercial
risks, especially in relation to a programme.
Experience you would be expected to have
• Cloud: experience with one of the following cloud vendors: AWS, Azure or GCP
• GCP: Experience preferred, but willingness to learn is essential.
• Big Data: Experience with Big Data methodology and technologies
• Programming: Python or Java, applied to data work (ETL)
• DevOps: Understanding of how to work in a DevOps and agile way, including versioning, automation
and defect management (mandatory)
• Agile methodology - knowledge of Jira
Azure – Data Engineer
- At least 2 years of hands-on experience in an Agile data engineering team building big data pipelines with Azure in a commercial environment.
- Dealing with senior stakeholders/leadership
- Understanding of Azure data security and encryption best practices. [ADFS/ACLs]
Databricks – experience writing in and using Databricks, using Python to transform and manipulate data.
Data Factory – experience using Data Factory in an enterprise solution to build data pipelines. Experience calling REST APIs.
Synapse/data warehouse – experience using Synapse/data warehouse to present data securely and to build & manage data models.
Microsoft SQL Server – We’d expect the candidate to have come from a SQL/Data background and progressed into Azure
Power BI – Experience with this is preferred
- Experience using Git as a source control system
- Understanding of DevOps concepts and application
- Understanding of Azure Cloud costs/management and running platforms efficiently
Title: Data Engineer (Azure) (Location: Gurgaon/Hyderabad)
Salary: Competitive as per Industry Standard
We are expanding our Data Engineering Team and hiring passionate professionals with extensive
knowledge and experience in building and managing large enterprise data and analytics platforms. We
are looking for creative individuals with strong programming skills, who can understand complex
business and architectural problems and develop solutions. The individual will work closely with the rest
of our data engineering and data science team in implementing and managing Scalable Smart Data
Lakes, Data Ingestion Platforms, Machine Learning and NLP based Analytics Platforms, Hyper-Scale
Processing Clusters, Data Mining and Search Engines.
What You’ll Need:
- 3+ years of industry experience in creating and managing end-to-end Data Solutions, Optimal
Data Processing Pipelines and Architecture dealing with large volume, big data sets of varied
- Proficiency in Python, Linux and shell scripting.
- Strong knowledge of PySpark and Pandas DataFrames for writing efficient pre-processing and other data manipulation tasks.
- Strong experience in developing the infrastructure required for data ingestion and for optimal extraction, transformation, and loading of data from a wide variety of data sources, using tools such as Azure Data Factory and Azure Databricks (or Jupyter notebooks, Google Colab, or other similar tools).
- Working knowledge of GitHub or other version control tools.
- Experience with creating RESTful web services and API platforms.
- Work with data science and infrastructure team members to implement practical machine
learning solutions and pipelines in production.
- Experience with cloud providers like Azure/AWS/GCP.
- Experience with SQL and NoSQL databases such as MySQL, Azure Cosmos DB, HBase, MongoDB and Elasticsearch.
- Experience with stream-processing systems such as Spark Streaming and Kafka, and working experience with event-driven architectures.
- Strong analytic skills related to working with unstructured datasets.
Good to have (to filter or prioritize candidates)
- Experience with testing libraries such as pytest for writing unit-tests for the developed code.
- Knowledge of Machine Learning algorithms and libraries would be good to have;
implementation experience would be an added advantage.
- Knowledge and experience of data lakes, Docker and Kubernetes would be good to have.
- Knowledge of Azure Functions, Elasticsearch, etc. would be good to have.
- Experience with model versioning (MLflow) and data versioning would be beneficial.
- Experience with microservices libraries or with Python libraries such as Flask for hosting ML services and models would be great.
- Key responsibility is to design and develop a data pipeline, including the architecture, prototyping, and development of data extraction, transformation/processing, cleansing/standardizing, and loading into the Data Warehouse at real-time or near-real-time frequency. Source data can be in structured, semi-structured, and/or unstructured format.
- Provide technical expertise to design efficient data ingestion solutions that consolidate data from RDBMS, APIs, messaging queues, weblogs, images, audio, documents, etc. of Enterprise Applications, SaaS applications, external 3rd party sites or APIs, etc. through ETL/ELT, API integrations, Change Data Capture, Robotic Process Automation, custom Python/Java coding, etc.
- Development of complex data transformation using Talend (BigData edition), Python/Java transformation in Talend, SQL/Python/Java UDXs, AWS S3, etc to load in OLAP Data Warehouse in Structured/Semi-structured form
- Development of data model and creating transformation logic to populate models for faster data consumption with simple SQL.
- Implementing automated Audit & Quality assurance checks in Data Pipeline
- Document & maintain data lineage to enable data governance
- Coordination with BIU, IT, and other stakeholders to provide best-in-class data pipeline solutions, exposing data via APIs, loading into downstream systems, NoSQL databases, etc.
- Programming experience using Python / Java, to create functions / UDX
- Extensive technical experience with SQL on RDBMS (Oracle/MySQL/PostgreSQL etc.), including code optimization techniques
- Strong ETL/ELT skillset using Talend BigData Edition. Experience in Talend CDC & MDM functionality will be an advantage.
- Experience & expertise in implementing complex data pipelines, including semi-structured & unstructured data processing
- Expertise in designing efficient data ingestion solutions that consolidate data from RDBMS, APIs, messaging queues, weblogs, images, audio, documents, etc. of Enterprise Applications, SaaS applications, external 3rd party sites or APIs, etc. through ETL/ELT, API integrations, Change Data Capture, Robotic Process Automation, custom Python/Java coding, etc.
- Good understanding & working experience in OLAP Data Warehousing solutions (Redshift, Synapse, Snowflake, Teradata, Vertica, etc) and cloud-native Data Lake (S3, ADLS, BigQuery, etc) solutions
- Familiarity with AWS tool stack for Storage & Processing. Able to recommend the right tools/solutions available to address a technical problem
- Good knowledge of database performance tuning, troubleshooting, and query optimization
- Good analytical skills with the ability to synthesize data to design and deliver meaningful information
- Good knowledge of Design, Development & Performance tuning of 3NF/Flat/Hybrid Data Model
- Knowledge of any NoSQL DB (DynamoDB, MongoDB, CosmosDB, etc.) will be an advantage.
- Ability to understand business functionality, processes, and flows
- Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently
- Data Governance & Quality Assurance
- Distributed computing
- Data structures and algorithms
- Unstructured Data Processing
Work days: Sun-Thu
• Should have a minimum of 3 years of experience in Data Factory and Data Lake.
• Should have exposure to Azure SQL Database and Azure Data Warehouse.
• Exposure to T-SQL, with experience in Azure SQL DW.
• Experience handling structured and unstructured datasets.
• Excellent problem-solving, critical and analytical thinking skills.
• Understanding of, and hands-on work with, Azure concepts.
• Good Communication
PriceLabs (chicagobusiness.com/innovators/what-if-you-could-adjust-prices-meet-demand) is cloud-based software for vacation and short-term rentals that helps them dynamically manage prices just the way large hotels and airlines do! Our mission is to help small businesses in the travel and tourism industry by giving them access to advanced analytical systems that are often restricted to large companies.
We're looking for someone with strong analytical capabilities who wants to understand how our current architecture and algorithms work, and help us design and develop long-lasting solutions to address those. Depending on the needs of the day, the role will come with a good mix of teamwork, following our best practices, introducing us to industry best practices, independent thinking, and ownership of your work.
- Design, develop and enhance our pricing algorithms to enable new capabilities.
- Process, analyze, model, and visualize findings from our market-level supply and demand data.
- Build and enhance internal and customer-facing dashboards to better track metrics and trends that help customers use PriceLabs more effectively.
- Take ownership of product ideas and design discussions.
- Occasional travel to conferences to interact with prospective users and partners, and learn where the industry is headed.
- Bachelor's, Master's or Ph.D. in Operations Research, Industrial Engineering, Statistics, Computer Science or other quantitative/engineering fields.
- Strong understanding of analysis of algorithms, data structures and statistics.
- Solid programming experience, including the ability to quickly prototype an idea and test it out.
- Strong communication skills, including the ability and willingness to explain complicated algorithms and concepts in simple terms.
- Experience with relational databases and strong knowledge of SQL.
- Experience building data heavy analytical models in the travel industry.
- Experience in the vacation rental industry.
- Experience developing dynamic pricing models.
- Prior experience working in a fast-paced environment.
- Willingness to wear many hats.
- Installing and configuring Informatica components, including high availability; managing server activations and de-activations for all environments; ensuring that all systems and procedures adhere to organizational best practices
- Day to day administration of the Informatica Suite of services (PowerCenter, IDS, Metadata, Glossary and Analyst).
- Informatica capacity planning and on-going monitoring (e.g. CPU, Memory, etc.) to proactively increase capacity as needed.
- Manage backup and security of Data Integration Infrastructure.
- Design, develop, and maintain all data warehouses, data marts, and ETL functions for the organization as part of an infrastructure team.
- Consult with users, management, vendors, and technicians to assess computing needs and system requirements.
- Develop and interpret organizational goals, policies, and procedures.
- Evaluate the organization's technology use and needs and recommend improvements, such as software upgrades.
- Prepare and review operational reports or project progress reports.
- Assist in the daily operations of the Architecture Team, analyzing workflow, establishing priorities, developing standards, and setting deadlines.
- Work with vendors to manage support SLAs and influence vendor product roadmaps
- Provide leadership and guidance in technical meetings, define standards and assist/provide status updates
- Work with cross functional operations teams such as systems, storage and network to design technology stacks.
- Minimum of 6+ years’ experience in an Informatica Engineer and Developer role
- Minimum of 5+ years’ experience in an ETL environment as a developer.
- Minimum of 5+ years of experience in SQL coding and understanding of databases
- Proficiency in Python
- Proficiency in command line troubleshooting
- Proficiency in writing code in Perl/Shell scripting languages
- Understanding of Java and concepts of Object-oriented programming
- Good understanding of systems, networking, and storage
- Strong knowledge of scalability and high availability