About Ignite Solutions
We’re a team of passionate individuals who delight in creating highly impactful products. We work with companies who have ambitious ideas and want to see them brought to life.
Work With Us
- We help ambitious companies with UX design, technology strategy, and software development.
- We believe in small teams. Small teams and smart individuals can make things happen.
- We have an open, honest culture. As part of our team, you will get to contribute to our ideas, our plans, and our success.
- We treat you with respect and trust you will do the same to your team members.
- We hire carefully, looking for attitude along with aptitude.
- We like to have fun so bring your sense of humour with you.
Similar jobs
Role: Data Engineer
Company: PayU
Location: Bangalore/Mumbai
Experience: 2-5 yrs
About Company:
PayU is the payments and fintech business of Prosus, a global consumer internet group and one of the largest technology investors in the world. Operating and investing globally in markets with long-term growth potential, Prosus builds leading consumer internet companies that empower people and enrich communities.
The leading online payment service provider in 36 countries, PayU is dedicated to creating a fast, simple and efficient payment process for merchants and buyers. Focused on empowering people through financial services and creating a world without financial borders where everyone can prosper, PayU is one of the biggest investors in the fintech space globally, with investments totalling $700 million to date. PayU also specializes in credit products and services for emerging markets across the globe. We are dedicated to removing risks to merchants, allowing consumers to use credit in ways that suit them and enabling a greater number of global citizens to access credit services.
Our local operations in Asia, Central and Eastern Europe, Latin America, the Middle East, Africa and South East Asia enable us to combine the expertise of high growth companies with our own unique local knowledge and technology to ensure that our customers have access to the best financial services.
India is the biggest market for PayU globally, and the company has already invested $400 million in this region in the last 4 years. In its next phase of growth, PayU is developing a full regional fintech ecosystem providing multiple digital financial services in one integrated experience. We are going to do this through three mechanisms: build, co-build/partner, and select strategic investments.
PayU supports 350,000+ merchants and millions of consumers making payments online with over 250 payment methods and 1,800+ payment specialists. The markets in which PayU operates represent a potential consumer base of nearly 2.3 billion people and a huge growth potential for merchants.
Job responsibilities:
- Design infrastructure for data, especially, but not limited to, data consumed in machine learning applications
- Define database architecture needed to combine and link data, and ensure integrity across different sources
- Ensure performance of data systems ranging from machine learning, to customer-facing web and mobile applications built on cutting-edge open source frameworks, to highly available RESTful services, to back-end Java-based systems
- Work with large, fast, complex data sets to solve difficult, non-routine analysis problems, applying advanced data handling techniques if needed
- Build data pipelines, including implementing, testing, and maintaining infrastructural components related to the data engineering stack
- Work closely with Data Engineers, ML Engineers and SREs to gather data engineering requirements to prototype, develop, validate and deploy data science and machine learning solutions
Requirements to be successful in this role:
- Strong knowledge and experience in Python, Pandas, data wrangling, ETL processes, statistics, data visualisation, data modelling and Informatica
- Strong experience with scalable compute solutions such as Kafka and Snowflake
- Strong experience with workflow management libraries and tools such as Airflow and AWS Step Functions
- Strong experience with data engineering practices (i.e. data ingestion pipelines and ETL)
- A good understanding of machine learning methods, algorithms, pipelines, testing practices and frameworks
- (Preferred) MEng/MSc/PhD degree in computer science, engineering, mathematics, physics, or equivalent (preference: DS/AI)
- Experience with designing and implementing tools that support sharing of data, code, practices across organizations at scale
Job Description
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices
- Scripting languages: Python and PySpark
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Ingest data from different data sources that expose data using different technologies, such as RDBMS, flat files, streams, and time-series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Process and transform data using various technologies such as Spark and cloud services. You will need to understand your part of the business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data
- Define process improvement opportunities to optimize data collection, insights and displays
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders
- Be a key participant in regular Scrum ceremonies with the agile teams
- Be proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or a related discipline
- Advanced knowledge of one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
  - Data mining/programming tools (e.g. SAS, SQL, R, Python)
  - Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
  - Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines
- Good written and oral communication skills and the ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools
- Passionate about search & AI technologies. Open to collaborating with colleagues & external contributors.
- Good understanding of the mainstream deep learning models from multiple domains: computer vision, NLP, reinforcement learning, model optimization, etc.
- Hands-on experience with deep learning frameworks, e.g. TensorFlow, PyTorch, MXNet, and with models such as BERT. Able to implement the latest DL models in a short time using existing APIs and open-source libraries.
- Hands-on experience with cloud-native techniques. Good understanding of web services and modern software technologies.
- Has maintained or contributed to machine learning projects; familiar with the agile software development process, CI/CD workflows, ticket management, code review, version control, etc.
- Skilled in the following programming languages: Python 3.
- Good English skills, especially for reading and writing documentation.
● Good communication and collaboration skills with 4-7 years of experience.
● Ability to code and script, with a strong grasp of CS fundamentals and excellent problem-solving abilities.
● Comfort with frequent, incremental code testing and deployment; data management skills.
● Good understanding of RDBMS.
● Experience in building data pipelines and processing large datasets.
● Knowledge of web scraping and data mining is a plus.
● Working knowledge of open source data stores such as MySQL, Solr, Elasticsearch, and Cassandra would be a plus.
● Expert in Python programming
Role and responsibilities
● Inclined towards working in a start-up environment.
● Comfort with frequent, incremental code testing and deployment; data management skills.
● Design and build robust and scalable data engineering solutions for structured and unstructured data for delivering business insights, reporting and analytics.
● Expertise in troubleshooting, debugging, resolving data completeness and quality issues, and scaling overall system performance.
● Build robust APIs that power our delivery points (dashboards, visualizations and other integrations).
Roles and responsibilities:
- Responsible for development and maintenance of applications with technologies involving Enterprise Java and Distributed technologies.
- Experience with Hadoop, Kafka, Spark, Elasticsearch, SQL, Kibana, and Python, as well as with machine learning and analytics.
- Collaborate with developers, product managers, business analysts and business users in conceptualizing, estimating and developing new software applications and enhancements.
- Collaborate with QA team to define test cases, metrics, and resolve questions about test results.
- Assist in the design and implementation process for new products, research and create POC for possible solutions.
- Develop components based on business and/or application requirements
- Create unit tests in accordance with team policies & procedures
- Advise and mentor team members in specialized technical areas, and fulfill administrative duties as defined by the support process
- Work with cross-functional teams during crises to address and resolve complex incidents and problems, in addition to assessing, analyzing, and resolving cross-functional issues.
Responsibilities
- Installing and configuring Informatica components, including high availability; managing server activations and de-activations for all environments; ensuring that all systems and procedures adhere to organizational best practices
- Day to day administration of the Informatica Suite of services (PowerCenter, IDS, Metadata, Glossary and Analyst).
- Informatica capacity planning and on-going monitoring (e.g. CPU, Memory, etc.) to proactively increase capacity as needed.
- Manage backup and security of Data Integration Infrastructure.
- Design, develop, and maintain all data warehouse, data marts, and ETL functions for the organization as a part of an infrastructure team.
- Consult with users, management, vendors, and technicians to assess computing needs and system requirements.
- Develop and interpret organizational goals, policies, and procedures.
- Evaluate the organization's technology use and needs and recommend improvements, such as software upgrades.
- Prepare and review operational reports or project progress reports.
- Assist in the daily operations of the Architecture Team, analyzing workflow, establishing priorities, developing standards, and setting deadlines.
- Work with vendors to manage support SLAs and influence the vendor product roadmap
- Provide leadership and guidance in technical meetings, define standards and assist/provide status updates
- Work with cross functional operations teams such as systems, storage and network to design technology stacks.
Preferred Qualifications
- Minimum of 6 years’ experience in an Informatica Engineer and Developer role
- Minimum of 5 years’ experience in an ETL environment as a developer
- Minimum of 5 years’ experience in SQL coding and an understanding of databases
- Proficiency in Python
- Proficiency in command line troubleshooting
- Proficiency in writing code in Perl/Shell scripting languages
- Understanding of Java and concepts of Object-oriented programming
- Good understanding of systems, networking, and storage
- Strong knowledge of scalability and high availability