Dataflow architecture jobs

11+ Dataflow architecture Jobs in India

Apply to 11+ Dataflow architecture Jobs on CutShort.io. Find your next job, effortlessly. Browse Dataflow architecture Jobs and apply today!

Pune
10 - 18 yrs
₹35L - ₹40L / yr
Google Cloud Platform (GCP)
Dataflow architecture
Data migration
Data processing
Big Data
+4 more

CANDIDATE WILL BE DEPLOYED IN A FINANCIAL CAPTIVE ORGANIZATION @ PUNE (KHARADI)

 

Below are the job Details :-

 

Experience 10 to 18 years

 

Mandatory skills –

  • Data migration
  • Dataflow

The ideal candidate for this role will have the below experience and qualifications:  

  • Experience of building a range of Services in a Cloud Service provider (ideally GCP)  
  • Hands-on design and development on Google Cloud Platform (GCP) across a wide range of GCP services, including hands-on experience of GCP storage and database technologies.
  • Hands-on experience in architecting, designing or implementing solutions on GCP, K8s, and other Google technologies. Experience with security and compliance, e.g. IAM and cloud compliance/auditing/monitoring tools.
  • Desired skills within the GCP stack: Cloud Run, GKE, Serverless, Cloud Functions, Vision API, DLP, Dataflow, Data Fusion
  • Prior experience of migrating on-prem applications to cloud environments. Knowledge and hands-on experience of Stackdriver, Pub/Sub, VPC, subnets, route tables, load balancers and firewalls, both on-premises and on GCP.
  • Integrate, configure, deploy and manage centrally provided common cloud services (e.g. IAM, networking, logging, Operating systems, Containers.)  
  • Manage SDN in GCP. Knowledge and experience of DevOps technologies for Continuous Integration & Delivery in GCP using Jenkins.
  • Hands-on experience of Terraform, Kubernetes, Docker and Stackdriver
  • Programming experience in one or more of the following languages: Python, Ruby, Java, JavaScript, Go, Groovy, Scala  
  • Knowledge or experience in DevOps tooling such as Jenkins, Git, Ansible, Splunk, Jira or Confluence, AppD, Docker, Kubernetes  
  • Act as a consultant and subject matter expert for internal teams to resolve technical deployment obstacles and improve the product's vision. Ensure compliance with centrally defined Security 
  • Financial experience is preferred 
  • Ability to learn new technologies and rapidly prototype newer concepts 
  • Top-down thinker, excellent communicator, and great problem solver

 

Exp:- 10  to 18 years

 

Location:- Pune

 

Candidate must have experience in below.

  • GCP Data Platform
  • Data processing: Dataflow, Dataprep, Data Fusion
  • Data storage: BigQuery, Cloud SQL
  • Pub/Sub, GCS buckets
TensorGo Software Private Limited
Deepika Agarwal
Posted by Deepika Agarwal
Remote only
5 - 8 yrs
₹5L - ₹15L / yr
Python
PySpark
Apache Airflow
Spark
Hadoop
+4 more

Requirements:

● Understanding our data sets and how to bring them together.

● Working with our engineering team to support custom solutions offered to the product development.

● Filling the gap between development, engineering and data ops.

● Creating, maintaining and documenting scripts to support ongoing custom solutions.

● Excellent organizational skills, including attention to precise details

● Strong multitasking skills and ability to work in a fast-paced environment

● 5+ years experience with Python to develop scripts.

● Know your way around RESTful APIs (able to integrate with them; publishing them is not necessary).

● You are familiar with pulling and pushing files from SFTP and AWS S3.

● Experience with any Cloud solutions including GCP / AWS / OCI / Azure.

● Familiarity with SQL programming to query and transform data from relational Databases.

● Familiarity with working on Linux (and in a Linux work environment).

● Excellent written and verbal communication skills

● Extracting, transforming, and loading data into internal databases and Hadoop

● Optimizing our new and existing data pipelines for speed and reliability

● Deploying product build and product improvements

● Documenting and managing multiple repositories of code

● Experience with SQL and NoSQL databases (Cassandra, MySQL)

● Hands-on experience in data pipelining and ETL (any of these frameworks/tools: Hadoop, BigQuery, Redshift, Athena)

● Hands-on experience in Airflow

● Understanding of best practices and common coding patterns around storing, partitioning, warehousing and indexing of data

● Experience in reading data from Kafka topics (both live stream and offline)

● Experience in PySpark and DataFrames

Responsibilities:

You’ll be:

● Collaborating across an agile team to continuously design, iterate, and develop big data systems.

● Extracting, transforming, and loading data into internal databases.

● Optimizing our new and existing data pipelines for speed and reliability.

● Deploying new products and product improvements.

● Documenting and managing multiple repositories of code.

Agilisium
Agency job
via Recruiting India by Moumita Santra
Chennai
10 - 19 yrs
₹12L - ₹40L / yr
Big Data
Apache Spark
Spark
PySpark
ETL
+1 more

Job Sector: IT, Software

Job Type: Permanent

Location: Chennai

Experience: 10 - 20 Years

Salary: 12 – 40 LPA

Education: Any Graduate

Notice Period: Immediate

Key Skills: Python, Spark, AWS, SQL, PySpark

Contact at triple eight two zero nine four two double seven

 

Job Description:

Requirements

  • Minimum 12 years experience
  • In-depth understanding and knowledge of distributed computing with Spark.
  • Deep understanding of Spark architecture and internals
  • Proven experience in data ingestion, data integration and data analytics with Spark, preferably PySpark.
  • Expertise in ETL processes, data warehousing and data lakes.
  • Hands-on with Python for big data and analytics.
  • Hands-on experience of the agile scrum model is an added advantage.
  • Knowledge of CI/CD and orchestration tools is desirable.
  • Knowledge of AWS S3, Redshift and Lambda is preferred
Thanks
Ganit Business Solutions

at Ganit Business Solutions

3 recruiters
Vijitha VS
Posted by Vijitha VS
Remote only
4 - 7 yrs
₹10L - ₹30L / yr
Scala
ETL
Informatica
Data Warehouse (DWH)
Big Data
+4 more

Job Description:

We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large volumes of daily data, built data warehouses to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.

Responsibilities:

  • Develop, test, and implement data solutions based on functional / non-functional business requirements.
  • You would be required to code in Scala and PySpark daily on Cloud as well as on-prem infrastructure
  • Build data models to store the data in the most optimized manner
  • Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Implementing the ETL process and optimal data pipeline architecture
  • Monitoring performance and advising any necessary infrastructure changes.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Proactively identify potential production issues and recommend and implement solutions
  • Must be able to write quality code and build secure, highly available systems.
  • Create design documents that describe the functionality, capacity, architecture, and process.
  • Review peers' code and pipelines before deploying to production, checking for optimization issues and adherence to code standards

Skill Sets:

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
  • Proficient understanding of distributed computing principles
  • Experience in working with batch processing/ real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
  • Implemented complex projects dealing with the considerable data size (PB).
  • Optimization techniques (performance, scalability, monitoring, etc.)
  • Experience with integration of data from multiple data sources
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
  • Knowledge of various ETL techniques and frameworks, such as Flume
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Creation of DAGs for data engineering
  • Expert at Python/Scala programming, especially for data engineering/ETL purposes

 

 

 

SpringML

at SpringML

1 video
4 recruiters
Sai Raj Sampath
Posted by Sai Raj Sampath
Remote, Hyderabad
4 - 9 yrs
₹12L - ₹20L / yr
Big Data
Data engineering
TensorFlow
Apache Spark
Java
+2 more
REQUIRED SKILLS:

• Total of 4+ years of experience in development, architecting/designing and implementing Software solutions for enterprises.

• Must have strong programming experience in either Python or Java/J2EE.

• Minimum of 4 years' experience working with various cloud platforms, preferably Google Cloud Platform.

• Experience in architecting and designing solutions leveraging Google Cloud products such as BigQuery, Cloud Dataflow, Cloud Pub/Sub, Cloud Bigtable and TensorFlow will be highly preferred.

• Presentation skills with a high degree of comfort speaking with management and developers

• The ability to work in a fast-paced work environment

• Excellent communication, listening, and influencing skills

RESPONSIBILITIES:

• Lead teams to implement and deliver software solutions for Enterprises by understanding their requirements.

• Communicate efficiently and document the Architectural/Design decisions to customer stakeholders/subject matter experts.

• Opportunity to learn new products quickly and rapidly comprehend new technical areas – technical/functional and apply detailed and critical thinking to customer solutions.

• Implementing and optimizing cloud solutions for customers.

• Migration of Workloads from on-prem/other public clouds to Google Cloud Platform.

• Provide solutions to team members for complex scenarios.

• Promote good design and programming practices with various teams and subject matter experts.

• Ability to work on any product on the Google cloud platform.

• Must be hands-on and be able to write code as required.

• Ability to lead junior engineers and conduct code reviews



QUALIFICATION:

• Minimum B.Tech/B.E Engineering graduate
first principle labs

at first principle labs

1 recruiter
Ankit Goenka
Posted by Ankit Goenka
Pune
3 - 7 yrs
₹12L - ₹18L / yr
Data Science
Python
R Programming
Big Data
Hadoop
The selected candidate will be part of the in-house Data Labs team and will be responsible for creating an insights-driven decision structure.

This will include:

Scorecards
Strategies
MIS

The verticals included are:

Risk
Marketing
Product
BDI Plus Lab

at BDI Plus Lab

2 recruiters
Silita S
Posted by Silita S
Bengaluru (Bangalore)
3 - 7 yrs
₹5L - ₹12L / yr
Big Data
Hadoop
Java
Python
PySpark
+1 more

Roles and responsibilities:

 

  1. Responsible for development and maintenance of applications with technologies involving Enterprise Java and distributed technologies.
  2. Experience in Hadoop, Kafka, Spark, Elasticsearch, SQL, Kibana and Python, plus experience with machine learning, analytics, etc.
  3. Collaborate with developers, product managers, business analysts and business users in conceptualizing, estimating and developing new software applications and enhancements.
  4. Collaborate with QA team to define test cases, metrics, and resolve questions about test results.
  5. Assist in the design and implementation process for new products, research and create POC for possible solutions.
  6. Develop components based on business and/or application requirements
  7. Create unit tests in accordance with team policies & procedures
  8. Advise, and mentor team members in specialized technical areas as well as fulfill administrative duties as defined by support process
  9. Work with cross-functional teams during crisis to address and resolve complex incidents and problems in addition to assessment, analysis, and resolution of cross-functional issues. 
MNC

at MNC

Agency job
via I Squaresoft by Khadri SH
Remote only
5 - 8 yrs
₹10L - ₹20L / yr
ETL
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
SSIS
Cloud Datawarehouse
Hi,

Job Description

Problem Formulation: Identifies possible options to address business problems and must possess a good understanding of dimensional modelling

Must have worked on at least one end-to-end project using any cloud data warehouse (Azure Synapse, AWS Redshift, Google BigQuery)

Good to have an understanding of Power BI and its integration with cloud services such as Azure or GCP

Experience of working with SQL Server, SSIS (preferred)

Applied Business Acumen: Supports the development of business cases and recommendations. Owns delivery of project activity and tasks assigned by others. Supports process updates and changes. Solves business issues.

Data Transformation/Integration/Optimization:

The ETL developer is responsible for designing and creating the data warehouse and all related data extraction, transformation and load (ETL) functions in the company

The developer should provide oversight and planning of data models, database structural design and deployment, and work closely with the data architect and business analyst

Duties include working in cross-functional software development teams (business analysts, testers, developers) following agile ceremonies and development practices.

The developer plays a key role in contributing to the design, evaluation, selection, implementation and support of database solutions.

Development and Testing: Develops code for the required solution by determining the appropriate approach and leveraging business, technical, and data requirements.

Creates test cases to review and validate the proposed solution design. Work on POCs and deploy the software to production servers.

Good to Have (Preferred Skills):

  • Minimum 4-8 Years of experience in Data warehouse design and development for large scale application
  • Minimum 3 years of experience with star schema, dimensional modelling and extract transform load (ETL) Design and development
  • Expertise working with various databases (SQL Server, Oracle)
  • Experience developing Packages, Procedures, Views and triggers
  • Nice to have Big data technologies
  • The individual must have good written and oral communication skills.
  • Nice to have SSIS

 

Education and Experience

  • Minimum 4-8 years of software development experience
  • Bachelor's and/or Master’s degree in computer science

Please reply with the details below.

Total Experience:
Relevant Experience:

Current CTC:
Expected CTC:

Any offers: Y/N

Notice Period:

Qualification:

DOB:
Present Company Name:

Designation:

Domain

Reason for job change:

Current Location:

Rely

at Rely

1 video
3 recruiters
Hizam Ismail
Posted by Hizam Ismail
Bengaluru (Bangalore)
2 - 10 yrs
₹8L - ₹35L / yr
Python
Hadoop
Spark
Amazon Web Services (AWS)
Big Data
+2 more

Intro

Our data and risk team is the core pillar of our business that harnesses alternative data sources to guide the decisions we make at Rely. The team designs, architects, develops and maintains a scalable data platform that powers our machine learning models. Be part of a team that will help millions of consumers across Asia to be effortlessly in control of their spending and make better decisions.


What will you do
The data engineer is focused on making data correct and accessible, and building scalable systems to access/process it. Another major responsibility is helping AI/ML Engineers write better code.

• Optimize and automate ingestion processes for a variety of data sources such as: click stream, transactional and many other sources.

  • Create and maintain optimal data pipeline architecture and ETL processes
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Develop data pipeline and infrastructure to support real-time decisions
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS 'big data' technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs.


What will you need
• 2+ years of hands-on experience building and implementing large-scale production pipelines and data warehouses
• Experience dealing with large scale

  • Proficiency in writing and debugging complex SQLs
  • Experience working with AWS big data tools
  • Ability to lead projects and implement best data practices and technology

Data Pipelining

  • Strong command of building & optimizing data pipelines, architectures and data sets
  • Strong command of relational SQL & NoSQL databases, including Postgres
  • Data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.

Big Data: Strong experience in big data tools & applications

  • Tools: Hadoop, Spark, HDFS etc
  • AWS cloud services: EC2, EMR, RDS, Redshift
  • Stream-processing systems: Storm, Spark-Streaming, Flink etc.
  • Message queuing: RabbitMQ, Spark etc

Software Development & Debugging

  • Strong experience in object-oriented programming/object function scripting languages: Python, Java, C++, Scala, etc
  • Strong grasp of data structures & algorithms

What would be a bonus

  • Prior experience working in a fast-growth Startup
  • Prior experience in the payments, fraud, lending, advertising companies dealing with large scale data
NCR (Delhi | Gurgaon | Noida)
3 - 7 yrs
₹12L - ₹34L / yr
Machine Learning (ML)
Data Structures
Data engineering
Big Data
Neural networks
  • Experience with Big Data, neural networks (deep learning), and reinforcement learning
  • Ability to design machine learning systems
  • Research and implement appropriate ML algorithms and tools
  • Develop machine learning applications according to requirements
  • Select appropriate datasets and data representation methods
  • Run machine learning tests and experiments
  • Perform statistical analysis and fine-tuning using test results
  • Extend existing ML libraries and frameworks
  • Keep abreast of developments in the field
  • Understanding of data structures, data modeling and software architecture
  • Deep knowledge of math, probability, statistics and algorithms
  • Ability to write robust code in Python, Java and R
  • Familiarity with machine learning frameworks (like Keras or PyTorch) and libraries (like scikit-learn)
Mintifi

at Mintifi

3 recruiters
Suchita Upadhyay
Posted by Suchita Upadhyay
Mumbai
2 - 4 yrs
₹6L - ₹15L / yr
Big Data
Hadoop
MySQL
MongoDB
YARN
Job Title: Software Developer – Big Data

Responsibilities

We are looking for a Big Data Developer who can drive innovation, take ownership and deliver results.

  • Understand business requirements from stakeholders
  • Build & own Mintifi Big Data applications
  • Be heavily involved in every step of the product development process, from ideation to implementation to release.
  • Design and build systems with automated instrumentation and monitoring
  • Write unit & integration tests
  • Collaborate with cross-functional teams to validate and get feedback on the efficacy of results created by the big data applications. Use the feedback to improve the business logic
  • Take a proactive approach to turning ambiguous problem spaces into clear design solutions.

Qualifications

  • Hands-on programming skills in Apache Spark using Java or Scala
  • Good understanding of data structures and algorithms
  • Good understanding of relational and non-relational database concepts (MySQL, Hadoop, MongoDB)
  • Experience in Hadoop ecosystem components like YARN and ZooKeeper would be a strong plus
Get to hear about interesting companies hiring right now
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.