34+ PySpark Jobs in Mumbai | PySpark Job openings in Mumbai

Apply to 34+ PySpark Jobs in Mumbai on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

Searce Inc

at Searce Inc

3 recruiters
Posted by Jatin Gereja
Bengaluru (Bangalore), Mumbai, Pune
10 - 18 yrs
Best in industry
Google Cloud Platform (GCP)
Amazon Web Services (AWS)
Enterprise Data Warehouse (EDW)
Data modeling
Big Data
+9 more

Director - Data engineering


What are we looking for

real solver?

Solver? Absolutely. But not the usual kind. We're searching for the architects of the audacious & the pioneers of the possible. If you're the type to dismantle assumptions, re-engineer ‘best practices,’ and build solutions that make the future possible NOW, then you're speaking our language.


Your Responsibilities

what you will wake up to solve.

1. Delivery & Tactical Rigor

  • Methodology Implementation: Implement and manage a unified, 'DataOps-First' methodology for data engineering delivery (ETL/ELT pipelines, Data Modeling, MLOps, Data Governance) within assigned business units. This ensures predictable outcomes and trusted data integrity by reducing architecture variability at the project level.
  • Operational Stewardship: Drive initiatives to optimize team utilization and enhance operational efficiency within the practice. You manage the commercial success of your squads, ensuring data delivery models (from migration to modern data stack implementation) are executed profitably, scalably, and cost-effectively.
  • Execution & Technical Resolution
  • Technical Escalation: Serve as the primary escalation point for delivery issues, personally leading the resolution of complex data integration bottlenecks and pipeline failures to protect client timelines and data reliability standards.
  • Quality Enforcement
  • Quality Oversight: Execute and monitor technical data quality standards, ensuring engineering teams adhere to strict policies regarding data lineage, automated quality checks (observability), security/privacy compliance (GDPR/CCPA/PII), and active catalog management.

2. Strategic Growth & Practice Scaling

  • Talent & Scaling Execution: Execute the strategy for data engineering talent acquisition and development within your business units. Implement objective metrics to assess and grow the 'Data-Native' DNA of your teams, ensuring squads are consistently equipped to handle petabyte-scale environments and high-impact delivery.
  • Offerings Alignment: Drive the adoption of standardized regional offerings (e.g., Modern Data Platform, Data Mesh, Lakehouse Implementation). Ensure your teams leverage the profitable frameworks defined by the practice to accelerate time-to-insight and eliminate architectural fragmentation in client environments.
  • Innovation & IP Development: Lead the practical integration of Vector Databases and LLM-ready architectures into project delivery. Champion the hands-on development of IP and reusable accelerators (e.g., automated ingestion engines) that improve delivery speed and enhance data availability across your portfolio.

3. Leadership & Unit Management

  • Unit Leadership: Directly lead, mentor, and manage the Engineering Managers and Lead Architects within your business unit. Hold your teams accountable for project-level operational consistency, technical talent development, and strict adherence to the practice's data governance standards.
  • Stakeholder Communication: Clearly articulate the business unit’s operational performance, technical quality metrics, and delivery progress to C-suite stakeholders and regional client leadership, bridging the gap between technical execution and business value.
  • Ecosystem Alignment: Maintain strong technical relationships with key partner contacts (Snowflake, Databricks, AWS/GCP). Align team delivery capabilities with current product roadmaps and ensure squad-level participation in training, certifications, and partner-led enablement opportunities.


Welcome to Searce

The ‘process-first’, AI-native modern tech consultancy that's rewriting the rules.

We don’t do traditional.

As an engineering-led consultancy, we are dedicated to relentlessly improving real business outcomes. Our solvers co-innovate with clients to futurify operations and make processes smarter, faster & better.


Functional Skills

1. Delivery Management & Operational Excellence

  • Methodology Execution: Expert capability in implementing and enforcing a unified delivery methodology (DataOps, Agile, Mesh Principles) within specific business units. Proven track record of auditing squad-level adherence to ensure consistency across the project lifecycle.
  • Operational Performance: High proficiency in managing day-to-day operational metrics, including squad utilization, resource forecasting, and productivity tracking. Skilled at optimizing team performance to meet profitability and efficiency targets.
  • SOW & Risk Mitigation: Proven experience in operationalizing Statement of Work (SOW) requirements and identifying technical delivery risks early. Expert at mitigating scope creep and data-specific bottlenecks (e.g., latency, ingestion gaps) before they impact client outcomes.
  • Technical Escalation Leadership: Demonstrated ability to lead "war room" efforts to resolve complex pipeline failures or data integrity issues. Skilled at providing clear, rapid remediation plans and communicating technical status directly to regional stakeholders.

2. Architectural Implementation & Technical Oversight

  • Modern Stack Proficiency: Deep, hands-on expertise in implementing Cloud-Native architectures (Lakehouse, Data Mesh, MPP) on Snowflake, Databricks, or hyperscalers. Ability to conduct deep-dive architectural reviews and course-correct design decisions at the squad level to ensure scalability.
  • Operationalizing Governance: Proven experience in embedding data quality and observability (completeness, freshness, accuracy) directly into the CI/CD pipeline. Responsible for technical enforcement of regulatory compliance (GDPR/PII) and maintaining the integrity of data catalogs across active projects.
  • Applied Domain Expertise: Practical experience leading the delivery of high-growth solutions, specifically Generative AI infrastructure (RAG, Vector DBs), Real-Time Streaming, and large-scale platform migrations with a focus on zero-downtime execution.
  • DataOps & Engineering Standards: Expert-level mastery of DataOps, including the setup and management of orchestration frameworks (Airflow, Dagster) and Infrastructure as Code (IaC). You ensure that automation is a baseline requirement, not an afterthought, for all delivery teams.

3. Unit Management & Commercial Execution

  • Unit & Team Management: Proven success in leading and mentoring Engineering Managers and Lead Architects. Responsible for the operational metrics, technical output, and career development of the business unit's talent pool.
  • Offerings Implementation & Scoping: Expertise in translating service offerings (e.g., Data Maturity Assessments, Lakehouse Builds) into accurate project scopes, technical estimates, and resource plans to ensure delivery is both profitable and competitive.
  • Talent Growth & Mentorship: Functional ability to implement growth frameworks for data engineering roles. Focus on hands-on coaching and scaling high-performance technical talent to meet the demands of complex, petabyte-scale environments.
  • Partner Enablement: Functional competence in managing regional technical relationships with major partners (Snowflake, Databricks, GCP/AWS). Drives squad-level certifications, joint technical enablement, and alignment with partner product roadmaps.

Tech Superpowers

  • Modern Data Architect – Reimagines business with the Modern Data Stack (MDS) to deliver data mesh implementations, insights, & real value to clients.
  • End-to-End Ecosystem Thinker – Builds modular, reusable data products across ingestion, transformation (ETL/ELT), governance, and consumption layers.
  • Distributed Compute Savant – Crafts resilient, high-throughput architectures that survive petabyte-scale volume and data skew without breaking the bank.
  • Governance & Integrity Guardian – Embeds data quality, complete lineage, and privacy-by-design (GDPR/PII) into every table, view, and pipeline.
  • AI-Ready Orchestrator – Engineers pipelines that bridge structured data with Unstructured/Vector stores, powering RAG models and Generative AI workflows.
  • Product-Minded Strategist – Balances architectural purity with time-to-insight; treats every dataset as a measurable "Data Product" with clear ROI.
  • Pragmatic Stack Curator – Chooses the simplest tools that compound reliability; fluent in SQL, Python, Spark, dbt, and Cloud Warehouses.
  • Builder @ Heart – Writes, reviews, and optimizes queries daily; proves architectures with cost-performance benchmarks, not slideware. Business-first, data-second, outcome-focused technology leader.

Experience & Relevance

  • Executive Experience: 10+ years of progressive experience in data engineering and analytics, with at least 3 years in a Senior Manager or Director-level role managing multiple technical teams and owning significant operational and efficiency metrics for a large data service line.
  • Delivery Standardization: Demonstrated success in defining and implementing globally consistent, repeatable delivery methodologies (DataOps/Agile Data Warehousing) across diverse teams.
  • Architectural Depth: Must retain deep, current expertise in Modern Data Stack architectures (Lakehouse, MPP, Mesh) and maintain the ability to personally validate high-level architectural and data pipeline design decisions.
  • Operational Leadership: Proven expertise in managing and scaling large professional services organizations, with a demonstrated ability to optimize utilization, resource allocation, and operational expense.
  • Domain Expertise: Strong background in Enterprise Data Platforms, Applied AI/ML, Generative AI integration, or large-scale Cloud Data Migration.
  • Communication: Exceptional executive-level presentation and negotiation skills, particularly in communicating complex operational, data quality, and governance metrics to C-level stakeholders.

Join the ‘real solvers’

ready to futurify?

If you are excited by the possibilities of what an AI-native, engineering-led modern tech consultancy can do to futurify businesses, apply here and experience the ‘Art of the possible’. Don’t Just Send a Resume. Send a Statement.

Searce Inc

at Searce Inc

3 recruiters
Posted by Tejashree Kokare
Bengaluru (Bangalore), Pune, Mumbai
6 - 15 yrs
Best in industry
Google Cloud Platform (GCP)
Data engineering
Data warehouse architecture
Data architecture
Data modeling
+6 more

Solutions Architect - Data Engineering


Modern tech solutions advisory & 'futurify' consulting as a Searce lead FDS (‘forward deployed solver’), architecting scalable data platforms and robust data engineering solutions that power intelligent insights and fuel AI innovation.

If you’re a tech-savvy, consultative seller with the brain of a strategist, the heart of a builder, and the charisma of a storyteller — we’ve got a seat for you at the front of the table.

You're not a sales lead. You're the transformation driver.


What are we looking for

real solver?

Solver? Absolutely. But not the usual kind. We're searching for the architects of the audacious & the pioneers of the possible. If you're the type to dismantle assumptions, re-engineer ‘best practices,’ and build solutions that make the future possible NOW, then you're speaking our language.

  • Improver. Solver. Futurist.
  • Great sense of humor.
  • ‘Possible. It is.’ Mindset.
  • Compassionate collaborator. Bold experimenter. Tireless iterator.
  • Natural creativity that doesn’t just challenge the norm, but solves to design what’s better.
  • Thinks in systems. Solves at scale.


This Isn’t for Everyone. But if you’re the kind who questions why things are done a certain way— and then identifies 3 better ways to do it — we’d love to chat with you.


Your Responsibilities

what you will wake up to solve.


You are not just a Solutions Architect; you are a futurifier of our data universe and the primary enabler of our AI ambitions. With a deep-seated passion for data engineering, you will architect and build the foundational data infrastructure that powers the customer's entire data intelligence ecosystem.

As the Directly Responsible Individual (DRI) for our enterprise-grade data platforms, you own the outcome, end-to-end. You are the definitive solver for our customer's most complex data challenges, leveraging a powerful tech stack including Snowflake, Databricks, etc. and core GCP & AWS services (BigQuery, Spanner, Airflow, Kafka). This is a hands-on-keys role where you won't just design solutions—you'll build them, break them, and perfect them.


  • Solution Design & Pre-sales Excellence: Collaborate with cross-functional teams, including sales, engineering, and operations, to ensure successful project delivery.
  • Design Core Data Engineering: Master data modeling, architecting high-performance data ingestion pipelines and ensuring data quality and governance throughout the data lifecycle.
  • Enable Cloud & AI: Design and implement solutions utilizing core GCP data services, building foundational data platforms that efficiently support advanced analytics and AI/ML initiatives.
  • Optimize Performance & Cost: Continuously optimize data architectures and implementations for performance, efficiency, and cost-effectiveness within the cloud environment.
  • Bridge Business & Tech: Translate complex business requirements into clear technical designs, providing technical leadership and guidance to data engineering teams.
  • Stay Ahead of the Curve: Continuously research and evaluate new data technologies, architectural patterns, and industry trends to keep our data platforms at the cutting edge.


Functional Skills:


  • Enterprise Data Architecture Design: Expert ability to design holistic, scalable, and resilient data architectures for complex enterprise environments.
  • Cloud Data Platform Strategy: Proven capability to strategize, design, and implement cloud-native data platforms.
  • Pre-Sales & Technical Storyteller: Crafts compelling, client-ready proposals, architectural decks, and technical demonstrations. Doesn't just present; shapes the strategic technical narrative behind every proposed solution.
  • Advanced Data Modelling: Mastery in designing various data models for analytical, operational, and transactional use cases.
  • Data Ingestion & Pipeline Orchestration: Strong expertise in designing and optimizing robust data ingestion and transformation pipelines.
  • Stakeholder Communication: Exceptional skills in articulating complex technical concepts and architectural decisions to both technical and non-technical stakeholders.
  • Performance & Cost Optimization: Adept at optimizing data solutions for performance, efficiency, and cost within a cloud environment.


Tech Superpowers:


  • Cloud Data Mastery: You're a wizard at leveraging public cloud data services, with deep expertise in GCP (BigQuery, Spanner, etc.) and expert proficiency in modern data warehouse solutions like Snowflake.
  • Data Engineering Core: Highly skilled in designing, implementing, and managing data workflows using tools like Apache Airflow and Apache Kafka. You're also an authority on advanced data modeling and ETL/ELT patterns.
  • AI/ML Data Foundation: You instinctively design data pipelines and structures that efficiently feed and empower Machine Learning and Artificial Intelligence applications.
  • Programming for Data: You have a strong command over key programming languages (Python, SQL) for scripting, automation, and building data processing applications.


Experience & Relevance:


  • Architectural Leadership (8+ Years): You bring extensive experience (7+ years) specifically in a Solutions Architect role, focused on data engineering and platform building.
  • Cloud Data Expertise: You have a proven track record of designing and implementing production-grade data solutions leveraging major public cloud platforms, with significant experience in Google Cloud Platform (GCP).
  • Data Warehousing & Data Platform: Demonstrated hands-on experience in the end-to-end design, implementation, and optimization of modern data warehouses and comprehensive data platforms.
  • Databricks & BigQuery Mastery: You possess significant practical experience with Databricks as a core data warehouse and GCP BigQuery for analytical workloads.
  • Data Ingestion & Orchestration: Proven experience designing and implementing complex data ingestion pipelines and workflow orchestration using tools like Airflow and real-time streaming technologies like Kafka.
  • AI/ML Data Enablement: Experience in building data foundations specifically geared towards supporting Machine Learning and Artificial Intelligence initiatives.


Join the ‘real solvers’

ready to futurify?

If you are excited by the possibilities of what an AI-native, engineering-led modern tech consultancy can do to futurify businesses, apply here and experience the ‘Art of the possible’.


Don’t Just Send a Resume. Send a Statement.


So, if you are passionate about tech, the future, and what you read above (we really are!), apply here to experience the ‘Art of the possible’.

Auxo AI
Posted by kusuma Gullamajji
Bengaluru (Bangalore), Mumbai, Hyderabad, Gurugram
3 - 8 yrs
₹10L - ₹40L / yr
aws
PySpark
databricks
Python

Role Summary:

AuxoAI is seeking a skilled and experienced Data Engineer to join our dynamic team. The ideal candidate will have 3-8 years of prior experience in data engineering, with a strong background in working on modern data platforms. This role offers an exciting opportunity to work on diverse projects, collaborating with cross-functional teams to design, build, and optimize data pipelines and infrastructure.


Responsibilities:

• Design, develop, and maintain data pipelines using Databricks (PySpark / Spark SQL)

• Build and manage data pipelines across Bronze, Silver, and Gold layers using Delta Lake

• Implement ETL/ELT workflows for batch and near real-time processing

• Work with Databricks Workflows for orchestration and job scheduling

• Leverage Unity Catalog for data governance, access control, and metadata management

• Optimize Spark jobs, cluster configurations, and cost efficiency

• Collaborate with business and analytics teams to translate requirements into scalable data models

• Integrate data from multiple sources (APIs, databases, cloud storage)

• Ensure data quality, validation, and observability across pipelines

• Troubleshoot and debug data pipeline issues, providing timely resolution and proactive monitoring


Qualifications:

• Bachelor’s degree in Computer Science, Engineering, or a related field.

• Overall 3+ years of prior experience in data engineering, with a focus on designing and building data pipelines

• Hands-on experience with Databricks platform and ecosystem

• Strong proficiency in Python (PySpark) and SQL

• Experience working with Delta Lake (ACID transactions, time travel, schema evolution)

• Good understanding of data warehousing concepts and dimensional modeling

• Familiarity with Unity Catalog (data governance, RBAC, lineage basics)

• Understanding of Spark performance tuning and optimization techniques

• Experience with cloud platforms (AWS / Azure / GCP)

• Working knowledge of Git and CI/CD practices

• Familiarity with implementing CI/CD processes or other orchestration tools is a plus.

AI Industry

Agency job
via Peak Hire Solutions by Dhara Thakkar
Mumbai, Bengaluru (Bangalore), Hyderabad, Gurugram
6 - 10 yrs
₹32L - ₹42L / yr
ETL
SQL
Google Cloud Platform (GCP)
Data engineering
ELT
+17 more

Role & Responsibilities:

We are looking for a strong Data Engineer to join our growing team. The ideal candidate brings solid ETL fundamentals, hands-on pipeline experience, and cloud platform proficiency — with a preference for GCP / BigQuery expertise.


Responsibilities:

  • Design, build, and maintain scalable data pipelines and ETL/ELT workflows
  • Work with Dataform or dbt to implement transformation logic and data models
  • Develop and optimize data solutions on GCP (BigQuery, GCS) or AWS/Azure
  • Support data migration initiatives and data mesh architecture patterns
  • Collaborate with analysts, scientists, and business stakeholders to deliver reliable data products
  • Apply data governance and quality best practices across the data lifecycle
  • Troubleshoot pipeline issues and drive proactive monitoring and resolution


Ideal Candidate:

  • Strong Data Engineer Profile
  • Must have 6+ years of hands-on experience in Data Engineering, with strong ownership of end-to-end data pipeline development.
  • Must have strong experience in ETL/ELT pipeline design, transformation logic, and data workflow orchestration.
  • Must have hands-on experience with any one of the following: Dataform, dbt, or BigQuery, with practical exposure to data transformation, modeling, or cloud data warehousing.
  • Must have working experience on any cloud platform: GCP (preferred), AWS, or Azure, including object storage (GCS, S3, ADLS).
  • Must have strong SQL skills with experience in writing complex queries and optimizing performance.
  • Must have programming experience in Python and/or SQL for data processing.
  • Must have experience in building and maintaining scalable data pipelines and troubleshooting data issues.
  • Exposure to data migration projects and/or data mesh architecture concepts.
  • Experience with Spark / PySpark or large-scale data processing frameworks.
  • Experience working in product-based companies or data-driven environments.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.


NOTE:

  • An interview drive is scheduled for 28th and 29th March 2026; shortlisted candidates will be expected to be available on these dates. Only immediate joiners will be considered.
Wissen Technology

at Wissen Technology

4 recruiters
Posted by Janane Mohanasankaran
Mumbai, Pune
3 - 6 yrs
Best in industry
Python
PySpark
pandas
SQL
ADF
+2 more

* Python (3 to 6 years): Strong expertise in data workflows and automation

* Spark (PySpark): Hands-on experience with large-scale data processing

* Pandas: For detailed data analysis and validation

* Delta Lake: Managing structured and semi-structured datasets at scale

* SQL: Querying and performing operations on Delta tables

* Azure Cloud: Compute and storage services

* Orchestrator: Good experience with either ADF or Airflow

Ganit Business Solutions

at Ganit Business Solutions

3 recruiters
Agency job
via hirezyai by HR Hirezyai
Bengaluru (Bangalore), Chennai, Mumbai
5.5 - 12 yrs
₹15L - ₹25L / yr
Amazon Web Services (AWS)
PySpark
SQL

Roles & Responsibilities

  • Data Engineering Excellence: Design and implement data pipelines using formats like JSON, Parquet, CSV, and ORC, utilizing batch and streaming ingestion.
  • Cloud Data Migration Leadership: Lead cloud migration projects, developing scalable Spark pipelines.
  • Medallion Architecture: Implement Bronze, Silver, and Gold tables for scalable data systems.
  • Spark Code Optimization: Optimize Spark code to ensure efficient cloud migration.
  • Data Modeling: Develop and maintain data models with strong governance practices.
  • Data Cataloging & Quality: Implement cataloging strategies with Unity Catalog to maintain high-quality data.
  • Delta Live Table Leadership: Lead the design and implementation of Delta Live Tables (DLT) pipelines for secure, tamper-resistant data management.
  • Customer Collaboration: Collaborate with clients to optimize cloud migrations and ensure best practices in design and governance.

Educational Qualifications

  • Experience: Minimum 5 years of hands-on experience in data engineering, with a proven track record in complex pipeline development and cloud-based data migration projects.
  • Education: Bachelor’s or higher degree in Computer Science, Data Engineering, or a related field.
  • Skills
  • Must-have: Proficiency in Spark, SQL, Python, and other relevant data processing technologies. Strong knowledge of Databricks and its components, including Delta Live Tables (DLT) pipeline implementations. Expertise in on-premises to cloud Spark code optimization and Medallion Architecture.

Good to Have

  • Familiarity with AWS services (experience with additional cloud platforms like GCP or Azure is a plus).

Soft Skills

  • Excellent communication and collaboration skills, with the ability to work effectively with clients and internal teams.

Certifications

  • AWS/GCP/Azure Data Engineer Certification.


AI-First Company

Agency job
via Peak Hire Solutions by Dhara Thakkar
Bengaluru (Bangalore), Mumbai, Hyderabad, Gurugram
5 - 17 yrs
₹30L - ₹45L / yr
Data engineering
Data architecture
SQL
Data modeling
GCS
+47 more

ROLES AND RESPONSIBILITIES:

You will be responsible for architecting, implementing, and optimizing Dremio-based data Lakehouse environments integrated with cloud storage, BI, and data engineering ecosystems. The role requires a strong balance of architecture design, data modeling, query optimization, and governance enablement in large-scale analytical environments.


  • Design and implement Dremio lakehouse architecture on cloud (AWS/Azure/Snowflake/Databricks ecosystem).
  • Define data ingestion, curation, and semantic modeling strategies to support analytics and AI workloads.
  • Optimize Dremio reflections, caching, and query performance for diverse data consumption patterns.
  • Collaborate with data engineering teams to integrate data sources via APIs, JDBC, Delta/Parquet, and object storage layers (S3/ADLS).
  • Establish best practices for data security, lineage, and access control aligned with enterprise governance policies.
  • Support self-service analytics by enabling governed data products and semantic layers.
  • Develop reusable design patterns, documentation, and standards for Dremio deployment, monitoring, and scaling.
  • Work closely with BI and data science teams to ensure fast, reliable, and well-modeled access to enterprise data.


IDEAL CANDIDATE:

  • Bachelor’s or Master’s in Computer Science, Information Systems, or related field.
  • 5+ years in data architecture and engineering, with 3+ years in Dremio or modern lakehouse platforms.
  • Strong expertise in SQL optimization, data modeling, and performance tuning within Dremio or similar query engines (Presto, Trino, Athena).
  • Hands-on experience with cloud storage (S3, ADLS, GCS), Parquet/Delta/Iceberg formats, and distributed query planning.
  • Knowledge of data integration tools and pipelines (Airflow, DBT, Kafka, Spark, etc.).
  • Familiarity with enterprise data governance, metadata management, and role-based access control (RBAC).
  • Excellent problem-solving, documentation, and stakeholder communication skills.


PREFERRED:

  • Experience integrating Dremio with BI tools (Tableau, Power BI, Looker) and data catalogs (Collibra, Alation, Purview).
  • Exposure to Snowflake, Databricks, or BigQuery environments.
  • Experience in high-tech, manufacturing, or enterprise data modernization programs.
One of the reputed Client in India

Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Noida, Hyderabad, Pune
6 - 8 yrs
₹12L - ₹13L / yr
Amazon Web Services (AWS)
Python
PySpark

Our client is looking to hire a Databricks Admin immediately.


This is pan-India bulk hiring.


Minimum of 6-8 years of experience with Databricks, PySpark/Python, and AWS.

AWS experience is a must.


A notice period of 15-30 days is preferred.


Share profiles at hr at etpspl dot com

Please refer/share our email with friends or colleagues who are looking for a job.

Wissen Technology

at Wissen Technology

4 recruiters
Posted by Gagandeep Kaur
Bengaluru (Bangalore), Mumbai, Pune
4 - 7 yrs
Best in industry
Python
PySpark
pandas
Airflow
Data engineering

Wissen Technology is hiring for Data Engineer

About Wissen Technology: At Wissen Technology, we deliver niche, custom-built products that solve complex business challenges across industries worldwide. Founded in 2015, our core philosophy is built around a strong product engineering mindset—ensuring every solution is architected and delivered right the first time. Today, Wissen Technology has a global footprint with 2000+ employees across offices in the US, UK, UAE, India, and Australia. Our commitment to excellence translates into delivering 2X impact compared to traditional service providers. How do we achieve this? Through a combination of deep domain knowledge, cutting-edge technology expertise, and a relentless focus on quality. We don’t just meet expectations—we exceed them by ensuring faster time-to-market, reduced rework, and greater alignment with client objectives. We have a proven track record of building mission-critical systems across industries, including financial services, healthcare, retail, manufacturing, and more. Wissen stands apart through its unique delivery models. Our outcome-based projects ensure predictable costs and timelines, while our agile pods provide clients the flexibility to adapt to their evolving business needs. Wissen leverages its thought leadership and technology prowess to drive superior business outcomes. Our success is powered by top-tier talent. Our mission is clear: to be the partner of choice for building world-class custom products that deliver exceptional impact—the first time, every time.

Job Summary: Wissen Technology is hiring a Data Engineer with expertise in Python, Pandas, Airflow, and Azure Cloud Services. The ideal candidate will have strong communication skills and experience with Kubernetes.

Experience: 4-7 years

Notice Period: Immediate to 15 days

Location: Pune, Mumbai, Bangalore

Mode of Work: Hybrid

Key Responsibilities:

  • Develop and maintain data pipelines using Python and Pandas.
  • Implement and manage workflows using Airflow.
  • Utilize Azure Cloud Services for data storage and processing.
  • Collaborate with cross-functional teams to understand data requirements and deliver solutions.
  • Ensure data quality and integrity throughout the data lifecycle.
  • Optimize and scale data infrastructure to meet business needs.

Qualifications and Required Skills:

  • Proficiency in Python (Must Have).
  • Strong experience with Pandas (Must Have).
  • Expertise in Airflow (Must Have).
  • Experience with Azure Cloud Services.
  • Good communication skills.

Good to Have Skills:

  • Experience with PySpark.
  • Knowledge of Kubernetes.



Wissen Technology

at Wissen Technology

4 recruiters
Posted by Bipasha Rath
Mumbai, Bengaluru (Bangalore), Pune
3 - 7 yrs
Best in industry
Python
pandas
PySpark

Experience: 3–7 Years

Locations: Pune / Bangalore / Mumbai

Notice Period: Immediate joiners only


Employment Type: Full-time

🛠️ Key Skills (Mandatory):

  • Python: Strong coding skills for data manipulation and automation.
  • PySpark: Experience with distributed data processing using Spark.
  • SQL: Proficient in writing complex queries for data extraction and transformation.
  • Azure Databricks: Hands-on experience with notebooks, Delta Lake, and MLflow


Interested candidates please share resume with details below.


Total Experience -

Relevant Experience in Python, PySpark, SQL, Azure Databricks -

Current CTC -

Expected CTC -

Notice period -

Current Location -

Desired Location -


Wissen Technology

at Wissen Technology

4 recruiters
Janane Mohanasankaran
Posted by Janane Mohanasankaran
Bengaluru (Bangalore), Pune, Mumbai
7 - 12 yrs
Best in industry
Python
pandas
PySpark
SQL
Data engineering

Wissen Technology is hiring for Data Engineer


Job Summary: Wissen Technology is hiring a Data Engineer with a strong background in Python, data engineering, and workflow optimization. The ideal candidate will have experience with Delta Tables and Parquet, and be proficient in Pandas and PySpark.

Experience: 7+ years

Location: Pune, Mumbai, Bangalore

Mode of Work: Hybrid

Key Responsibilities:

  • Develop and maintain data pipelines using Python (Pandas, PySpark).
  • Optimize data workflows and ensure efficient data processing.
  • Work with Delta Tables and Parquet for data storage and management.
  • Collaborate with cross-functional teams to understand data requirements and deliver solutions.
  • Ensure data quality and integrity throughout the data lifecycle.
  • Implement best practices for data engineering and workflow optimization.

Qualifications and Required Skills:

  • Proficiency in Python, specifically with Pandas and PySpark.
  • Strong experience in data engineering and workflow optimization.
  • Knowledge of Delta Tables and Parquet.
  • Excellent problem-solving skills and attention to detail.
  • Ability to work collaboratively in a team environment.
  • Strong communication skills.

Good to Have Skills:

  • Experience with Databricks.
  • Knowledge of Apache Spark, DBT, and Airflow.
  • Advanced Pandas optimizations.
  • Familiarity with PyTest/DBT testing frameworks.

Wissen Sites:

 

Wissen | Driving Digital Transformation

A technology consultancy that drives digital innovation by connecting strategy and execution, helping global clients to strengthen their core technology.

 

Wissen Technology

at Wissen Technology

4 recruiters
Annie Varghese
Posted by Annie Varghese
Pune, Mumbai, Bengaluru (Bangalore)
3 - 8 yrs
Best in industry
snowflake
Apache Airflow
ETL
Python
PySpark
+1 more

Job Summary:

We are looking for a highly skilled and experienced Data Engineer with deep expertise in Airflow, dbt, Python, and Snowflake. The ideal candidate will be responsible for designing, building, and managing scalable data pipelines and transformation frameworks to enable robust data workflows across the organization.

Key Responsibilities:

  • Design and implement scalable ETL/ELT pipelines using Apache Airflow for orchestration.
  • Develop modular and maintainable data transformation models using dbt.
  • Write high-performance data processing scripts and automation using Python.
  • Build and maintain data models and pipelines on Snowflake.
  • Collaborate with data analysts, data scientists, and business teams to deliver clean, reliable, and timely data.
  • Monitor and optimize pipeline performance and troubleshoot issues proactively.
  • Follow best practices in version control, testing, and CI/CD for data projects.

Must-Have Skills:

  • Strong hands-on experience with Apache Airflow for scheduling and orchestrating data workflows.
  • Proficiency in dbt (data build tool) for building scalable and testable data models.
  • Expert-level skills in Python for data processing and automation.
  • Solid experience with Snowflake, including SQL performance tuning, data modeling, and warehouse management.
  • Strong understanding of data engineering best practices including modularity, testing, and deployment.

Good to Have:

  • Experience working with cloud platforms (AWS/GCP/Azure).
  • Familiarity with CI/CD pipelines for data (e.g., GitHub Actions, GitLab CI).
  • Exposure to modern data stack tools (e.g., Fivetran, Stitch, Looker).
  • Knowledge of data security and governance best practices.


Note: One face-to-face (F2F) interview round is mandatory, and you will need to visit the office for it.

VyTCDC
Gobinath Sundaram
Posted by Gobinath Sundaram
Chennai, Bengaluru (Bangalore), Hyderabad, Mumbai, Pune, Noida
4 - 6 yrs
₹3L - ₹21L / yr
AWS Data Engineer
Amazon Web Services (AWS)
Python
PySpark
databricks
+1 more

 Key Responsibilities

  • Design and implement ETL/ELT pipelines using Databricks, PySpark, and AWS Glue
  • Develop and maintain scalable data architectures on AWS (S3, EMR, Lambda, Redshift, RDS)
  • Perform data wrangling, cleansing, and transformation using Python and SQL
  • Collaborate with data scientists to integrate Generative AI models into analytics workflows
  • Build dashboards and reports to visualize insights using tools like Power BI or Tableau
  • Ensure data quality, governance, and security across all data assets
  • Optimize performance of data pipelines and troubleshoot bottlenecks
  • Work closely with stakeholders to understand data requirements and deliver actionable insights

🧪 Required Skills

Skill Area: Tools & Technologies

  • Cloud Platforms: AWS (S3, Lambda, Glue, EMR, Redshift)
  • Big Data: Databricks, Apache Spark, PySpark
  • Programming: Python, SQL
  • Data Engineering: ETL/ELT, Data Lakes, Data Warehousing
  • Analytics: Data Modeling, Visualization, BI Reporting
  • Gen AI Integration: OpenAI, Hugging Face, LangChain (preferred)
  • DevOps (Bonus): Git, Jenkins, Terraform, Docker

📚 Qualifications

  • Bachelor's or Master’s degree in Computer Science, Data Science, or related field
  • 3+ years of experience in data engineering or data analytics
  • Hands-on experience with Databricks, PySpark, and AWS
  • Familiarity with Generative AI tools and frameworks is a strong plus
  • Strong problem-solving and communication skills

🌟 Preferred Traits

  • Analytical mindset with attention to detail
  • Passion for data and emerging technologies
  • Ability to work independently and in cross-functional teams
  • Eagerness to learn and adapt in a fast-paced environment


Wissen Technology

at Wissen Technology

4 recruiters
Praffull Shinde
Posted by Praffull Shinde
Pune, Mumbai, Bengaluru (Bangalore)
4 - 8 yrs
₹14L - ₹26L / yr
Python
PySpark
Django
Flask
RESTful APIs
+3 more

Job title - Python developer

Experience – 4 to 6 years

Location – Pune/Mumbai/Bengaluru

Please find the job description below.

Requirements:

  • Proven experience as a Python Developer
  • Strong knowledge of core Python and PySpark concepts
  • Experience with web frameworks such as Django or Flask
  • Good exposure to any cloud platform (GCP Preferred)
  • CI/CD exposure required
  • Solid understanding of RESTful APIs and how to build them
  • Experience working with databases like Oracle DB and MySQL
  • Ability to write efficient SQL queries and optimize database performance
  • Strong problem-solving skills and attention to detail
  • Strong SQL programming (stored procedures, functions)
  • Excellent communication and interpersonal skills

Roles and Responsibilities

  • Design, develop, and maintain data pipelines and ETL processes using PySpark.
  • Work closely with data scientists and analysts to provide them with clean, structured data.
  • Optimize data storage and retrieval for performance and scalability.
  • Collaborate with cross-functional teams to gather data requirements.
  • Ensure data quality and integrity through data validation and cleansing processes.
  • Monitor and troubleshoot data-related issues to ensure data pipeline reliability.
  • Stay up to date with industry best practices and emerging technologies in data engineering.
Wissen Technology

at Wissen Technology

4 recruiters
Rutuja Patil
Posted by Rutuja Patil
Mumbai
4 - 10 yrs
Best in industry
Java
J2EE
Hibernate (Java)
Spring Boot
Spring MVC
+2 more

Company Name – Wissen Technology

Group of companies in India – Wissen Technology & Wissen Infotech

Role - Senior Backend Developer, Java (with Python Exposure)

Work Location - Mumbai


Experience - 4 to 10 years


Please reply by email if you are interested.


Java Developer – Job Description


We are seeking a Senior Backend Developer with strong expertise in Java (Spring Boot) and working knowledge of Python. In this role, Java will be your primary development language, with Python used for scripting, automation, or selected service modules. You’ll be part of a collaborative backend team building scalable and high-performance systems.


Key Responsibilities


  • Design and develop robust backend services and APIs primarily using Java (Spring Boot)
  • Contribute to Python-based components where needed for automation, scripting, or lightweight services
  • Build, integrate, and optimize RESTful APIs and microservices
  • Work with relational and NoSQL databases
  • Write unit and integration tests (JUnit, PyTest)
  • Collaborate closely with DevOps, QA, and product teams
  • Participate in architecture reviews and design discussions
  • Help maintain code quality, organization, and automation


Required Skills & Qualifications

  • 4 to 10 years of hands-on Java development experience
  • Strong experience with Spring Boot, JPA/Hibernate, and REST APIs
  • At least 1–2 years of hands-on experience with Python (e.g., for scripting, automation, or small services)
  • Familiarity with Python frameworks like Flask or FastAPI is a plus
  • Experience with SQL/NoSQL databases (e.g., PostgreSQL, MongoDB)
  • Good understanding of OOP, design patterns, and software engineering best practices
  • Familiarity with Docker, Git, and CI/CD pipelines


Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Mumbai, Pune, Chennai, Gurugram
5.6 - 7 yrs
₹10L - ₹28L / yr
Amazon Web Services (AWS)
Python
PySpark
SQL

Job Summary:

As an AWS Data Engineer, you will be responsible for designing, developing, and maintaining scalable, high-performance data pipelines using AWS services. With 6+ years of experience, you’ll collaborate closely with data architects, analysts, and business stakeholders to build reliable, secure, and cost-efficient data infrastructure across the organization.

Key Responsibilities:

  • Design, develop, and manage scalable data pipelines using AWS Glue, Lambda, and other serverless technologies
  • Implement ETL workflows and transformation logic using PySpark and Python on AWS Glue
  • Leverage AWS Redshift for warehousing, performance tuning, and large-scale data queries
  • Work with AWS DMS and RDS for database integration and migration
  • Optimize data flows and system performance for speed and cost-effectiveness
  • Deploy and manage infrastructure using AWS CloudFormation templates
  • Collaborate with cross-functional teams to gather requirements and build robust data solutions
  • Ensure data integrity, quality, and security across all systems and processes

Required Skills & Experience:

  • 6+ years of experience in Data Engineering with strong AWS expertise
  • Proficient in Python and PySpark for data processing and ETL development
  • Hands-on experience with AWS Glue, Lambda, DMS, RDS, and Redshift
  • Strong SQL skills for building complex queries and performing data analysis
  • Familiarity with AWS CloudFormation and infrastructure as code principles
  • Good understanding of serverless architecture and cost-optimized design
  • Ability to write clean, modular, and maintainable code
  • Strong analytical thinking and problem-solving skills


Wissen Technology

at Wissen Technology

4 recruiters
Vishakha Walunj
Posted by Vishakha Walunj
Bengaluru (Bangalore), Pune, Mumbai
7 - 12 yrs
Best in industry
PySpark
databricks
SQL
Python

Required Skills:

  • Hands-on experience with Databricks, PySpark
  • Proficiency in SQL, Python, and Spark.
  • Understanding of data warehousing concepts and data modeling.
  • Experience with CI/CD pipelines and version control (e.g., Git).
  • Fundamental knowledge of any cloud services, preferably Azure or GCP.


Good to Have:

  • Bigquery
  • Experience with performance tuning and data governance.


Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Pune, Bengaluru (Bangalore), Gurugram, Chennai, Mumbai
5 - 7 yrs
₹6L - ₹20L / yr
Amazon Web Services (AWS)
Amazon Redshift
AWS Glue
Python
PySpark

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Bengaluru (Bangalore), Pune, Mumbai, Chennai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
Python
PySpark
Amazon Web Services (AWS)
aws
Amazon Redshift
+1 more

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Deqode

at Deqode

1 recruiter
Mokshada Solanki
Posted by Mokshada Solanki
Bengaluru (Bangalore), Mumbai, Pune, Gurugram
4 - 5 yrs
₹4L - ₹20L / yr
SQL
Amazon Web Services (AWS)
Migration
PySpark
ETL

Job Summary:

Seeking a seasoned SQL + ETL Developer with 4+ years of experience in managing large-scale datasets and cloud-based data pipelines. The ideal candidate is hands-on with MySQL, PySpark, AWS Glue, and ETL workflows, with proven expertise in AWS migration and performance optimization.


Key Responsibilities:

  • Develop and optimize complex SQL queries and stored procedures to handle large datasets (100+ million records).
  • Build and maintain scalable ETL pipelines using AWS Glue and PySpark.
  • Work on data migration tasks in AWS environments.
  • Monitor and improve database performance; automate key performance indicators and reports.
  • Collaborate with cross-functional teams to support data integration and delivery requirements.
  • Write shell scripts for automation and manage ETL jobs efficiently.


Required Skills:

  • Strong experience with MySQL, complex SQL queries, and stored procedures.
  • Hands-on experience with AWS Glue, PySpark, and ETL processes.
  • Good understanding of AWS ecosystem and migration strategies.
  • Proficiency in shell scripting.
  • Strong communication and collaboration skills.


Nice to Have:

  • Working knowledge of Python.
  • Experience with AWS RDS.



Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Bengaluru (Bangalore), Pune, Chennai, Mumbai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
Amazon Web Services (AWS)
Python
PySpark
SQL
redshift

Profile: AWS Data Engineer

Mode - Hybrid

Experience - 5 to 7 years

Locations - Bengaluru, Pune, Chennai, Mumbai, Gurugram


Roles and Responsibilities

  • Design and maintain ETL pipelines using AWS Glue and Python/PySpark
  • Optimize SQL queries for Redshift and Athena
  • Develop Lambda functions for serverless data processing
  • Configure AWS DMS for database migration and replication
  • Implement infrastructure as code with CloudFormation
  • Build optimized data models for performance
  • Manage RDS databases and AWS service integrations
  • Troubleshoot and improve data processing efficiency
  • Gather requirements from business stakeholders
  • Implement data quality checks and validation
  • Document data pipelines and architecture
  • Monitor workflows and implement alerting
  • Keep current with AWS services and best practices


Required Technical Expertise:

  • Python/PySpark for data processing
  • AWS Glue for ETL operations
  • Redshift and Athena for data querying
  • AWS Lambda and serverless architecture
  • AWS DMS and RDS management
  • CloudFormation for infrastructure
  • SQL optimization and performance tuning
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Pune, Mumbai, Bengaluru (Bangalore), Chennai
4 - 7 yrs
₹5L - ₹15L / yr
Amazon Web Services (AWS)
Python
PySpark
Glue semantics
Amazon Redshift
+1 more

Job Overview:

We are seeking an experienced AWS Data Engineer to join our growing data team. The ideal candidate will have hands-on experience with AWS Glue, Redshift, PySpark, and other AWS services to build robust, scalable data pipelines. This role is perfect for someone passionate about data engineering, automation, and cloud-native development.

Key Responsibilities:

  • Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
  • Integrate data from diverse sources and ensure its quality, consistency, and reliability.
  • Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
  • Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
  • Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
  • Automate data validation, transformation, and loading processes to support real-time and batch data processing.
  • Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.

Required Skills:

  • 5 to 7 years of hands-on experience in data engineering roles.
  • Strong proficiency in Python and PySpark for data transformation and scripting.
  • Deep understanding and practical experience with AWS Glue, AWS Redshift, S3, and other AWS data services.
  • Solid understanding of SQL and database optimization techniques.
  • Experience working with large-scale data pipelines and high-volume data environments.
  • Good knowledge of data modeling, warehousing, and performance tuning.

Preferred/Good to Have:

  • Experience with workflow orchestration tools like Airflow or Step Functions.
  • Familiarity with CI/CD for data pipelines.
  • Knowledge of data governance and security best practices on AWS.
Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Pune, Mumbai, Bengaluru (Bangalore), Gurugram
4 - 6 yrs
₹5L - ₹10L / yr
ETL
SQL
Amazon Web Services (AWS)
PySpark
KPI

Role - ETL Developer

Work Mode - Hybrid

Experience - 4+ years

Location - Pune, Gurgaon, Bengaluru, Mumbai

Required Skills - AWS, AWS Glue, PySpark, ETL, SQL

Required Skills:

  • 4+ years of hands-on experience in MySQL, including SQL queries and procedure development
  • Experience in PySpark, AWS, and AWS Glue
  • Experience in AWS migration
  • Experience with automated scripting and tracking KPIs/metrics for database performance
  • Proficiency in shell scripting and ETL.
  • Strong communication skills and a collaborative team player
  • Knowledge of Python and AWS RDS is a plus


Wissen Technology

at Wissen Technology

4 recruiters
Hanisha Pralayakaveri
Posted by Hanisha Pralayakaveri
Bengaluru (Bangalore), Mumbai
5 - 9 yrs
Best in industry
Python
Amazon Web Services (AWS)
PySpark
Data engineering

Job Description: Data Engineer 

Position Overview:

Role Overview

We are seeking a skilled Python Data Engineer with expertise in designing and implementing data solutions using the AWS cloud platform. The ideal candidate will be responsible for building and maintaining scalable, efficient, and secure data pipelines while leveraging Python and AWS services to enable robust data analytics and decision-making processes.

 

Key Responsibilities

· Design, develop, and optimize data pipelines using Python and AWS services such as Glue, Lambda, S3, EMR, Redshift, Athena, and Kinesis.

· Implement ETL/ELT processes to extract, transform, and load data from various sources into centralized repositories (e.g., data lakes or data warehouses).

· Collaborate with cross-functional teams to understand business requirements and translate them into scalable data solutions.

· Monitor, troubleshoot, and enhance data workflows for performance and cost optimization.

· Ensure data quality and consistency by implementing validation and governance practices.

· Work on data security best practices in compliance with organizational policies and regulations.

· Automate repetitive data engineering tasks using Python scripts and frameworks.

· Leverage CI/CD pipelines for deployment of data workflows on AWS.

ZeMoSo Technologies

at ZeMoSo Technologies

11 recruiters
Agency job
via TIGI HR Solution Pvt. Ltd. by Vaidehi Sarkar
Mumbai, Bengaluru (Bangalore), Hyderabad, Chennai, Pune
4 - 8 yrs
₹10L - ₹15L / yr
Data engineering
Python
SQL
Data Warehouse (DWH)
Amazon Web Services (AWS)
+3 more

Work Mode: Hybrid


Need B.Tech, BE, M.Tech, ME candidates - Mandatory



Must-Have Skills:

● Educational Qualification :- B.Tech, BE, M.Tech, ME in any field.

● Minimum of 3 years of proven experience as a Data Engineer.

● Strong proficiency in Python programming language and SQL.

● Experience with Databricks and in setting up and managing data pipelines and data warehouses/lakes.

● Good comprehension and critical thinking skills.


● Kindly note that the salary bracket will vary according to the candidate's experience:

- Experience from 4 yrs to 6 yrs - Salary up to 22 LPA

- Experience from 5 yrs to 8 yrs - Salary up to 30 LPA

- Experience more than 8 yrs - Salary up to 40 LPA

Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Mumbai, Pune, Hyderabad, Indore, Jaipur, Kolkata
4 - 5 yrs
₹2L - ₹18L / yr
Python
PySpark

We are looking for skilled and passionate Data Engineers with a strong foundation in Python programming and hands-on experience working with APIs, AWS cloud, and modern development practices. The ideal candidate will have a keen interest in building scalable backend systems and working with big data tools like PySpark.

Key Responsibilities:

  • Write clean, scalable, and efficient Python code.
  • Work with Python frameworks such as PySpark for data processing.
  • Design, develop, update, and maintain APIs (RESTful).
  • Deploy and manage code using GitHub CI/CD pipelines.
  • Collaborate with cross-functional teams to define, design, and ship new features.
  • Work on AWS cloud services for application deployment and infrastructure.
  • Basic database design and interaction with MySQL or DynamoDB.
  • Debugging and troubleshooting application issues and performance bottlenecks.

Required Skills & Qualifications:

  • 4+ years of hands-on experience with Python development.
  • Proficient in Python basics with a strong problem-solving approach.
  • Experience with AWS Cloud services (EC2, Lambda, S3, etc.).
  • Good understanding of API development and integration.
  • Knowledge of GitHub and CI/CD workflows.
  • Experience in working with PySpark or similar big data frameworks.
  • Basic knowledge of MySQL or DynamoDB.
  • Excellent communication skills and a team-oriented mindset.

Nice to Have:

  • Experience in containerization (Docker/Kubernetes).
  • Familiarity with Agile/Scrum methodologies.


IT Service company

IT Service company

Agency job
via Vinprotoday by Vikas Gaur
Mumbai
4 - 10 yrs
₹8L - ₹30L / yr
Google Cloud Platform (GCP)
Workflow
TensorFlow
Deployment management
PySpark
+1 more

Key Responsibilities:

Design, develop, and optimize scalable data pipelines and ETL processes.

Work with large datasets using GCP services like BigQuery, Dataflow, and Cloud Storage.

Implement real-time data streaming and processing solutions using Pub/Sub and Dataproc.

Collaborate with cross-functional teams to ensure data quality and governance.


Technical Requirements:

4+ years of experience in Data Engineering.

Strong expertise in GCP services like Workflow, TensorFlow, Dataproc, and Cloud Storage.

Proficiency in SQL and programming languages such as Python or Java.

Experience in designing and implementing data pipelines and working with real-time data processing.

Familiarity with CI/CD pipelines and cloud security best practices.

ProtoGene Consulting Private Limited
Mumbai
3 - 8 yrs
₹7L - ₹18L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more

Data Engineer + Integration Engineer + Support Specialist

Exp – 5-8 years

Necessary Skills:

· SQL & Python / PySpark

· AWS Services: Glue, Appflow, Redshift

· Data warehousing

· Data modelling

Job Description:

· Experience implementing and delivering data solutions and pipelines on the AWS Cloud Platform

· Design, implement, and maintain the data architecture for all AWS data services

· A strong understanding of data modelling, data structures, databases (Redshift), and ETL processes

· Work with stakeholders to identify business needs and requirements for data-related projects

· Strong SQL and/or Python or PySpark knowledge

· Creating data models that can be used to extract information from various sources & store it in a usable format

· Optimize data models for performance and efficiency

· Write SQL queries to support data analysis and reporting

· Monitor and troubleshoot data pipelines

· Collaborate with software engineers to design and implement data-driven features

· Perform root cause analysis on data issues

· Maintain documentation of the data architecture and ETL processes

· Identifying opportunities to improve performance by improving database structure or indexing methods

· Maintaining existing applications by updating existing code or adding new features to meet new requirements

· Designing and implementing security measures to protect data from unauthorized access or misuse

· Recommending infrastructure changes to improve capacity or performance

· Experience in the process industry

Data Engineer + Integration Engineer + Support Specialist

Exp – 3-5 years

Necessary Skills:

· SQL & Python / PySpark

· AWS Services: Glue, Appflow, Redshift

· Data warehousing basics

· Data modelling basics

Job Description:

· Experience implementing and delivering data solutions and pipelines on the AWS Cloud Platform

· A strong understanding of data modelling, data structures, databases (Redshift)

· Strong SQL and/or Python or PySpark knowledge

· Design and implement ETL processes to load data into the data warehouse

· Creating data models that can be used to extract information from various sources & store it in a usable format

· Optimize data models for performance and efficiency

· Write SQL queries to support data analysis and reporting

· Collaborate with team to design and implement data-driven features

· Monitor and troubleshoot data pipelines

· Perform root cause analysis on data issues

· Maintain documentation of the data architecture and ETL processes

· Maintain existing applications by updating existing code or adding new features to meet new requirements

· Design and implement security measures to protect data from unauthorized access or misuse

· Identify opportunities to improve performance by improving database structure or indexing methods

· Recommend infrastructure changes to improve capacity or performance


Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai
3 - 7 yrs
₹8L - ₹15L / yr
AWS Lambda
Amazon S3
Amazon VPC
Amazon EC2
Amazon Redshift
+3 more

Technical Skills:


  • Ability to understand and translate business requirements into design.
  • Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
  • Experience in creating ETL jobs using Python/PySpark.
  • Proficiency in creating AWS Lambda functions for event-based jobs.
  • Knowledge of automating ETL processes using AWS Step Functions.
  • Competence in building data warehouses and loading data into them.


Responsibilities:


  • Understand business requirements and translate them into design.
  • Assess AWS infrastructure needs for development work.
  • Develop ETL jobs using Python/PySpark to meet requirements.
  • Implement AWS Lambda for event-based tasks.
  • Automate ETL processes using AWS Step Functions.
  • Build data warehouses and manage data loading.
  • Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.
Numantra Technologies

at Numantra Technologies

2 recruiters
Vandana Saxena
Posted by Vandana Saxena
Mumbai, Navi Mumbai
2 - 8 yrs
₹5L - ₹12L / yr
Microsoft Windows Azure
ADF
NumPy
PySpark
Databricks
+1 more
- Experience and expertise in using Azure cloud services. Azure certification will be a plus.

- Experience and expertise in Python development and its libraries, such as PySpark, pandas, and NumPy.

- Expertise in ADF and Databricks.

- Creating and maintaining data interfaces across a number of different protocols (file, API, etc.).

- Creating and maintaining internal business process solutions to keep our corporate system data in sync and reduce manual processes where appropriate.

- Creating and maintaining monitoring and alerting workflows to improve system transparency.

- Facilitate the development of our Azure cloud infrastructure relative to Data and Application systems.

- Design and lead development of our data infrastructure including data warehouses, data marts, and operational data stores.

- Experience in using Azure services such as ADLS Gen2, Azure Functions, Azure messaging services, Azure SQL Server, Azure Key Vault, Azure Cognitive Services, etc.
Navi Mumbai
3 - 5 yrs
₹7L - ₹18L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more
  • Proficiency in shell scripting
  • Proficiency in automation of tasks
  • Proficiency in PySpark/Python
  • Proficiency in writing and understanding Sqoop
  • Understanding of Cloudera Manager
  • Good understanding of RDBMS
  • Good understanding of Excel

 

Virtusa

at Virtusa

2 recruiters
Agency job
via Response Informatics by Anupama Lavanya Uppala
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr
PySpark
Python
  • Minimum 1 year of relevant experience in PySpark (mandatory)
  • Hands-on experience in developing, testing, deploying, maintaining, and improving data integration pipelines in an AWS cloud environment is an added plus
  • Ability to play a lead role and independently manage a PySpark development team of 3-5 members
  • EMR, Python, and PySpark are mandatory.
  • Knowledge and awareness of working with AWS Cloud technologies such as Apache Spark, Glue, Kafka, Kinesis, and Lambda, along with S3, Redshift, and RDS
Numantra Technologies

at Numantra Technologies

2 recruiters
nisha mattas
Posted by nisha mattas
Remote, Mumbai, Powai
2 - 12 yrs
₹8L - ₹18L / yr
ADF
PySpark
Jupyter Notebook
Big Data
Windows Azure
+3 more
      • Data pre-processing, data transformation, data analysis, and feature engineering
      • Performance optimization of scripts (code) and Productionizing of code (SQL, Pandas, Python or PySpark, etc.)
    • Required skills:
      • Bachelor's degree in Computer Science, Data Science, Computer Engineering, IT or equivalent
      • Fluency in Python (Pandas), PySpark, SQL, or similar
      • Azure Data Factory experience (min 12 months)
      • Able to write efficient code using traditional, OO concepts, modular programming following the SDLC process.
      • Experience in production optimization and end-to-end performance tracing (technical root cause analysis)
      • Ability to work independently with demonstrated experience in project or program management
      • Azure experience, with the ability to translate data scientists' Python code and make it efficient (production-ready) for cloud deployment
 
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune
8 - 15 yrs
₹16L - ₹28L / yr
PySpark
SQL Azure
azure synapse
Windows Azure
Azure Data Engineer
+3 more
Technology Skills:
  • Building and operationalizing large scale enterprise data solutions and applications using one or more of Azure data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs/IoT Hub.
  • Experience in migrating on-premise data warehouses to data platforms on Azure cloud.
  • Designing and implementing data engineering, ingestion, and transformation functions
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/StreamSets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
Get to hear about interesting companies hiring right now
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.