Cutshort logo
PySpark Jobs in Chennai

29+ PySpark Jobs in Chennai | PySpark Job openings in Chennai

Apply to 29+ PySpark Jobs in Chennai on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

icon
Moative

at Moative

3 candid answers
Eman Khan
Posted by Eman Khan
Chennai
3 - 5 yrs
₹10L - ₹25L / yr
skill iconPython
PySpark
skill iconScala
Data engineering
ETL
+12 more

About Moative

Moative, an Applied AI company, designs and builds transformation AI solutions for traditional industries in energy, utilities, healthcare & lifesciences, and more. Through Moative Labs, we build AI micro-products and launch AI startups with partners in vertical markets that align with our theses.


Our Past: We have built and sold two companies, one of which was an AI company. Our founders and leaders are Math PhDs, Ivy League University Alumni, Ex-Googlers, and successful entrepreneurs.


Our Team: Our team of 20+ employees consist of data scientists, AI/ML Engineers, and mathematicians from top engineering and research institutes such as IITs, CERN, IISc, UZH, Ph.Ds. Our team includes academicians, IBM Research Fellows, and former founders.


Work you’ll do

As a Data Engineer, you will work on data architecture, large-scale processing systems, and data flow management. You will build and maintain optimal data architecture and data pipelines, assemble large, complex data sets, and ensure that data is readily available to data scientists, analysts, and other users. In close collaboration with ML engineers, data scientists, and domain experts, you’ll deliver robust, production-grade solutions that directly impact business outcomes. Ultimately, you will be responsible for developing and implementing systems that optimize the organization’s data use and data quality.


Responsibilities

  • Create and maintain optimal data architecture and data pipelines on cloud infrastructure (such as AWS/ Azure/ GCP)
  • Assemble large, complex data sets that meet functional / non-functional business requirements
  • Identify, design, and implement internal process improvements
  • Build the pipeline infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
  • Support development of analytics that utilize the data pipeline to provide actionable insights into key business metrics
  • Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs


Who you are

You are a passionate and results-oriented engineer who understands the importance of data architecture and data quality to impact solution development, enhance products, and ultimately improve business applications. You thrive in dynamic environments and are comfortable navigating ambiguity. You possess a strong sense of ownership and are eager to take initiative, advocating for your technical decisions while remaining open to feedback and collaboration. 


You have experience in developing and deploying data pipelines to support real-world applications. You have a good understanding of data structures and are excellent at writing clean, efficient code to extract, create and manage large data sets for analytical uses. You have the ability to conduct regular testing and debugging to ensure optimal data pipeline performance. You are excited at the possibility of contributing to intelligent applications that can directly impact business services and make a positive difference to users.


Skills & Requirements

  • 3+ years of hands-on experience as a data engineer, data architect or similar role, with a good understanding of data structures and data engineering.
  • Solid knowledge of cloud infra and data-related services on AWS (EC2, EMR, RDS, Redshift) and/ or Azure.
  • Advanced knowledge of SQL, including writing complex queries, stored procedures, views, etc.
  • Strong experience with data pipeline and workflow management tools (such as Luigi, Airflow).
  • Experience with common relational SQL, NoSQL and Graph databases.
  • Strong experience with scripting languages: Python, PySpark, Scala, etc.
  • Practical experience with basic DevOps concepts: CI/CD, containerization (Docker, Kubernetes), etc
  • Experience with big data tools (Spark, Kafka, etc) and stream processing.
  • Excellent communication skills to collaborate with colleagues from both technical and business backgrounds, discuss and convey ideas and findings effectively.
  • Ability to analyze complex problems, think critically for troubleshooting and develop robust data solutions.
  • Ability to identify and tackle issues efficiently and proactively, conduct thorough research and collaborate to find long-term, scalable solutions.


Working at Moative

Moative is a young company, but we believe strongly in thinking long-term, while acting with urgency. Our ethos is rooted in innovation, efficiency and high-quality outcomes. We believe the future of work is AI-augmented and boundary less. Here are some of our guiding principles:

  • Think in decades. Act in hours. As an independent company, our moat is time. While our decisions are for the long-term horizon, our execution will be fast – measured in hours and days, not weeks and months.
  • Own the canvas. Throw yourself in to build, fix or improve – anything that isn’t done right, irrespective of who did it. Be selfish about improving across the organization – because once the rot sets in, we waste years in surgery and recovery.
  • Use data or don’t use data. Use data where you ought to but not as a ‘cover-my-back’ political tool. Be capable of making decisions with partial or limited data. Get better at intuition and pattern-matching. Whichever way you go, be mostly right about it.
  • Avoid work about work. Process creeps on purpose, unless we constantly question it. We are deliberate about committing to rituals that take time away from the actual work. We truly believe that a meeting that could be an email, should be an email and you don’t need a person with the highest title to say that out loud.
  • High revenue per person. We work backwards from this metric. Our default is to automate instead of hiring. We multi-skill our people to own more outcomes than hiring someone who has less to do. We don’t like squatting and hoarding that comes in the form of hiring for growth. High revenue per person comes from high quality work from everyone. We demand it.


If this role and our work is of interest to you, please apply. We encourage you to apply even if you believe you do not meet all the requirements listed above.  


That said, you should demonstrate that you are in the 90th percentile or above. This may mean that you have studied in top-notch institutions, won competitions that are intellectually demanding, built something of your own, or rated as an outstanding performer by your current or previous employers. 


The position is based out of Chennai. Our work currently involves significant in-person collaboration and we expect you to work out of our offices in Chennai.

Read more
Tata Consultancy Services
Chennai, Hyderabad, Kolkata, Delhi, Pune, Bengaluru (Bangalore)
4 - 10 yrs
₹6L - ₹30L / yr
Scala
PySpark
Spark
skill iconAmazon Web Services (AWS)

Job Title: PySpark/Scala Developer

 

Functional Skills: Experience in Credit Risk/Regulatory risk domain

Technical Skills: Spark ,PySpark, Python, Hive, Scala, MapReduce, Unix shell scripting

Good to Have Skills: Exposure to Machine Learning Techniques

Job Description:

5+ Years of experience with Developing/Fine tuning and implementing programs/applications

Using Python/PySpark/Scala on Big Data/Hadoop Platform.

Roles and Responsibilities:

a)     Work with a Leading Bank’s Risk Management team on specific projects/requirements pertaining to risk Models in

 consumer and wholesale banking

b)     Enhance Machine Learning Models using PySpark or Scala

c)     Work with Data Scientists to Build ML Models based on Business Requirements and Follow ML Cycle to Deploy them all

the way to Production Environment

d)     Participate Feature Engineering, Training Models, Scoring and retraining

e)     Architect Data Pipeline and Automate Data Ingestion and Model Jobs

 

Skills and competencies:

Required:

·       Strong analytical skills in conducting sophisticated statistical analysis using bureau/vendor data, customer performance

Data and macro-economic data to solve business problems.

·       Working experience in languages PySpark & Scala to develop code to validate and implement models and codes in

Credit Risk/Banking

·       Experience with distributed systems such as Hadoop/MapReduce, Spark, streaming data processing, cloud architecture.

  • Familiarity with machine learning frameworks and libraries (like scikit-learn, SparkML, tensorflow, pytorch etc.
  • Experience in systems integration, web services, batch processing
  • Experience in migrating codes to PySpark/Scala is big Plus
  • The ability to act as liaison conveying information needs of the business to IT and data constraints to the business

applies equal conveyance regarding business strategy and IT strategy, business processes and work flow

·       Flexibility in approach and thought process

·       Attitude to learn and comprehend the periodical changes in the regulatory requirement as per FED

 

 

Read more
Tata Consultancy Services
Bengaluru (Bangalore), Hyderabad, Pune, Delhi, Kolkata, Chennai
5 - 8 yrs
₹7L - ₹30L / yr
skill iconScala
skill iconPython
PySpark
Apache Hive
Spark
+3 more

Skills and competencies:

Required:

·        Strong analytical skills in conducting sophisticated statistical analysis using bureau/vendor data, customer performance

Data and macro-economic data to solve business problems.

·        Working experience in languages PySpark & Scala to develop code to validate and implement models and codes in

Credit Risk/Banking

·        Experience with distributed systems such as Hadoop/MapReduce, Spark, streaming data processing, cloud architecture.

  • Familiarity with machine learning frameworks and libraries (like scikit-learn, SparkML, tensorflow, pytorch etc.
  • Experience in systems integration, web services, batch processing
  • Experience in migrating codes to PySpark/Scala is big Plus
  • The ability to act as liaison conveying information needs of the business to IT and data constraints to the business

applies equal conveyance regarding business strategy and IT strategy, business processes and work flow

·        Flexibility in approach and thought process

·        Attitude to learn and comprehend the periodical changes in the regulatory requirement as per FED

Read more
Remote, Bengaluru (Bangalore), Pune, Chennai, Nagpur
5 - 15 yrs
₹20L - ₹30L / yr
databricks
PySpark
Apache Spark
CI/CD
Data engineering


Technical Architect (Databricks)

  • 10+ Years Data Engineering Experience with expertise in Databricks
  • 3+ years of consulting experience
  • Completed Data Engineering Professional certification & required classes
  • Minimum 2-3 projects delivered with hands-on experience in Databricks
  • Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
  • Experience in Spark and/or Hadoop, Flink, Presto, other popular big data engines
  • Familiarity with Databricks multi-hop pipeline architecture

 

 

Sr. Data Engineer (Databricks)

 

  • 5+ Years Data Engineering Experience with expertise in Databricks
  • Completed Data Engineering Associate certification & required classes
  • Minimum 1 project delivered with hands-on experience in development on Databricks
  • Completed Apache Spark Programming with Databricks, Data Engineering with Databricks, Optimizing Apache Spark™ on Databricks
  • SQL delivery experience, and familiarity with Bigquery, Synapse or Redshift
  • Proficient in Python, knowledge of additional databricks programming languages (Scala)


Read more
VyTCDC
Gobinath Sundaram
Posted by Gobinath Sundaram
Chennai, Bengaluru (Bangalore), Hyderabad, Mumbai, Pune, Noida
4 - 6 yrs
₹3L - ₹21L / yr
AWS Data Engineer
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
databricks
+1 more

 Key Responsibilities

  • Design and implement ETL/ELT pipelines using Databricks, PySpark, and AWS Glue
  • Develop and maintain scalable data architectures on AWS (S3, EMR, Lambda, Redshift, RDS)
  • Perform data wrangling, cleansing, and transformation using Python and SQL
  • Collaborate with data scientists to integrate Generative AI models into analytics workflows
  • Build dashboards and reports to visualize insights using tools like Power BI or Tableau
  • Ensure data quality, governance, and security across all data assets
  • Optimize performance of data pipelines and troubleshoot bottlenecks
  • Work closely with stakeholders to understand data requirements and deliver actionable insights

🧪 Required Skills

Skill AreaTools & TechnologiesCloud PlatformsAWS (S3, Lambda, Glue, EMR, Redshift)Big DataDatabricks, Apache Spark, PySparkProgrammingPython, SQLData EngineeringETL/ELT, Data Lakes, Data WarehousingAnalyticsData Modeling, Visualization, BI ReportingGen AI IntegrationOpenAI, Hugging Face, LangChain (preferred)DevOps (Bonus)Git, Jenkins, Terraform, Docker

📚 Qualifications

  • Bachelor's or Master’s degree in Computer Science, Data Science, or related field
  • 3+ years of experience in data engineering or data analytics
  • Hands-on experience with Databricks, PySpark, and AWS
  • Familiarity with Generative AI tools and frameworks is a strong plus
  • Strong problem-solving and communication skills

🌟 Preferred Traits

  • Analytical mindset with attention to detail
  • Passion for data and emerging technologies
  • Ability to work independently and in cross-functional teams
  • Eagerness to learn and adapt in a fast-paced environment


Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Bengaluru (Bangalore), Mumbai, Pune, Chennai, Gurugram
5.6 - 7 yrs
₹10L - ₹28L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
SQL

Job Summary:

As an AWS Data Engineer, you will be responsible for designing, developing, and maintaining scalable, high-performance data pipelines using AWS services. With 6+ years of experience, you’ll collaborate closely with data architects, analysts, and business stakeholders to build reliable, secure, and cost-efficient data infrastructure across the organization.

Key Responsibilities:

  • Design, develop, and manage scalable data pipelines using AWS Glue, Lambda, and other serverless technologies
  • Implement ETL workflows and transformation logic using PySpark and Python on AWS Glue
  • Leverage AWS Redshift for warehousing, performance tuning, and large-scale data queries
  • Work with AWS DMS and RDS for database integration and migration
  • Optimize data flows and system performance for speed and cost-effectiveness
  • Deploy and manage infrastructure using AWS CloudFormation templates
  • Collaborate with cross-functional teams to gather requirements and build robust data solutions
  • Ensure data integrity, quality, and security across all systems and processes

Required Skills & Experience:

  • 6+ years of experience in Data Engineering with strong AWS expertise
  • Proficient in Python and PySpark for data processing and ETL development
  • Hands-on experience with AWS Glue, Lambda, DMS, RDS, and Redshift
  • Strong SQL skills for building complex queries and performing data analysis
  • Familiarity with AWS CloudFormation and infrastructure as code principles
  • Good understanding of serverless architecture and cost-optimized design
  • Ability to write clean, modular, and maintainable code
  • Strong analytical thinking and problem-solving skills


Read more
Bengaluru (Bangalore), Pune, Chennai
5 - 12 yrs
₹5L - ₹25L / yr
PySpark
Automation
SQL

Skill Name: ETL Automation Testing

Location: Bangalore, Chennai and Pune

Experience: 5+ Years


Required:

Experience in ETL Automation Testing

Strong experience in Pyspark.

Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Pune, Bengaluru (Bangalore), Gurugram, Chennai, Mumbai
5 - 7 yrs
₹6L - ₹20L / yr
skill iconAmazon Web Services (AWS)
Amazon Redshift
AWS Glue
skill iconPython
PySpark

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Roshni Maji
Posted by Roshni Maji
Bengaluru (Bangalore), Pune, Mumbai, Chennai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
skill iconPython
PySpark
skill iconAmazon Web Services (AWS)
aws
Amazon Redshift
+1 more

Position: AWS Data Engineer

Experience: 5 to 7 Years

Location: Bengaluru, Pune, Chennai, Mumbai, Gurugram

Work Mode: Hybrid (3 days work from office per week)

Employment Type: Full-time

About the Role:

We are seeking a highly skilled and motivated AWS Data Engineer with 5–7 years of experience in building and optimizing data pipelines, architectures, and data sets. The ideal candidate will have strong experience with AWS services including Glue, Athena, Redshift, Lambda, DMS, RDS, and CloudFormation. You will be responsible for managing the full data lifecycle from ingestion to transformation and storage, ensuring efficiency and performance.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL pipelines using AWS Glue, Python/PySpark, and SQL.
  • Work extensively with AWS services such as Glue, Athena, Lambda, DMS, RDS, Redshift, CloudFormation, and other serverless technologies.
  • Implement and manage data lake and warehouse solutions using AWS Redshift and S3.
  • Optimize data models and storage for cost-efficiency and performance.
  • Write advanced SQL queries to support complex data analysis and reporting requirements.
  • Collaborate with stakeholders to understand data requirements and translate them into scalable solutions.
  • Ensure high data quality and integrity across platforms and processes.
  • Implement CI/CD pipelines and best practices for infrastructure as code using CloudFormation or similar tools.

Required Skills & Experience:

  • Strong hands-on experience with Python or PySpark for data processing.
  • Deep knowledge of AWS Glue, Athena, Lambda, Redshift, RDS, DMS, and CloudFormation.
  • Proficiency in writing complex SQL queries and optimizing them for performance.
  • Familiarity with serverless architectures and AWS best practices.
  • Experience in designing and maintaining robust data architectures and data lakes.
  • Ability to troubleshoot and resolve data pipeline issues efficiently.
  • Strong communication and stakeholder management skills.


Read more
Deqode

at Deqode

1 recruiter
Shraddha Katare
Posted by Shraddha Katare
Bengaluru (Bangalore), Pune, Chennai, Mumbai, Gurugram
5 - 7 yrs
₹5L - ₹19L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
SQL
redshift

Profile: AWS Data Engineer

Mode- Hybrid

Experience- 5+7 years

Locations - Bengaluru, Pune, Chennai, Mumbai, Gurugram


Roles and Responsibilities

  • Design and maintain ETL pipelines using AWS Glue and Python/PySpark
  • Optimize SQL queries for Redshift and Athena
  • Develop Lambda functions for serverless data processing
  • Configure AWS DMS for database migration and replication
  • Implement infrastructure as code with CloudFormation
  • Build optimized data models for performance
  • Manage RDS databases and AWS service integrations
  • Troubleshoot and improve data processing efficiency
  • Gather requirements from business stakeholders
  • Implement data quality checks and validation
  • Document data pipelines and architecture
  • Monitor workflows and implement alerting
  • Keep current with AWS services and best practices


Required Technical Expertise:

  • Python/PySpark for data processing
  • AWS Glue for ETL operations
  • Redshift and Athena for data querying
  • AWS Lambda and serverless architecture
  • AWS DMS and RDS management
  • CloudFormation for infrastructure
  • SQL optimization and performance tuning
Read more
Deqode

at Deqode

1 recruiter
Alisha Das
Posted by Alisha Das
Pune, Mumbai, Bengaluru (Bangalore), Chennai
4 - 7 yrs
₹5L - ₹15L / yr
skill iconAmazon Web Services (AWS)
skill iconPython
PySpark
Glue semantics
Amazon Redshift
+1 more

Job Overview:

We are seeking an experienced AWS Data Engineer to join our growing data team. The ideal candidate will have hands-on experience with AWS Glue, Redshift, PySpark, and other AWS services to build robust, scalable data pipelines. This role is perfect for someone passionate about data engineering, automation, and cloud-native development.

Key Responsibilities:

  • Design, build, and maintain scalable and efficient ETL pipelines using AWS Glue, PySpark, and related tools.
  • Integrate data from diverse sources and ensure its quality, consistency, and reliability.
  • Work with large datasets in structured and semi-structured formats across cloud-based data lakes and warehouses.
  • Optimize and maintain data infrastructure, including Amazon Redshift, for high performance.
  • Collaborate with data analysts, data scientists, and product teams to understand data requirements and deliver solutions.
  • Automate data validation, transformation, and loading processes to support real-time and batch data processing.
  • Monitor and troubleshoot data pipeline issues and ensure smooth operations in production environments.

Required Skills:

  • 5 to 7 years of hands-on experience in data engineering roles.
  • Strong proficiency in Python and PySpark for data transformation and scripting.
  • Deep understanding and practical experience with AWS Glue, AWS Redshift, S3, and other AWS data services.
  • Solid understanding of SQL and database optimization techniques.
  • Experience working with large-scale data pipelines and high-volume data environments.
  • Good knowledge of data modeling, warehousing, and performance tuning.

Preferred/Good to Have:

  • Experience with workflow orchestration tools like Airflow or Step Functions.
  • Familiarity with CI/CD for data pipelines.
  • Knowledge of data governance and security best practices on AWS.
Read more
ZeMoSo Technologies

at ZeMoSo Technologies

11 recruiters
Agency job
via TIGI HR Solution Pvt. Ltd. by Vaidehi Sarkar
Mumbai, Bengaluru (Bangalore), Hyderabad, Chennai, Pune
4 - 8 yrs
₹10L - ₹15L / yr
Data engineering
skill iconPython
SQL
Data Warehouse (DWH)
skill iconAmazon Web Services (AWS)
+3 more

Work Mode: Hybrid


Need B.Tech, BE, M.Tech, ME candidates - Mandatory



Must-Have Skills:

● Educational Qualification :- B.Tech, BE, M.Tech, ME in any field.

● Minimum of 3 years of proven experience as a Data Engineer.

● Strong proficiency in Python programming language and SQL.

● Experience in DataBricks and setting up and managing data pipelines, data warehouses/lakes.

● Good comprehension and critical thinking skills.


● Kindly note Salary bracket will vary according to the exp. of the candidate - 

- Experience from 4 yrs to 6 yrs - Salary upto 22 LPA

- Experience from 5 yrs to 8 yrs - Salary upto 30 LPA

- Experience more than 8 yrs - Salary upto 40 LPA

Read more
Xebia IT Architects

at Xebia IT Architects

2 recruiters
Vijay S
Posted by Vijay S
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Chennai, Bhopal, Jaipur
10 - 15 yrs
₹30L - ₹40L / yr
Spark
Google Cloud Platform (GCP)
skill iconPython
Apache Airflow
PySpark
+1 more

We are looking for a Senior Data Engineer with strong expertise in GCP, Databricks, and Airflow to design and implement a GCP Cloud Native Data Processing Framework. The ideal candidate will work on building scalable data pipelines and help migrate existing workloads to a modern framework.


  • Shift: 2 PM 11 PM
  • Work Mode: Hybrid (3 days a week) across Xebia locations
  • Notice Period: Immediate joiners or those with a notice period of up to 30 days


Key Responsibilities:

  • Design and implement a GCP Native Data Processing Framework leveraging Spark and GCP Cloud Services.
  • Develop and maintain data pipelines using Databricks and Airflow for transforming Raw → Silver → Gold data layers.
  • Ensure data integrity, consistency, and availability across all systems.
  • Collaborate with data engineers, analysts, and stakeholders to optimize performance.
  • Document standards and best practices for data engineering workflows.

Required Experience:


  • 7-8 years of experience in data engineering, architecture, and pipeline development.
  • Strong knowledge of GCP, Databricks, PySpark, and BigQuery.
  • Experience with Orchestration tools like Airflow, Dagster, or GCP equivalents.
  • Understanding of Data Lake table formats (Delta, Iceberg, etc.).
  • Proficiency in Python for scripting and automation.
  • Strong problem-solving skills and collaborative mindset.


⚠️ Please apply only if you have not applied recently or are not currently in the interview process for any open roles at Xebia.


Looking forward to your response!


Best regards,

Vijay S

Assistant Manager - TAG

https://www.linkedin.com/in/vijay-selvarajan/

Read more
Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai
3 - 7 yrs
₹8L - ₹15L / yr
AWS Lambda
Amazon S3
Amazon VPC
Amazon EC2
Amazon Redshift
+3 more

Technical Skills:


  • Ability to understand and translate business requirements into design.
  • Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
  • Experience in creating ETL jobs using Python/PySpark.
  • Proficiency in creating AWS Lambda functions for event-based jobs.
  • Knowledge of automating ETL processes using AWS Step Functions.
  • Competence in building data warehouses and loading data into them.


Responsibilities:


  • Understand business requirements and translate them into design.
  • Assess AWS infrastructure needs for development work.
  • Develop ETL jobs using Python/PySpark to meet requirements.
  • Implement AWS Lambda for event-based tasks.
  • Automate ETL processes using AWS Step Functions.
  • Build data warehouses and manage data loading.
  • Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.
Read more
one-to-one, one-to-many, and many-to-many

one-to-one, one-to-many, and many-to-many

Agency job
via The Hub by Sridevi Viswanathan
Chennai
5 - 10 yrs
₹1L - ₹15L / yr
AWS CloudFormation
skill iconPython
PySpark
AWS Lambda

5-7 years of experience in Data Engineering with solid experience in design, development and implementation of end-to-end data ingestion and data processing system in AWS platform.

2-3 years of experience in AWS Glue, Lambda, Appflow, EventBridge, Python, PySpark, Lake House, S3, Redshift, Postgres, API Gateway, CloudFormation, Kinesis, Athena, KMS, IAM.

Experience in modern data architecture, Lake House, Enterprise Data Lake, Data Warehouse, API interfaces, solution patterns, standards and optimizing data ingestion.

Experience in build of data pipelines from source systems like SAP Concur, Veeva Vault, Azure Cost, various social media platforms or similar source systems.

Expertise in analyzing source data and designing a robust and scalable data ingestion framework and pipelines adhering to client Enterprise Data Architecture guidelines.

Proficient in design and development of solutions for real-time (or near real time) stream data processing as well as batch processing on the AWS platform.

Work closely with business analysts, data architects, data engineers, and data analysts to ensure that the data ingestion solutions meet the needs of the business.

Troubleshoot and provide support for issues related to data quality and data ingestion solutions. This may involve debugging data pipeline processes, optimizing queries, or troubleshooting application performance issues.

Experience in working in Agile/Scrum methodologies, CI/CD tools and practices, coding standards, code reviews, source management (GITHUB), JIRA, JIRA Xray and Confluence.

Experience or exposure to design and development using Full Stack tools.

Strong analytical and problem-solving skills, excellent communication (written and oral), and interpersonal skills.

Bachelor's or master's degree in computer science or related field.

 

 

Read more
A fast growing Big Data company

A fast growing Big Data company

Agency job
via Careerconnects by Kumar Narayanan
Noida, Bengaluru (Bangalore), Chennai, Hyderabad
6 - 8 yrs
₹10L - ₹15L / yr
AWS Glue
SQL
skill iconPython
PySpark
Data engineering
+6 more

AWS Glue Developer 

Work Experience: 6 to 8 Years

Work Location:  Noida, Bangalore, Chennai & Hyderabad

Must Have Skills: AWS Glue, DMS, SQL, Python, PySpark, Data integrations and Data Ops, 

Job Reference ID:BT/F21/IND


Job Description:

Design, build and configure applications to meet business process and application requirements.


Responsibilities:

7 years of work experience with ETL, Data Modelling, and Data Architecture Proficient in ETL optimization, designing, coding, and tuning big data processes using Pyspark Extensive experience to build data platforms on AWS using core AWS services Step function, EMR, Lambda, Glue and Athena, Redshift, Postgres, RDS etc and design/develop data engineering solutions. Orchestrate using Airflow.


Technical Experience:

Hands-on experience on developing Data platform and its components Data Lake, cloud Datawarehouse, APIs, Batch and streaming data pipeline Experience with building data pipelines and applications to stream and process large datasets at low latencies.


➢ Enhancements, new development, defect resolution and production support of Big data ETL development using AWS native services.

➢ Create data pipeline architecture by designing and implementing data ingestion solutions.

➢ Integrate data sets using AWS services such as Glue, Lambda functions/ Airflow.

➢ Design and optimize data models on AWS Cloud using AWS data stores such as Redshift, RDS, S3, Athena.

➢ Author ETL processes using Python, Pyspark.

➢ Build Redshift Spectrum direct transformations and data modelling using data in S3.

➢ ETL process monitoring using CloudWatch events.

➢ You will be working in collaboration with other teams. Good communication must.

➢ Must have experience in using AWS services API, AWS CLI and SDK


Professional Attributes:

➢ Experience operating very large data warehouses or data lakes Expert-level skills in writing and optimizing SQL Extensive, real-world experience designing technology components for enterprise solutions and defining solution architectures and reference architectures with a focus on cloud technology.

➢ Must have 6+ years of big data ETL experience using Python, S3, Lambda, Dynamo DB, Athena, Glue in AWS environment.

➢ Expertise in S3, RDS, Redshift, Kinesis, EC2 clusters highly desired.


Qualification:

➢ Degree in Computer Science, Computer Engineering or equivalent.


Salary: Commensurate with experience and demonstrated competence

Read more
A Product Based Client,Chennai

A Product Based Client,Chennai

Agency job
via SangatHR by Anna Poorni
Chennai
4 - 8 yrs
₹10L - ₹15L / yr
Data Warehouse (DWH)
Informatica
ETL
Spark
PySpark
+2 more

Analytics Job Description

We are hiring an Analytics Engineer to help drive our Business Intelligence efforts. You will

partner closely with leaders across the organization, working together to understand the how

and why of people, team and company challenges, workflows and culture. The team is

responsible for delivering data and insights that drive decision-making, execution, and

investments for our product initiatives.

You will work cross-functionally with product, marketing, sales, engineering, finance, and our

customer-facing teams enabling them with data and narratives about the customer journey.

You’ll also work closely with other data teams, such as data engineering and product analytics,

to ensure we are creating a strong data culture at Blend that enables our cross-functional partners

to be more data-informed.


Role : DataEngineer 

Please find below the JD for the DataEngineer Role..

  Location: Guindy,Chennai

How you’ll contribute:

• Develop objectives and metrics, ensure priorities are data-driven, and balance short-

term and long-term goals


• Develop deep analytical insights to inform and influence product roadmaps and

business decisions and help improve the consumer experience

• Work closely with GTM and supporting operations teams to author and develop core

data sets that empower analyses

• Deeply understand the business and proactively spot risks and opportunities

• Develop dashboards and define metrics that drive key business decisions

• Build and maintain scalable ETL pipelines via solutions such as Fivetran, Hightouch,

and Workato

• Design our Analytics and Business Intelligence architecture, assessing and

implementing new technologies that fitting


• Work with our engineering teams to continually make our data pipelines and tooling

more resilient


Who you are:

• Bachelor’s degree or equivalent required from an accredited institution with a

quantitative focus such as Economics, Operations Research, Statistics, Computer Science OR 1-3 Years of Experience as a Data Analyst, Data Engineer, Data Scientist

• Must have strong SQL and data modeling skills, with experience applying skills to

thoughtfully create data models in a warehouse environment.

• A proven track record of using analysis to drive key decisions and influence change

• Strong storyteller and ability to communicate effectively with managers and

executives

• Demonstrated ability to define metrics for product areas, understand the right

questions to ask and push back on stakeholders in the face of ambiguous, complex

problems, and work with diverse teams with different goals

• A passion for documentation.

• A solution-oriented growth mindset. You’ll need to be a self-starter and thrive in a

dynamic environment.

• A bias towards communication and collaboration with business and technical

stakeholders.

• Quantitative rigor and systems thinking.

• Prior startup experience is preferred, but not required.

• Interest or experience in machine learning techniques (such as clustering, decision

tree, and segmentation)

• Familiarity with a scientific computing language, such as Python, for data wrangling

and statistical analysis

• Experience with a SQL focused data transformation framework such as dbt

• Experience with a Business Intelligence Tool such as Mode/Tableau


Mandatory Skillset:


-Very Strong in SQL

-Spark OR pyspark OR Python

-Shell Scripting


Read more
Chennai, Hyderabad
5 - 10 yrs
₹10L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Bigdata with cloud:

 

Experience : 5-10 years

 

Location : Hyderabad/Chennai

 

Notice period : 15-20 days Max

 

1.  Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> Quick sight

2.  Experience in developing lambda functions with AWS Lambda

3.  Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark

4.  Should be able to code in Python and Scala.

5.  Snowflake experience will be a plus

Read more
Aureus Tech Systems

at Aureus Tech Systems

3 recruiters
Naveen Yelleti
Posted by Naveen Yelleti
Kolkata, Hyderabad, Chennai, Bengaluru (Bangalore), Bhubaneswar, Visakhapatnam, Vijayawada, Trichur, Thiruvananthapuram, Mysore, Delhi, Noida, Gurugram, Nagpur
1 - 7 yrs
₹4L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Skills and requirements

  • Experience analyzing complex and varied data in a commercial or academic setting.
  • Desire to solve new and complex problems every day.
  • Excellent ability to communicate scientific results to both technical and non-technical team members.


Desirable

  • A degree in a numerically focused discipline such as, Maths, Physics, Chemistry, Engineering or Biological Sciences..
  • Hands on experience on Python, Pyspark, SQL
  • Hands on experience on building End to End Data Pipelines.
  • Hands on Experience on Azure Data Factory, Azure Data Bricks, Data Lake - added advantage
  • Hands on Experience in building data pipelines.
  • Experience with Bigdata Tools, Hadoop, Hive, Sqoop, Spark, SparkSQL
  • Experience with SQL or NoSQL databases for the purposes of data retrieval and management.
  • Experience in data warehousing and business intelligence tools, techniques and technology, as well as experience in diving deep on data analysis or technical issues to come up with effective solutions.
  • BS degree in math, statistics, computer science or equivalent technical field.
  • Experience in data mining structured and unstructured data (SQL, ETL, data warehouse, Machine Learning etc.) in a business environment with large-scale, complex data sets.
  • Proven ability to look at solutions in unconventional ways. Sees opportunities to innovate and can lead the way.
  • Willing to learn and work on Data Science, ML, AI.
Read more
GradMener Technology Pvt. Ltd.
Pune, Chennai
5 - 9 yrs
₹15L - ₹20L / yr
skill iconScala
PySpark
Spark
SQL Azure
Hadoop
+4 more
  • 5+ years of experience in a Data Engineering role on cloud environment
  • Must have good experience in Scala/PySpark (preferably on data-bricks environment)
  • Extensive experience with Transact-SQL.
  • Experience in Data-bricks/Spark.
  • Strong experience in Dataware house projects
  • Expertise in database development projects with ETL processes.
  • Manage and maintain data engineering pipelines
  • Develop batch processing, streaming and integration solutions
  • Experienced in building and operationalizing large-scale enterprise data solutions and applications
  • Using one or more of Azure data and analytics services in combination with custom solutions
  • Azure Data Lake, Azure SQL DW (Synapse), and SQL Database products or equivalent products from other cloud services providers
  • In-depth understanding of data management (e. g. permissions, security, and monitoring).
  • Cloud repositories for e.g. Azure GitHub, Git
  • Experience in an agile environment (Prefer Azure DevOps).

Good to have

  • Manage source data access security
  • Automate Azure Data Factory pipelines
  • Continuous Integration/Continuous deployment (CICD) pipelines, Source Repositories
  • Experience in implementing and maintaining CICD pipelines
  • Power BI understanding, Delta Lake house architecture
  • Knowledge of software development best practices.
  • Excellent analytical and organization skills.
  • Effective working in a team as well as working independently.
  • Strong written and verbal communication skills.
  • Expertise in database development projects and ETL processes.
Read more
MNC

MNC

Agency job
via Eurka IT SOL by Srikanth a
Chennai
5 - 11 yrs
₹10L - ₹15L / yr
PySpark
SQL
Test Automation (QA)
Big Data
skill iconData Science

Lead QA: more than 5 years experience , led the team of more than 5 people in big data platform, should have experience in Test Automation framework, should have experience of Test process documentation

Read more
A leading global information technology and business process

A leading global information technology and business process

Agency job
via Jobdost by Mamatha A
Chennai
5 - 14 yrs
₹13L - ₹21L / yr
skill iconPython
skill iconJava
PySpark
skill iconJavascript
Hadoop

Python + Data scientist : 
• Hands-on and sound knowledge of Python, Pyspark, Java script

• Build data-driven models to understand the characteristics of engineering systems

• Train, tune, validate, and monitor predictive models

• Sound knowledge on Statistics

• Experience in developing data processing tasks using PySpark such as reading,

merging, enrichment, loading of data from external systems to target data destinations

• Working knowledge on Big Data or/and Hadoop environments

• Experience creating CI/CD Pipelines using Jenkins or like tools

• Practiced in eXtreme Programming (XP) disciplines 

Read more
Amagi Media Labs

at Amagi Media Labs

3 recruiters
Rajesh C
Posted by Rajesh C
Chennai
5 - 7 yrs
₹20L - ₹30L / yr
Technical support
Tech Support
SQL
Informatica
PySpark
+1 more
Job Title: Support Engineer L3 Job Location: Chennai
We CondéNast are looking for a Support engineer Level 2 who would be responsible for
monitoring and maintaining the production systems to ensure the business continuity is
maintained. Your Responsibilities would also include prompt communication to business
and internal teams about process delays, stability, issue, resolutions.
Primary Responsibilities
● 5+ years experience in Production support
● The Support Data Engineer is responsible for monitoring of the data pipelines
that are in production.
● Level 3 support activities - Analysing issue, debug programs & Jobs, bug fix
● The position will contribute to the monitoring, rerun or reschedule, code fix
of pipelines for a variety of projects on a daily basis.
● Escalate failures to Data-Team/DevOps incase of Infrastructure Failures or unable
to revive the data-pipelines.
● Ensure accurate alerts are raised incase of pipeline failures and corresponding
stakeholders (Business/Data Teams) are notified about the same within the
agreed upon SLAs.
● Prepare and present success/failure metrics by accurately logging the
monitoring stats.
● Able to work in shifts to provide overlap with US Business teams
● Other duties as requested or assigned.
Desired Skills & Qualification
● Have Strong working knowledge of Pyspark, Informatica, SQL(PRESTO), Batch
Handling through schedulers(databricks, Astronomer will be an
advantage),AWS-S3, SQL, Airflow and Hive/Presto
● Have basic knowledge on Shell scripts and/or Bash commands.
● Able to execute queries in Databases and produce outputs.
● Able to understand and execute the steps provided by Data-Team to
revive data-pipelines.
● Strong verbal, written communication skills and strong interpersonal
skills.
● Graduate/Diploma in computer science or information technology.
About Condé Nast
CONDÉ NAST GLOBAL
Condé Nast is a global media house with over a century of distinguished publishing
history. With a portfolio of iconic brands like Vogue, GQ, Vanity Fair, The New Yorker and
Bon Appétit, we at Condé Nast aim to tell powerful, compelling stories of communities,
culture and the contemporary world. Our operations are headquartered in New York and
London, with colleagues and collaborators in 32 markets across the world, including
France, Germany, India, China, Japan, Spain, Italy, Russia, Mexico, and Latin America.
Condé Nast has been raising the industry standards and setting records for excellence in
the publishing space. Today, our brands reach over 1 billion people in print, online, video,
and social media.
CONDÉ NAST INDIA (DATA)
Over the years, Condé Nast successfully expanded and diversified into digital, TV, and
social platforms - in other words, a staggering amount of user data. Condé Nast made the
right move to invest heavily in understanding this data and formed a whole new Data
team entirely dedicated to data processing, engineering, analytics, and visualization. This
team helps drive engagement, fuel process innovation, further content enrichment, and
increase market revenue. The Data team aimed to create a company culture where data
was the common language and facilitate an environment where insights shared in
real-time could improve performance. The Global Data team operates out of Los Angeles,
New York, Chennai, and London. The team at Condé Nast Chennai works extensively with
data to amplify its brands' digital capabilities and boost online revenue. We are broadly
divided into four groups, Data Intelligence, Data Engineering, Data Science, and
Operations (including Product and Marketing Ops, Client Services) along with Data
Strategy and monetization. The teams built capabilities and products to create
data-driven solutions for better audience engagement.
What we look forward to:
We want to welcome bright, new minds into our midst and work together to create
diverse forms of self-expression. At Condé Nast, we encourage the imaginative and
celebrate the extraordinary. We are a media company for the future, with a remarkable
past. We are Condé Nast, and It Starts Here.
Read more
Virtusa

at Virtusa

2 recruiters
Agency job
via Response Informatics by Anupama Lavanya Uppala
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr
PySpark
skill iconPython
  • Minimum 1 years of relevant experience, in PySpark (mandatory)
  • Hands on experience in development, test, deploy, maintain and improving data integration pipeline in AWS cloud environment is added plus 
  • Ability to play lead role and independently manage 3-5 member of Pyspark development team 
  • EMR ,Python and PYspark mandate.
  • Knowledge and awareness working with AWS Cloud technologies like Apache Spark, , Glue, Kafka, Kinesis, and Lambda in S3, Redshift, RDS
Read more
Agilisium

Agilisium

Agency job
via Recruiting India by Moumita Santra
Chennai
10 - 19 yrs
₹12L - ₹40L / yr
Big Data
Apache Spark
Spark
PySpark
ETL
+1 more

Job Sector: IT, Software

Job Type: Permanent

Location: Chennai

Experience: 10 - 20 Years

Salary: 12 – 40 LPA

Education: Any Graduate

Notice Period: Immediate

Key Skills: Python, Spark, AWS, SQL, PySpark

Contact at triple eight two zero nine four two double seven

 

Job Description:

Requirements

  • Minimum 12 years experience
  • In depth understanding and knowledge on distributed computing with spark.
  • Deep understanding of Spark Architecture and internals
  • Proven experience in data ingestion, data integration and data analytics with spark, preferably PySpark.
  • Expertise in ETL processes, data warehousing and data lakes.
  • Hands on with python for Big data and analytics.
  • Hands on in agile scrum model is an added advantage.
  • Knowledge on CI/CD and orchestration tools is desirable.
  • AWS S3, Redshift, Lambda knowledge is preferred
Thanks
Read more
Virtusa

at Virtusa

2 recruiters
Agency job
via Devenir by Rakesh Kumar
Chennai, Hyderabad
4 - 6 yrs
₹10L - ₹20L / yr
PySpark
skill iconAmazon Web Services (AWS)
skill iconPython
  • Hands-on experience in Development
  • 4-6 years of Hands on experience with Python scripts
  • 2-3 years of Hands on experience in PySpark coding. Worked in spark cluster computing technology.
  • 3-4 years of Hands on end to end data pipeline experience working on AWS environments
  • 3-4 years of Hands on experience working on AWS services – Glue, Lambda, Step Functions, EC2, RDS, SES, SNS, DMS, CloudWatch etc.
  • 2-3 years of Hands on experience working on AWS redshift
  • 6+ years of Hands on experience with writing Unix Shell scripts
  • Good communication skills
Read more
netmedscom

at netmedscom

3 recruiters
Vijay Hemnath
Posted by Vijay Hemnath
Chennai
2 - 5 yrs
₹6L - ₹25L / yr
Big Data
Hadoop
Apache Hive
skill iconScala
Spark
+12 more

We are looking for an outstanding Big Data Engineer with experience setting up and maintaining Data Warehouse and Data Lakes for an Organization. This role would closely collaborate with the Data Science team and assist the team build and deploy machine learning and deep learning models on big data analytics platforms.

Roles and Responsibilities:

  • Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
  • Develop programs in Scala and Python as part of data cleaning and processing.
  • Assemble large, complex data sets that meet functional / non-functional business requirements and fostering data-driven decision making across the organization.  
  • Responsible to design and develop distributed, high volume, high velocity multi-threaded event processing systems.
  • Implement processes and systems to validate data, monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
  • Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Provide high operational excellence guaranteeing high availability and platform stability.
  • Closely collaborate with the Data Science team and assist the team build and deploy machine learning and deep learning models on big data analytics platforms.

Skills:

  • Experience with Big Data pipeline, Big Data analytics, Data warehousing.
  • Experience with SQL/No-SQL, schema design and dimensional data modeling.
  • Strong understanding of Hadoop Architecture, HDFS ecosystem and eexperience with Big Data technology stack such as HBase, Hadoop, Hive, MapReduce.
  • Experience in designing systems that process structured as well as unstructured data at large scale.
  • Experience in AWS/Spark/Java/Scala/Python development.
  • Should have Strong skills in PySpark (Python & SPARK). Ability to create, manage and manipulate Spark Dataframes. Expertise in Spark query tuning and performance optimization.
  • Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
  • Prior exposure to streaming data sources such as Kafka.
  • Should have knowledge on Shell Scripting and Python scripting.
  • High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
  • Experience with NoSQL databases such as Cassandra / MongoDB.
  • Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
  • Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
  • Experience building and deploying applications on on-premise and cloud-based infrastructure.
  • Having a good understanding of machine learning landscape and concepts. 

 

Qualifications and Experience:

Engineering and post graduate candidates, preferably in Computer Science, from premier institutions with proven work experience as a Big Data Engineer or a similar role for 3-5 years.

Certifications:

Good to have at least one of the Certifications listed here:

    AZ 900 - Azure Fundamentals

    DP 200, DP 201, DP 203, AZ 204 - Data Engineering

    AZ 400 - Devops Certification

Read more
netmedscom

at netmedscom

3 recruiters
Vijay Hemnath
Posted by Vijay Hemnath
Chennai
5 - 10 yrs
₹10L - ₹30L / yr
skill iconMachine Learning (ML)
Software deployment
CI/CD
Cloud Computing
Snow flake schema
+19 more

We are looking for an outstanding ML Architect (Deployments) with expertise in deploying Machine Learning solutions/models into production and scaling them to serve millions of customers. A candidate with an adaptable and productive working style which fits in a fast-moving environment.

 

Skills:

- 5+ years deploying Machine Learning pipelines in large enterprise production systems.

- Experience developing end to end ML solutions from business hypothesis to deployment / understanding the entirety of the ML development life cycle.
- Expert in modern software development practices; solid experience using source control management (CI/CD).
- Proficient in designing relevant architecture / microservices to fulfil application integration, model monitoring, training / re-training, model management, model deployment, model experimentation/development, alert mechanisms.
- Experience with public cloud platforms (Azure, AWS, GCP).
- Serverless services like lambda, azure functions, and/or cloud functions.
- Orchestration services like data factory, data pipeline, and/or data flow.
- Data science workbench/managed services like azure machine learning, sagemaker, and/or AI platform.
- Data warehouse services like snowflake, redshift, bigquery, azure sql dw, AWS Redshift.
- Distributed computing services like Pyspark, EMR, Databricks.
- Data storage services like cloud storage, S3, blob, S3 Glacier.
- Data visualization tools like Power BI, Tableau, Quicksight, and/or Qlik.
- Proven experience serving up predictive algorithms and analytics through batch and real-time APIs.
- Solid working experience with software engineers, data scientists, product owners, business analysts, project managers, and business stakeholders to design the holistic solution.
- Strong technical acumen around automated testing.
- Extensive background in statistical analysis and modeling (distributions, hypothesis testing, probability theory, etc.)
- Strong hands-on experience with statistical packages and ML libraries (e.g., Python scikit learn, Spark MLlib, etc.)
- Experience in effective data exploration and visualization (e.g., Excel, Power BI, Tableau, Qlik, etc.)
- Experience in developing and debugging in one or more of the languages Java, Python.
- Ability to work in cross functional teams.
- Apply Machine Learning techniques in production including, but not limited to, neuralnets, regression, decision trees, random forests, ensembles, SVM, Bayesian models, K-Means, etc.

 

Roles and Responsibilities:

Deploying ML models into production, and scaling them to serve millions of customers.

Technical solutioning skills with deep understanding of technical API integrations, AI / Data Science, BigData and public cloud architectures / deployments in a SaaS environment.

Strong stakeholder relationship management skills - able to influence and manage the expectations of senior executives.
Strong networking skills with the ability to build and maintain strong relationships with both business, operations and technology teams internally and externally.

Provide software design and programming support to projects.

 

 Qualifications & Experience:

Engineering and post graduate candidates, preferably in Computer Science, from premier institutions with proven work experience as a Machine Learning Architect (Deployments) or a similar role for 5-7 years.

 

Read more
Fragma Data Systems

at Fragma Data Systems

8 recruiters
Evelyn Charles
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune
8 - 15 yrs
₹16L - ₹28L / yr
PySpark
SQL Azure
azure synapse
Windows Azure
Azure Data Engineer
+3 more
Technology Skills:
  • Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
  • Experience in migrating on-premise data warehouses to data platforms on AZURE cloud. 
  • Designing and implementing data engineering, ingestion, and transformation functions
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/Stream sets, Informatica
  • Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.
Read more
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.
Find more jobs
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort