CUDA Jobs in Chennai

Apply to 3+ CUDA Jobs in Chennai on CutShort.io. Explore the latest CUDA Job opportunities across top companies like Google, Amazon & Adobe.

DevOps / MLOps Engineer

at Byteridge

3 recruiters

Posted by Sweety S

Bengaluru (Bangalore), Mumbai, Gurugram, Noida, Pune, Hyderabad, Chennai

5 - 7 yrs

₹25L - ₹31L / yr

Amazon Web Services (AWS)

Python

Graphics Processing Unit (GPU)

AWS Sagemaker

GPU

+2 more

Byteridge is seeking a Rapid Prototyping Engineer specializing in AI Infrastructure & Optimization to work with our most strategic customers on deploying, fine-tuning, and optimizing large language models at scale. You will be at the forefront of Byteridge's AI infrastructure capabilities, helping customers unlock the full potential of foundation models through expert-level deployment on GPU infrastructure.

This highly technical role requires deep expertise in machine learning infrastructure, GPU optimization, and production ML systems, combined with the ability to translate complex technical concepts into customer success.

What You'll Do

Model Deployment & Optimization

• Lead end-to-end deployments of large language models on AWS infrastructure for strategic customers

• Design and implement training, fine-tuning, and inference pipelines using Amazon SageMaker AI

• Optimize model performance through GPU-level tuning, kernel optimization, and infrastructure configuration

• Deploy models on diverse GPU architectures including NVIDIA and AWS custom silicon (Trainium, Inferentia)

Infrastructure Architecture & Performance

• Architect scalable ML infrastructure using SageMaker AI Inference, HyperPod, and distributed training frameworks

• Implement CUDA-level optimizations and custom kernels for improved model performance

• Design storage and networking architectures optimized for high-throughput ML workloads

• Troubleshoot and resolve complex performance bottlenecks at the GPU driver and kernel level

Customer Engagement & Technical Leadership

• Partner with AWS AI Specialist Solution Architects and customer ML teams to understand model requirements and deployment constraints

• Provide technical guidance on model selection, fine-tuning strategies, and production best practices

• Conduct performance benchmarking and cost optimization analysis for ML workloads

• Share field insights with AWS product teams to influence infrastructure and service roadmaps

What We're Looking For

Core Qualifications

• Bachelor's degree in Computer Science, Engineering, or equivalent practical experience (Master's or PhD preferred)

• 5+ years of experience in machine learning infrastructure, model deployment, or GPU computing

• Strong programming skills in Python and experience with ML frameworks (PyTorch, TensorFlow, JAX)

• Deep understanding of LLM architectures, training methodologies, and inference optimization

Technical Expertise (High-Level Alignment)

• Hands-on experience training, fine-tuning, or deploying large language models in production

• Proficiency with GPU programming, CUDA, and kernel-level optimization techniques

• Experience with distributed training frameworks and multi-GPU/multi-node orchestration

• Strong knowledge of AWS core services: EC2 (GPU instances), S3, EFS, VPC, and networking

Preferred Experience

• Direct experience with Amazon SageMaker AI (Training, Inference, HyperPod) or equivalent ML platforms

• Understanding of GPU architectures (NVIDIA A100, H100) and AWS custom silicon (Trainium, Inferentia)

• Experience with model compression techniques (quantization, pruning, distillation)

• Knowledge of MLOps practices, model monitoring, and production ML system design

• Background in high-performance computing, distributed systems, or systems programming

Essential Attributes

• Ability to dive deep into technical problems and debug complex infrastructure issues

• Strong analytical skills with a data-driven approach to optimization

• Excellent communication skills to explain complex technical concepts to diverse audiences

• Comfortable working in ambiguous, fast-paced environments with evolving requirements

• Ownership mindset with the ability to drive projects from architecture to production

ML Engineer (Infrastructure & Optimisation)

at Byteridge

3 recruiters

Posted by Sweety S

Bengaluru (Bangalore), Chennai, Hyderabad, Pune, Noida, Gurugram, Mumbai

5 - 7 yrs

₹16L - ₹25L / yr

Python

Amazon Web Services (AWS)

CUDA

GPU computing

Amazon EC2

+6 more

You will be at the forefront of Byteridge's AI infrastructure capabilities, helping customers unlock the full potential of foundation models through expert-level deployment on GPU infrastructure.

What You'll Do

Model Deployment & Optimization

• Lead end-to-end deployments of large language models on AWS infrastructure for strategic customers

• Design and implement training, fine-tuning, and inference pipelines using Amazon SageMaker AI

• Optimize model performance through GPU-level tuning, kernel optimization, and infrastructure configuration

• Deploy models on diverse GPU architectures including NVIDIA and AWS custom silicon (Trainium, Inferentia)

Infrastructure Architecture & Performance

• Architect scalable ML infrastructure using SageMaker AI Inference, HyperPod, and distributed training frameworks

• Implement CUDA-level optimizations and custom kernels for improved model performance

• Design storage and networking architectures optimized for high-throughput ML workloads

• Troubleshoot and resolve complex performance bottlenecks at the GPU driver and kernel level

Customer Engagement & Technical Leadership

• Partner with AWS AI Specialist Solution Architects and customer ML teams to understand model requirements and deployment constraints

• Provide technical guidance on model selection, fine-tuning strategies, and production best practices

• Conduct performance benchmarking and cost optimization analysis for ML workloads

• Share field insights with AWS product teams to influence infrastructure and service roadmaps

What We're Looking For

Core Qualifications

• Bachelor's degree in Computer Science, Engineering, or equivalent practical experience (Master's or PhD preferred)

• 5+ years of experience in machine learning infrastructure, model deployment, or GPU computing

• Strong programming skills in Python and experience with ML frameworks (PyTorch, TensorFlow, JAX)

• Deep understanding of LLM architectures, training methodologies, and inference optimization

Technical Expertise (High-Level Alignment)

• Hands-on experience training, fine-tuning, or deploying large language models in production

• Proficiency with GPU programming, CUDA, and kernel-level optimization techniques

• Experience with distributed training frameworks and multi-GPU/multi-node orchestration

• Strong knowledge of AWS core services: EC2 (GPU instances), S3, EFS, VPC, and networking

Preferred Experience

• Direct experience with Amazon SageMaker AI (Training, Inference, HyperPod) or equivalent ML platforms

• Understanding of GPU architectures (NVIDIA A100, H100) and AWS custom silicon (Trainium, Inferentia)

• Experience with model compression techniques (quantization, pruning, distillation)

• Knowledge of MLOps practices, model monitoring, and production ML system design

• Background in high-performance computing, distributed systems, or systems programming

Essential Attributes

• Ability to dive deep into technical problems and debug complex infrastructure issues

• Strong analytical skills with a data-driven approach to optimization

• Excellent communication skills to explain complex technical concepts to diverse audiences

• Comfortable working in ambiguous, fast-paced environments with evolving requirements

• Ownership mindset with the ability to drive projects from architecture to production

AI Manager

Global Leader in Diversified Electronics

Agency job

via Peak Hire Solutions by Dharati Thakkar

Chennai

7 - 16 yrs

₹30L - ₹65L / yr

Machine Learning (ML)

Artificial Intelligence (AI)

Algorithms

Python

C++

+10 more

JOB DESCRIPTION/PREFERRED QUALIFICATIONS:

KEY RESPONSIBILITIES:

Lead and mentor a team of algorithm engineers, providing guidance and support to ensure their professional growth and success.
Develop and maintain the infrastructure required for the deployment and execution of algorithms at scale.
Collaborate with data scientists, software engineers, and product managers to design and implement robust and scalable algorithmic solutions.
Optimize algorithm performance and resource utilization to meet business objectives.
Stay up to date with the latest advancements in algorithm engineering and infrastructure technologies and apply them to improve our systems.
Drive continuous improvement in development processes, tools, and methodologies.

QUALIFICATIONS:

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Proven experience in developing computer vision and image processing algorithms and ML/DL algorithms.
Familiarity with high-performance computing, parallel programming, and distributed systems.
Strong leadership and team management skills, with a track record of successfully leading engineering teams.
Proficiency in programming languages such as Python, C++ and CUDA.
Excellent problem-solving and analytical skills.
Strong communication and collaboration abilities.

PREFERRED QUALIFICATIONS:

Experience with machine learning frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn).
Experience with GPU architecture and algorithm development toolkits such as Docker and Apptainer.

MINIMUM QUALIFICATIONS:

Bachelor's degree plus 8+ years of experience
Master's degree plus 8+ years of experience
Familiarity with high-performance computing, parallel programming, and distributed systems.

MUST-HAVE SKILLS:

PhD with 6 years of industry experience, or M.Tech with 8 years of experience, or B.Tech with 10 years of experience.
14 years of experience for an individual contributor (IC) role.
Minimum 1 year of experience working as a Manager/Lead.
8 years' experience in a programming language such as Python, C++, or CUDA.
8 years' experience in machine learning, artificial intelligence, and deep learning.
2 to 3 years of experience in image processing and computer vision is a MUST.
Product / semiconductor / hardware manufacturing company experience is a MUST. Candidates should be from engineering product companies.
Candidates from Tier 1 colleges (IIT, IIIT, VIT, NIT) are preferred.
Relocation to Chennai is mandatory

NICE TO HAVE SKILLS:

Candidates from semiconductor or manufacturing companies
Candidates with a CGPA above 8
