2+ CUDA Jobs in Delhi, NCR and Gurgaon | CUDA Job openings in Delhi, NCR and Gurgaon
You will be at the forefront of Byteridge's AI infrastructure capabilities, helping customers unlock the full potential of foundation models through expert-level deployment on GPU infrastructure.
This highly technical role requires deep expertise in machine learning infrastructure, GPU optimization, and production ML systems, combined with the ability to translate complex technical concepts into customer success.
What You'll Do
Model Deployment & Optimization
• Lead end-to-end deployments of large language models on AWS infrastructure for strategic customers
• Design and implement training, fine-tuning, and inference pipelines using Amazon SageMaker AI
• Optimize model performance through GPU-level tuning, kernel optimization, and infrastructure configuration
• Deploy models on diverse accelerator architectures, including NVIDIA GPUs and AWS custom silicon (Trainium, Inferentia)
Infrastructure Architecture & Performance
• Architect scalable ML infrastructure using SageMaker AI Inference, HyperPod, and distributed training frameworks
• Implement CUDA-level optimizations and custom kernels for improved model performance
• Design storage and networking architectures optimized for high-throughput ML workloads
• Troubleshoot and resolve complex performance bottlenecks at the GPU driver and kernel level
Customer Engagement & Technical Leadership
• Partner with AWS AI Specialist Solution Architects and customer ML teams to understand model requirements and deployment constraints
• Provide technical guidance on model selection, fine-tuning strategies, and production best practices
• Conduct performance benchmarking and cost optimization analysis for ML workloads
• Share field insights with AWS product teams to influence infrastructure and service roadmaps
What We're Looking For
Core Qualifications
• Bachelor's degree in Computer Science, Engineering, or equivalent practical experience (Master's or PhD preferred)
• 5+ years of experience in machine learning infrastructure, model deployment, or GPU computing
• Strong programming skills in Python and experience with ML frameworks (PyTorch, TensorFlow, JAX)
• Deep understanding of LLM architectures, training methodologies, and inference optimization
Technical Expertise (High-Level Alignment)
• Hands-on experience training, fine-tuning, or deploying large language models in production
• Proficiency with GPU programming, CUDA, and kernel-level optimization techniques
• Experience with distributed training frameworks and multi-GPU/multi-node orchestration
• Strong knowledge of AWS core services: EC2 (GPU instances), S3, EFS, VPC, and networking
Preferred Experience
• Direct experience with Amazon SageMaker AI (Training, Inference, HyperPod) or equivalent ML platforms
• Understanding of GPU architectures (NVIDIA A100, H100) and AWS custom silicon (Trainium, Inferentia)
• Experience with model compression techniques (quantization, pruning, distillation)
• Knowledge of MLOps practices, model monitoring, and production ML system design
• Background in high-performance computing, distributed systems, or systems programming
Essential Attributes
• Ability to dive deep into technical problems and debug complex infrastructure issues
• Strong analytical skills with a data-driven approach to optimization
• Excellent communication skills to explain complex technical concepts to diverse audiences
• Comfortable working in ambiguous, fast-paced environments with evolving requirements
• Ownership mindset with the ability to drive projects from architecture to production
JioTesseract, a digital arm of Reliance Industries, is India's leading and largest AR/VR organization, with a mission to democratize mixed reality for India and the world. We make products at the intersection of hardware, software, content, and services, with a focus on making India the leader in spatial computing. We specialize in AR, VR, and AI solutions; our notable products include JioGlass, JioDive, 360 Streaming, Metaverse, and AR/VR headsets for the consumer and enterprise markets.
Monday-to-Friday, in-office role with excellent perks and benefits!
Position Overview
We are seeking a Software Architect to lead the design and development of high-performance robotics and AI software stacks utilizing NVIDIA technologies. This role will focus on defining scalable, modular, and efficient architectures for robot perception, planning, simulation, and embedded AI applications. You will collaborate with cross-functional teams to build next-generation autonomous systems.
Key Responsibilities:
1. System Architecture & Design
● Define scalable software architectures for robotics perception, navigation, and AI-driven decision-making.
● Design modular and reusable frameworks that leverage NVIDIA’s Jetson, Isaac ROS, Omniverse, and CUDA ecosystems.
● Establish best practices for real-time computing, GPU acceleration, and edge AI inference.
2. Perception & AI Integration
● Architect sensor fusion pipelines using LIDAR, cameras, IMUs, and radar with DeepStream, TensorRT, and ROS2.
● Optimize computer vision, SLAM, and deep learning models for edge deployment on Jetson Orin and Xavier.
● Ensure efficient GPU-accelerated AI inference for real-time robotics applications.
3. Embedded & Real-Time Systems
● Design high-performance embedded software stacks for real-time robotic control and autonomy.
● Utilize NVIDIA CUDA, cuDNN, and TensorRT to accelerate AI model execution on Jetson platforms.
● Develop robust middleware frameworks to support real-time robotics applications in ROS2 and Isaac SDK.
4. Robotics Simulation & Digital Twins
● Define architectures for robotic simulation environments using NVIDIA Isaac Sim & Omniverse.
● Leverage synthetic data generation (Omniverse Replicator) for training AI models.
● Optimize sim-to-real transfer learning for AI-driven robotic behaviors.
5. Navigation & Motion Planning
● Architect GPU-accelerated motion planning and SLAM pipelines for autonomous robots.
● Optimize path planning, localization, and multi-agent coordination using Isaac ROS Navigation.
● Implement reinforcement learning-based policies using Isaac Gym.
6. Performance Optimization & Scalability
● Ensure low-latency AI inference and real-time execution of robotics applications.
● Optimize CUDA kernels and parallel processing pipelines for NVIDIA hardware.
● Develop benchmarking and profiling tools to measure software performance on edge AI devices.
Required Qualifications:
● Master’s or Ph.D. in Computer Science, Robotics, AI, or Embedded Systems.
● Extensive experience (7+ years) in software development, including 3-5 years focused on architecture and system design, especially for robotics or embedded systems.
● Expertise in CUDA, TensorRT, DeepStream, PyTorch, TensorFlow, and ROS2.
● Experience with NVIDIA Jetson platforms, Isaac SDK, and GPU-accelerated AI.
● Proficiency in programming languages such as C++, Python, or similar, with a deep understanding of low-level and high-level design principles.
● Strong background in robotic perception, planning, and real-time control.
● Experience with cloud-edge AI deployment and scalable architectures.
Preferred Qualifications:
● Hands-on experience with NVIDIA DRIVE, NVIDIA Omniverse, and Isaac Gym.
● Knowledge of robot kinematics, control systems, and reinforcement learning.
● Expertise in distributed computing, containerization (Docker), and cloud robotics.
● Familiarity with automotive, industrial automation, or warehouse robotics.
● Experience designing architectures for autonomous systems or multi-robot systems.
● Familiarity with cloud-based solutions, edge computing, or distributed computing for robotics.
● Experience with microservices or service-oriented architecture (SOA).
● Knowledge of machine learning and AI integration within robotic systems.
● Knowledge of testing on edge devices with hardware-in-the-loop (HIL) setups and simulations (Isaac Sim, Gazebo, V-REP, etc.).


