CUDA jobs

6+ CUDA Jobs in India

Apply to 6+ CUDA Jobs on CutShort.io. Find your next job, effortlessly. Browse CUDA Jobs and apply today!

C++ Developer

ACG World

Agency job

via Tecaky by Piyali Patel

Mumbai

4 - 5 yrs

₹7L - ₹10L / yr

CUDA

Linux/Unix

C++

Programming

Graphics Processing Unit (GPU)

We are seeking an experienced Developer with a strong background in C++, CUDA programming, and Linux to guide our development team in building cutting-edge solutions for device integration and high-performance computing tasks. This is a hands-on leadership position that combines technical expertise with team management skills to deliver high-quality software products.

No. of positions – 2

Duration – 1 year contractual position (candidate will be on Hiring Panda Payroll)

Experience Range – 4 to 5 years

Notice Period – Immediate, candidate should join within 7 days

Location – Kandivali, Mumbai (only candidates based in Mumbai will be considered)

Work Mode – Work from Office

Primary responsibilities:

Software Development:

• Develop and maintain high-performance applications using C++ and CUDA.

• Design and implement parallel algorithms for GPUs to accelerate computational workloads.

Performance Optimization:

• Optimize CUDA kernels for performance, scalability, and memory efficiency.

• Analyze performance bottlenecks and propose innovative solutions.

Code Review and Testing:

• Conduct code reviews to ensure adherence to coding standards and best practices.

• Develop and execute test cases to validate functionality and performance.

Collaboration:

• Work closely with the software engineering and research teams to understand requirements and deliver robust solutions.

• Provide technical guidance and mentoring to junior team members when necessary.

Documentation:

• Write and maintain technical documentation, including design specifications and user manuals.

Required Skills:

• C++: Strong proficiency in modern C++ (C++11/14/17/20).

• CUDA Programming: Extensive experience in developing, debugging, and optimizing CUDA applications.

• GPU Optimization: Familiarity with memory hierarchy, shared memory, streams, and warp-level operations in CUDA.

• Parallel Computing: Solid understanding of parallel algorithms and multi-threaded programming.

• Mathematical and Analytical Skills: Strong foundation in linear algebra, calculus, and numerical methods.

• Tools: Experience with debugging/profiling tools like Nsight, CUDA Memcheck, or similar.

Software Architect

at Jio Tesseract

Posted by TARUN MISHRA

Bengaluru (Bangalore), Pune, Hyderabad, Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Mumbai, Navi Mumbai, Kolkata, Rajasthan

5 - 24 yrs

₹9L - ₹70L / yr

C++

Visual C++

Embedded C++

Artificial Intelligence (AI)

+32 more

JioTesseract, a digital arm of Reliance Industries, is India's leading and largest AR/VR organization, with the mission to democratize mixed reality for India and the world. We make products at the intersection of hardware, software, content, and services, with a focus on making India the leader in spatial computing. We specialize in creating solutions in AR, VR, and AI; our notable products include JioGlass, JioDive, 360 Streaming, Metaverse, and AR/VR headsets for the consumer and enterprise space.

Monday to Friday, in-office role with excellent perks and benefits!

Position Overview

We are seeking a Software Architect to lead the design and development of high-performance robotics and AI software stacks utilizing NVIDIA technologies. This role will focus on defining scalable, modular, and efficient architectures for robot perception, planning, simulation, and embedded AI applications. You will collaborate with cross-functional teams to build next-generation autonomous systems.

Key Responsibilities:

1. System Architecture & Design

● Define scalable software architectures for robotics perception, navigation, and AI-driven decision-making.

● Design modular and reusable frameworks that leverage NVIDIA’s Jetson, Isaac ROS, Omniverse, and CUDA ecosystems.

● Establish best practices for real-time computing, GPU acceleration, and edge AI inference.

2. Perception & AI Integration

● Architect sensor fusion pipelines using LIDAR, cameras, IMUs, and radar with DeepStream, TensorRT, and ROS2.

● Optimize computer vision, SLAM, and deep learning models for edge deployment on Jetson Orin and Xavier.

● Ensure efficient GPU-accelerated AI inference for real-time robotics applications.

3. Embedded & Real-Time Systems

● Design high-performance embedded software stacks for real-time robotic control and autonomy.

● Utilize NVIDIA CUDA, cuDNN, and TensorRT to accelerate AI model execution on Jetson platforms.

● Develop robust middleware frameworks to support real-time robotics applications in ROS2 and Isaac SDK.

4. Robotics Simulation & Digital Twins

● Define architectures for robotic simulation environments using NVIDIA Isaac Sim & Omniverse.

● Leverage synthetic data generation (Omniverse Replicator) for training AI models.

● Optimize sim-to-real transfer learning for AI-driven robotic behaviors.

5. Navigation & Motion Planning

● Architect GPU-accelerated motion planning and SLAM pipelines for autonomous robots.

● Optimize path planning, localization, and multi-agent coordination using Isaac ROS Navigation.

● Implement reinforcement learning-based policies using Isaac Gym.

6. Performance Optimization & Scalability

● Ensure low-latency AI inference and real-time execution of robotics applications.

● Optimize CUDA kernels and parallel processing pipelines for NVIDIA hardware.

● Develop benchmarking and profiling tools to measure software performance on edge AI devices.

Required Qualifications:

● Master’s or Ph.D. in Computer Science, Robotics, AI, or Embedded Systems.

● Extensive experience (7+ years) in software development, with at least 3-5 years focused on architecture and system design, especially for robotics or embedded systems.

● Expertise in CUDA, TensorRT, DeepStream, PyTorch, TensorFlow, and ROS2.

● Experience in NVIDIA Jetson platforms, Isaac SDK, and GPU-accelerated AI.

● Proficiency in programming languages such as C++, Python, or similar, with deep understanding of low-level and high-level design principles.

● Strong background in robotic perception, planning, and real-time control.

● Experience with cloud-edge AI deployment and scalable architectures.

Preferred Qualifications

● Hands-on experience with NVIDIA DRIVE, NVIDIA Omniverse, and Isaac Gym

● Knowledge of robot kinematics, control systems, and reinforcement learning

● Expertise in distributed computing, containerization (Docker), and cloud robotics

● Familiarity with automotive, industrial automation, or warehouse robotics

● Experience designing architectures for autonomous systems or multi-robot systems.

● Familiarity with cloud-based solutions, edge computing, or distributed computing for robotics

● Experience with microservices or service-oriented architecture (SOA)

● Knowledge of machine learning and AI integration within robotic systems

● Knowledge of testing on edge devices with hardware-in-the-loop (HIL) and simulations (Isaac Sim, Gazebo, V-REP, etc.)