9+ CUDA Jobs in India
Apply to 9+ CUDA Jobs on CutShort.io. Find your next job, effortlessly. Browse CUDA Jobs and apply today!
AI-based systems design and development, covering the entire pipeline from image/video ingest and metadata ingest through processing, encoding, and transmitting.
Implementation and testing of advanced computer vision algorithms.
Dataset search, preparation, annotation, training, testing, and fine-tuning of vision CNN models. Multimodal AI, LLMs, hardware deployment, explainability.
Detailed analysis of results. Documentation, version control, client support, upgrades.
About the Role
We are looking for a passionate AI Engineer Intern (B.Tech, M.Tech / M.S. or equivalent) with strong foundations in Artificial Intelligence, Computer Vision, and Deep Learning to join our R&D team.
You will help us build and train realistic face-swap and deepfake video models, powering the next generation of AI-driven video synthesis technology.
This is a remote, individual-contributor role offering exposure to cutting-edge AI model development in a startup-like environment.
Key Responsibilities
- Research, implement, and fine-tune face-swap / deepfake architectures (e.g., FaceSwap, SimSwap, DeepFaceLab, LatentSync, Wav2Lip).
- Train and optimize models for realistic facial reenactment and temporal consistency.
- Work with GANs, VAEs, and diffusion models for video synthesis.
- Handle dataset creation, cleaning, and augmentation for face-video tasks.
- Collaborate with the AI core team to deploy trained models in production environments.
- Maintain clean, modular, and reproducible pipelines using Git and experiment-tracking tools.
Required Qualifications
- B.Tech, M.Tech / M.S. (or equivalent) in AI / ML / Computer Vision / Deep Learning.
- Certifications in AI or Deep Learning (DeepLearning.AI, NVIDIA DLI, Coursera, etc.).
- Proficiency in PyTorch or TensorFlow, OpenCV, FFmpeg.
- Understanding of CNNs, Autoencoders, GANs, Diffusion Models.
- Familiarity with datasets like CelebA, VoxCeleb, FFHQ, DFDC, etc.
- Good grasp of data preprocessing, model evaluation, and performance tuning.
Preferred Skills
- Prior hands-on experience with face-swap or lip-sync frameworks.
- Exposure to 3D morphable models, NeRF, motion transfer, or facial landmark tracking.
- Knowledge of multi-GPU training and model optimization.
- Familiarity with Rust / Python backend integration for inference pipelines.
What We Offer
- Work directly on production-grade AI video synthesis systems.
- Remote-first, flexible working hours.
- Mentorship from senior AI researchers and engineers.
- Opportunity to transition into a full-time role upon outstanding performance.
Location: Remote | Stipend: ₹10,000/month | Duration: 3–6 months
JOB DESCRIPTION/PREFERRED QUALIFICATIONS:
KEY RESPONSIBILITIES:
- Lead and mentor a team of algorithm engineers, providing guidance and support to ensure their professional growth and success.
- Develop and maintain the infrastructure required for the deployment and execution of algorithms at scale.
- Collaborate with data scientists, software engineers, and product managers to design and implement robust and scalable algorithmic solutions.
- Optimize algorithm performance and resource utilization to meet business objectives.
- Stay up to date with the latest advancements in algorithm engineering and infrastructure technologies and apply them to improve our systems.
- Drive continuous improvement in development processes, tools, and methodologies.
QUALIFICATIONS:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Proven experience in developing computer vision and image processing algorithms and ML/DL algorithms.
- Familiar with high performance computing, parallel programming and distributed systems.
- Strong leadership and team management skills, with a track record of successfully leading engineering teams.
- Proficiency in programming languages such as Python, C++ and CUDA.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities.
PREFERRED QUALIFICATIONS:
- Experience with machine learning frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn).
- Experience with GPU architecture and algorithm development toolkits such as Docker and Apptainer.
MINIMUM QUALIFICATIONS:
- Bachelor's degree plus 8+ years of experience
- Master's degree plus 8+ years of experience
- Familiar with high performance computing, parallel programming and distributed systems.
MUST-HAVE SKILLS:
- PhD with 6 years of industry experience, or M.Tech + 8 years of experience, or B.Tech + 10 years of experience.
- 14 years of experience if an IC role.
- Minimum 1 year of experience working as a Manager/Lead
- 8 years' experience in any of the programming languages such as Python/C++/CUDA.
- 8 years' experience in Machine learning, Artificial intelligence, Deep learning.
- 2 to 3 years of experience in Image processing & Computer vision is a MUST
- Product / Semiconductor / Hardware Manufacturing company experience is a MUST. Candidates should be from engineering product companies
- Candidates from Tier 1 colleges (IIT, IIIT, VIT, NIT) preferred
- Relocation to Chennai is mandatory
NICE TO HAVE SKILLS:
- Candidates from Semicon or manufacturing companies
- Candidates with a CGPA above 8
Job Objective:
We are seeking an experienced Developer with a strong background in C++, CUDA programming, and Linux to guide our development team in building cutting-edge solutions for device integration and high-performance computing tasks. This is a hands-on leadership position that combines technical expertise with team management skills to deliver high-quality software products.
No. of positions – 2
Duration – 1 year contractual position (candidate will be on Hiring Panda Payroll)
Experience Range – 4 to 5 years
Notice Period – Immediate, candidate should join within 7 days
Location – Kandivali, Mumbai (only local candidates from Mumbai are acceptable)
Work Mode – Work from Office
Primary responsibilities:
Software Development:
• Develop and maintain high-performance applications using C++ and CUDA.
• Design and implement parallel algorithms for GPUs to accelerate computational workloads.
Performance Optimization:
• Optimize CUDA kernels for performance, scalability, and memory efficiency.
• Analyze performance bottlenecks and propose innovative solutions.
Code Review and Testing:
• Conduct code reviews to ensure adherence to coding standards and best practices.
• Develop and execute test cases to validate functionality and performance.
Collaboration:
• Work closely with the software engineering and research teams to understand requirements and deliver robust solutions.
• Provide technical guidance and mentoring to junior team members when necessary.
Documentation:
• Write and maintain technical documentation, including design specifications and user manuals.
Required Skills:
• C++: Strong proficiency in modern C++ (C++11/14/17/20).
• CUDA Programming: Extensive experience in developing, debugging, and optimizing CUDA applications.
• GPU Optimization: Familiarity with memory hierarchy, shared memory, streams, and warp-level operations in CUDA.
• Parallel Computing: Solid understanding of parallel algorithms and multi-threaded programming.
• Mathematical and Analytical Skills: Strong foundation in linear algebra, calculus, and numerical methods.
• Tools: Experience with debugging/profiling tools like Nsight, CUDA Memcheck, or similar.
JioTesseract, a digital arm of Reliance Industries, is India's leading and largest AR/VR organization with the mission to democratize mixed reality for India and the world. We make products at the intersection of hardware, software, content and services, with a focus on making India the leader in spatial computing. We specialize in creating solutions in AR, VR and AI; our notable products include JioGlass, JioDive, 360 Streaming, Metaverse, and AR/VR headsets for the consumer and enterprise space.
Mon-Fri role, in office, with excellent perks and benefits!
Position Overview
We are seeking a Software Architect to lead the design and development of high-performance robotics and AI software stacks utilizing NVIDIA technologies. This role will focus on defining scalable, modular, and efficient architectures for robot perception, planning, simulation, and embedded AI applications. You will collaborate with cross-functional teams to build next-generation autonomous systems.
Key Responsibilities:
1. System Architecture & Design
● Define scalable software architectures for robotics perception, navigation, and AI-driven decision-making.
● Design modular and reusable frameworks that leverage NVIDIA’s Jetson, Isaac ROS, Omniverse, and CUDA ecosystems.
● Establish best practices for real-time computing, GPU acceleration, and edge AI inference.
2. Perception & AI Integration
● Architect sensor fusion pipelines using LIDAR, cameras, IMUs, and radar with DeepStream, TensorRT, and ROS2.
● Optimize computer vision, SLAM, and deep learning models for edge deployment on Jetson Orin and Xavier.
● Ensure efficient GPU-accelerated AI inference for real-time robotics applications.
3. Embedded & Real-Time Systems
● Design high-performance embedded software stacks for real-time robotic control and autonomy.
● Utilize NVIDIA CUDA, cuDNN, and TensorRT to accelerate AI model execution on Jetson platforms.
● Develop robust middleware frameworks to support real-time robotics applications in ROS2 and Isaac SDK.
4. Robotics Simulation & Digital Twins
● Define architectures for robotic simulation environments using NVIDIA Isaac Sim & Omniverse.
● Leverage synthetic data generation (Omniverse Replicator) for training AI models.
● Optimize sim-to-real transfer learning for AI-driven robotic behaviors.
5. Navigation & Motion Planning
● Architect GPU-accelerated motion planning and SLAM pipelines for autonomous robots.
● Optimize path planning, localization, and multi-agent coordination using Isaac ROS Navigation.
● Implement reinforcement learning-based policies using Isaac Gym.
6. Performance Optimization & Scalability
● Ensure low-latency AI inference and real-time execution of robotics applications.
● Optimize CUDA kernels and parallel processing pipelines for NVIDIA hardware.
● Develop benchmarking and profiling tools to measure software performance on edge AI devices.
Required Qualifications:
● Master’s or Ph.D. in Computer Science, Robotics, AI, or Embedded Systems.
● Extensive experience (7+ years) in software development, with at least 3-5 years focused on architecture and system design, especially for robotics or embedded systems.
● Expertise in CUDA, TensorRT, DeepStream, PyTorch, TensorFlow, and ROS2.
● Experience in NVIDIA Jetson platforms, Isaac SDK, and GPU-accelerated AI.
● Proficiency in programming languages such as C++, Python, or similar, with deep understanding of low-level and high-level design principles.
● Strong background in robotic perception, planning, and real-time control.
● Experience with cloud-edge AI deployment and scalable architectures.
Preferred Qualifications
● Hands-on experience with NVIDIA DRIVE, NVIDIA Omniverse, and Isaac Gym
● Knowledge of robot kinematics, control systems, and reinforcement learning
● Expertise in distributed computing, containerization (Docker), and cloud robotics
● Familiarity with automotive, industrial automation, or warehouse robotics
● Experience designing architectures for autonomous systems or multi-robot systems.
● Familiarity with cloud-based solutions, edge computing, or distributed computing for robotics
● Experience with microservices or service-oriented architecture (SOA)
● Knowledge of machine learning and AI integration within robotic systems
● Knowledge of testing on edge devices with HIL and simulations (Isaac Sim, Gazebo, V-REP etc.)
Job Description :
Position Name : CUDA Developer
Experience : 5+ Years
Opportunity : Full-time, 8 hours/day (4-hour overlap with PST)
Notice Period : Immediate
Summary :
We are looking for an experienced CUDA (Compute Unified Device Architecture) Developer with strong expertise in parallel computing and performance optimization.
Responsibilities :
- Optimize and debug CUDA applications for high performance.
- Improve algorithm efficiency through effective parallelization.
- Stay updated with the latest CUDA advancements.
Requirements :
- Experience : 5+ Years in Software Development, including 3+ Years in CUDA and 5+ Years in C++.
- Technical Skills : Proficiency in C/C++, CUDA (12.0+ preferred), cuBLAS, cuDNN, and performance tuning.
- Other Skills : Strong problem-solving, teamwork, and communication skills.
Please contact: sairam.akirala@kiaraglobalservices.com, 798 981 2178
Skills and attributes for success
To qualify for the role, you must have
- Minimum 2 years of experience; CTC: 18 LPA
- Ability to develop Deep Learning frameworks to solve problems.
- Design and create platforms for image processing and visualization.
- Knowledge of computer vision libraries.
- Understanding of dataflow programming.
- B.E. (E&TC /Computer / IT / Mechanical / Electronics)
- C++, video analytics, CUDA, DeepStream
The opportunity
We are currently looking for a Computer Vision Engineer, to join our office in Pune. As a Computer Vision Engineer, you will support our clients in defining and implementing a data journey aligned with their strategic objectives.
Some of your responsibilities will include:
- Work with the research team to research, develop, evaluate, and optimize various computer vision and deep learning models for different problems.
- Take ownership to drive computer vision solutions and meet customer requirements.
- Deploy optimized computer vision models on edge devices to meet customer requirements, then maintain and improve them to address additional customer requirements in the future.
- Develop data handling and machine learning pipelines for training.
- In-depth understanding of computer vision models including object detection, semantic segmentation, and key-point detection
- Implementing algorithms in robust, efficient, and well-tested code.
Designation: Graphics and Simulation Engineer
Experience: 3-15 Yrs
Position Type: Full Time
Position Location: Hyderabad
Description:
We are looking for engineers to work on applied research problems related to computer graphics in autonomous driving of electric tractors. The team works towards creating a universe of farm environments in which tractors can drive around for the purposes of simulation, synthetic data generation for deep learning training, simulation of edge cases, and physics modelling.
Technical Skills:
● Background in OpenGL, OpenCL, graphics algorithms and optimization is necessary.
● Solid theoretical background in computational geometry and computer graphics is desired. Deep learning background is optional.
● Experience in two view and multi-view geometry.
● Necessary Skills: Python, C++, Boost, OpenGL, OpenCL, Unity3D/Unreal, WebGL, CUDA.
● Academic experience for freshers in graphics is also preferred.
● Experienced candidates in Computer Graphics with no prior Deep Learning experience willing to apply their knowledge to vision problems are also encouraged to apply.
● Software development experience on low-power embedded platforms is a plus.
Responsibilities:
● Understanding of engineering principles and a clear understanding of data structures and algorithms.
● Ability to understand, optimize and debug imaging algorithms.
● Ability to drive a project from conception to completion, from research papers to code, with a disciplined approach to software development on the Linux platform.
● Demonstrate outstanding ability to perform innovative and significant research in the form of technical papers, thesis, or patents.
● Optimize runtime performance of designed models.
● Deploy models to production and monitor performance and debug inaccuracies and exceptions.
● Communicate and collaborate with team members in India and abroad for the fulfillment of your duties and organizational objectives.
● Thrive in a fast-paced environment and own the project end to end with minimal hand-holding.
● Learn and adapt to new technologies and skill sets.
● Work on projects independently with timely delivery and a defect-free approach.
● Thesis focusing on the above skill set may be given more preference.





