AI Runtime Lead (LLM DevOps, PyTorch)
at IT Industry

Agency job
5 - 8 yrs
₹12L - ₹16L / yr
Bengaluru (Bangalore)
Skills
Python

Review Criteria:

Mandatory:

  • Strong AI Runtime Engineering (Lead / Staff) Profiles
  • Must have 4+ years of software engineering experience
  • Must have proven 1+ years of experience designing, building, and owning AI runtime infrastructure supporting distributed training and/or inference at scale
  • Must have hands-on experience optimizing deep learning runtimes such as PyTorch, TensorFlow, etc.
  • Must have strong low-level performance engineering experience, including profiling, debugging, and optimizing system throughput, latency, and reliability
  • Must have experience leading or mentoring a team, including technical guidance, code reviews, and delivery ownership
  • Must have strong programming skills in Python, Java, C++, etc.


Preferred:

  • Experience with Kubernetes, Ray, TorchElastic, or custom AI job orchestration frameworks
  • Exposure to LLM training pipelines, checkpointing, and elastic or distributed training orchestration


Role & Responsibilities:

As Lead/Staff AI Runtime Engineer, you’ll play a pivotal role in the design, development, and optimization of the core runtime infrastructure that powers distributed training and deployment of large AI models (LLMs and beyond). This is a hands-on leadership role - perfect for a systems-minded software engineer who thrives at the intersection of AI workloads, runtimes, and performance-critical infrastructure. You’ll own critical components of our PyTorch-based stack, lead technical direction, and collaborate across engineering, research, and product to push the boundaries of elastic, fault-tolerant, high-performance model execution.


What you’ll do:

Lead Runtime Design & Development:

  • Own the core runtime architecture supporting AI training and inference at scale.
  • Design resilient and elastic runtime features (e.g. dynamic node scaling, job recovery) within our custom PyTorch stack.
  • Optimize distributed training reliability, orchestration, and job-level fault tolerance.


Drive Performance at Scale:

  • Profile and enhance low-level system performance across training and inference pipelines.
  • Improve packaging, deployment, and integration of customer models in production environments.
  • Ensure consistent throughput, latency, and reliability metrics across multi-node, multi-GPU setups.


Build Internal Tooling & Frameworks:

  • Design and maintain libraries and services that support the model lifecycle: training, checkpointing, fault recovery, packaging, and deployment.
  • Implement observability hooks, diagnostics, and resilience mechanisms for deep learning workloads.
  • Champion best practices in CI/CD, testing, and software quality across the AI Runtime stack.


Collaborate & Mentor:

  • Work cross-functionally with Research, Infrastructure, and Product teams to align runtime development with customer and platform needs.
  • Guide technical discussions, mentor junior engineers, and help scale the AI Runtime team’s capabilities.


Ideal Candidate:

  • 5+ years of experience in systems/software engineering, with deep exposure to AI runtime, distributed systems, or compiler/runtime interaction.
  • Experience in delivering PaaS services.
  • Proven experience optimizing and scaling deep learning runtimes (e.g. PyTorch, TensorFlow, JAX) for large-scale training and/or inference.
  • Strong programming skills in Python and C++ (Go or Rust is a plus).
  • Familiarity with distributed training frameworks, low-level performance tuning, and resource orchestration.
  • Experience working with multi-GPU, multi-node, or cloud-native AI workloads.
  • Solid understanding of containerized workloads, job scheduling, and failure recovery in production environments.
Users love Cutshort
Read about what our users have to say about finding their next opportunity on Cutshort.
Shubham Vishwakarma

Full Stack Developer - Averlon
I had an amazing experience. It was a delight getting interviewed via Cutshort. The entire end to end process was amazing. I would like to mention Reshika, she was just amazing wrt guiding me through the process. Thank you team.
Similar jobs

GyanSys Inc.
Bengaluru (Bangalore)
4 - 8 yrs
₹14L - ₹15L / yr
Machine Learning (ML)
Data Science
Python
PyTorch
TensorFlow
+5 more

Role: Sr. Data Scientist

Exp: 4-8 Years

CTC: up to 25 LPA



Technical Skills:

● Strong programming skills in Python, with hands-on experience in deep learning frameworks like TensorFlow, PyTorch, or Keras.

● Familiarity with Databricks notebooks, MLflow, and Delta Lake for scalable machine learning workflows.

● Experience with MLOps best practices, including model versioning, CI/CD pipelines, and automated deployment.

● Proficiency in data preprocessing, augmentation, and handling large-scale image/video datasets.

● Solid understanding of computer vision algorithms, including CNNs, transfer learning, and transformer-based vision models (e.g., ViT).

● Exposure to natural language processing (NLP) techniques is a plus.



Educational Qualifications:

  • B.E./B.Tech/M.Tech/MCA in Computer Science, Electronics & Communication, Electrical Engineering, or a related field.
  • A master’s degree in Computer Science, Artificial Intelligence, or a specialization in Deep Learning or Computer Vision is highly preferred.



If interested, share your resume on 82008 31681.

Hunarstreet Technologies pvt ltd
Agency job
via Hunarstreet Technologies Pvt Ltd by Sakshi Patankar
BORIVALI, Mumbai
1 - 4 yrs
₹2L - ₹4L / yr
EHS
EHS management



Occupational Health & Safety Management:

  • Handle visits by government officials and ensure timely compliance, including renewal of the fire NOC and petroleum NOC, approval of the on-site emergency plan, etc.
  • Develop a safe, pollution-free work environment and a robust EHS culture at the site.
  • Enhance employee engagement in building the EHS culture.
  • Improve the EHS awareness and capabilities of people at the site.
  • Implement EHS global standards such as the work permit system, LOTO permits, contractor safety management, chemical safety management, personal protective equipment, incident reporting and investigation, risk assessment and process hazard analysis, change control management, employee engagement, machine safety procedures, etc.
  • Evaluate training needs assessments and ensure compliance as per the calendar.
  • Conduct monthly EHS rounds with the department SLTs, shift in-charges, and EHS professionals.
  • Create awareness among employees and workers by celebrating EHS programs: National Safety Week, Road Safety Week, Fire Service Week, etc.
  • Encourage near-miss reporting and employee rewards and recognition.
  • Develop departmental EHS coordinators and encourage them to report unsafe observations and near misses, conduct department EHS meetings, and participate in EHS promotional activities.
  • Ensure electrical safety in coordination with the engineering department.
  • Conduct job safety analysis, risk assessment, and process hazard analysis.
  • Ensure standard PPE requirements are met, need-based PPE is used, and PPE is disposed of safely.
  • Encourage and ensure incident reporting and investigation, root cause analysis, and compliance with corrective and preventive actions.
  • Ensure emergency preparedness and response compliance, with fire extinguishers, the fire hydrant system, fire alarm system, public announcement system, SCBA, safety showers, spill control kits, ambulance, etc. ready to operate 24x7.
  • Ensure testing and compliance of pressure vessels, lifting tools and tackles, hoists, lifts, etc. (Form-9, Form-33, etc.).
  • Monitor workplace health and hygiene.
  • Ensure compliance with the EHS legal register and compliance register.
  • Ensure compliance with local EHS procedures, the EHS Manual, the EHS Policy, and the on-site emergency plan.

Wissen Technology
at Wissen Technology
4 recruiters
Sukanya Mohan
Posted by Sukanya Mohan
Bengaluru (Bangalore)
3 - 5 yrs
Best in industry
Python
Apache Spark
Hadoop
SQL

Responsibilities:


• Build customer-facing solutions for the Data Observability product to monitor data pipelines

• Work on POCs to build new data pipeline monitoring capabilities.

• Build next-generation scalable, reliable, flexible, high-performance data pipeline capabilities for ingesting data from multiple sources containing complex datasets.

• Continuously improve the services you own, making them more performant and using resources in the most optimised way.

• Collaborate closely with engineering, data science team and product team to propose an optimal solution for a given problem statement

• Working closely with DevOps team on performance monitoring and MLOps


Required Skills:

• 3+ Years of Data related technology experience.

• Good understanding of distributed computing principles

• Experience in Apache Spark

• Hands-on programming with Python

• Knowledge of Hadoop v2, MapReduce, HDFS

• Experience with building stream-processing systems, using technologies such as Apache Storm, Spark-Streaming or Flink

• Experience with messaging systems, such as Kafka or RabbitMQ

• Good understanding of Big Data querying tools, such as Hive

• Experience with integration of data from multiple data sources

• Good understanding of SQL queries, joins, stored procedures, relational schemas

• Experience with NoSQL databases, such as HBase, Cassandra/Scylla, MongoDB

• Knowledge of ETL techniques and frameworks

• Performance tuning of Spark Jobs

• General understanding of Data Quality is a plus point

• Experience with Databricks, Snowflake, BigQuery, or similar lakehouses would be a big plus

• Nice to have some knowledge in DevOps

EbixCash
Rupali Dhuriya
Posted by Rupali Dhuriya
Mumbai
0 - 2 yrs
₹1.7L - ₹3L / yr
Communication Skills
  • Strong communication abilities
  • Exceptional communication and the capacity to switch up speaking approach
  • The capacity to adjust to challenging circumstances
  • Having a firm understanding of the products or services the business provides
  • Ability to listen and solve problems
  • Ability to cope with rejection while remaining calm
  • Outstanding capacity to manage conflicts and address grievances during negotiations


codersbrain
at codersbrain
1 recruiter
Aishwarya Hire
Posted by Aishwarya Hire
Remote only
8 - 10 yrs
₹10L - ₹15L / yr
SAP
SAP PS
  • SAP PS implementation experience: end-to-end implementation experience in different domains, such as banking, manufacturing, civil, or any other industry.

  • Good configuration knowledge of PS structures: WBS, Network, Milestones, Cost Planning, Budgeting, Material Requirement planning, Project quotation, Time sheets, Goods issues, and other project management activities in SAP PS.

  • Must have completed at least two end-to-end implementations.

  • Experience on complete PS module cycle from project creation to settlement.

  • Integration knowledge with CO, FI and MM, SD and PP.

  • Must be proficient in handling Issues/support functions.

Deforus Technologies Pvt Ltd
Krunal Salvi
Posted by Krunal Salvi
Mumbai
2 - 5 yrs
₹6L - ₹12L / yr
User Interface (UI) Design
User Experience (UX) Design
Mobile App Design
Web design
Adobe XD
+11 more
We are looking for a full-time User Interface and User Experience Designer to join our growing team.

Qualification Required:
  • Proven work experience as a UI/UX Designer.
  • Minimum 3 to 7 years of experience designing interfaces for mobile and web applications.
  • Portfolio of a design project.
  • Up-to-date knowledge of design software like Adobe XD, Illustrator, and Photoshop.
  • Team spirit: strong communication skills to collaborate with various clients.
  • Good time management skills.
  • Analytical thinking and problem-solving skills.

Responsibilities:
  • Executing all visual design stages from concept to final hand-off to the client.
  • Conceptualizing original ideas that bring simplicity with user-friendliness to the complex design.
  • Create wireframes, storyboards, user flows, process flows and site maps to effectively communicate interaction and design ideas.
  • Combine creativity with an awareness of the design elements.
  • Conduct ongoing user research.
  • Don't forget to attach previous work or Behance portfolio link.

Regards,
Deforus Technologies Pvt. Ltd.
Dev Technosys
at Dev Technosys
2 recruiters
Shreya Bhardwaj
Posted by Shreya Bhardwaj
Jaipur
1 - 3 yrs
₹1L - ₹3L / yr
iOS App Development
Objective C
Swift
Cocoa
Xcode
Job Requirements:
1. Must be very strong in Objective-C, Cocoa, Cocoa Touch, and Xcode.
2. Must have built and published commercial iPhone and iPad applications.
3. Experience with Objective-C, JavaScript, and JSON.
4. Experience writing rich GUIs for the iPhone and iPad.
5. Should be able to demonstrate apps that were developed and published on the App Store.
6. Well versed in OOP/Objective-C concepts, web services, and parsing JSON/XML.
7. Expertise in iPhone development, including implementing applications with standard iPhone/iPad UI components and creating custom UI.
Getbasis
at Getbasis
2 recruiters
Mayank Arya
Posted by Mayank Arya
Bengaluru (Bangalore)
1 - 3 yrs
₹6L - ₹14L / yr
Android Development
Kotlin
Java
Model-View-View-Model (MVVM)
RxJava

We are looking for a self-driven, passionate Android developer who can join our mission in building a product that solves real problems in the simplest and most elegant manner possible. We want to make financial services inclusive and accessible to all.

 

Welcome aboard if:

- You have done Android development with Kotlin and some state management frameworks for at least 1 year

- You push the code to Git before you leave the office

- You take pride in building the features and take end-to-end ownership

- You love Agile and believe in writing more than enough tests, code reviews, and iterative development

Agency job
via Vivriti Capital by Rama S
Chennai
3 - 8 yrs
₹7L - ₹25L / yr
Ruby on Rails (ROR)
Python
Java
Data Structures
Algorithms
+1 more

 

Company profile

 

We are the pioneering player in the FinTech industry in India in the institutional credit space. We have created a one-of-a-kind online marketplace for institutional credit, ‘CredAvenue’, bringing together issuers and lenders, while also participating in the marketplace through our own balance sheet. Within a short span of two years, CredAvenue has gained immense traction and boasts multiple clients across sectors, 120+ investors across multiple segments, and a high volume of credit deal closures. To read about the latest numbers we have clocked, please visit http://www.credavenue.com.

 

We are backed by two of the leading global Private Equity firms and have also been successful in attracting high quality talent from some of the leading companies and universities globally. With the fundamentals in place, we are now gearing up for our next phase of high growth, and we are further building up our team to take the company to the next level.

 

Primary Responsibilities

  • Responsible for full software life-cycle, system design and development of front-end & back-end systems
  • Writing high-quality code, participating in code reviews, designing/architecting systems of varying complexity and scope
  • Identify libraries and technologies worth experimentation
  • Build innovative solutions from scratch and liaise with architects and engineers to build solutions
  • Mentoring other team members

Required Skill

  • Degree in Computer Science or relevant experience
  • 2-7 years of relevant hands-on software engineering experience doing software design and development
  • Proven experience working on back-end web frameworks like RoR (preferable), Python/Django, or Node.js
  • Good command of at least one JavaScript framework like React.js, Vue.js, or Angular
  • Excellent understanding of relational database structures; knowledge of unstructured databases (NoSQL) will be an added advantage
  • Expertise in object-oriented design, unit testing, integration testing, data structures, algorithms, scalable APIs, etc.
  • Knowledge of cloud technologies and exposure to AWS services (EC2, RDS, S3, etc.)
  • Ability to work in a fast-paced environment and make pragmatic engineering decisions in a short amount of time
  • Experience with Agile development and Scrum methodologies.

 

 

 

Work Environment Details

  • An opportunity to play a formative role in an ambitious financial services marketplace spanning investment banking, debt capital markets, institutional finance, retail lending and asset management
  • A journey that will challenge and reward you in a manner few others will
Marketonix
at Marketonix
1 recruiter
Manasi j
Posted by Manasi j
Bengaluru (Bangalore)
2 - 5 yrs
₹4L - ₹8L / yr
C#
SQL server
PHP
Python
Ruby on Rails (ROR)
+3 more
Requirement  - Backend Developer
Experience: 2 - 3 Years
Location: Bangalore
Salary: 8 Lakhs
Qualification: Any
Industry: Any
Gender: Any
Skills required: C#, SQL Server, Web API; MySQL knowledge is optional.