Deep Learning Performance Architect Intern - 2025

NVIDIA

NVIDIA is developing processor and system architectures that accelerate deep learning and high-performance computing applications. We are looking for a talented deep learning performance architect to join our AI performance modeling, analysis, and optimization efforts. In this position, you will work on DL performance analysis and optimization on state-of-the-art hardware architectures for a range of LLM and multimodal workloads. You will contribute to a dynamic, technology-focused company.

What you'll be doing:

  • Analyze state-of-the-art DL networks (LLMs, VLAs, multimodal models, etc.), and identify and prototype performance opportunities to influence the software and architecture teams for NVIDIA's current and next-gen inference products.
  • Develop analytical models for state-of-the-art deep learning networks and algorithms to drive innovation in processor and system architecture design for performance and efficiency.
  • Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uniprocessor and multiprocessor configurations.
  • Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.

What we need to see:

  • Pursuing a BS or higher degree in a relevant technical field (CS, EE, CE, Math, etc.).
  • Strong programming skills in Python, C, and C++.
  • Strong background in computer architecture.

Ways to stand out from the crowd:

  • Experience with GPU computing and parallel programming models such as CUDA and OpenCL.
  • Experience with workload analysis on other deep learning accelerators.
  • Background in deep neural network training, inference, and optimization in leading frameworks (e.g. PyTorch, TensorFlow, TensorRT).

#deeplearning

Confirmed 5 hours ago. Posted 13 days ago.