NVIDIA Dynamo is an innovative, open-source platform for efficient, scalable inference of large language and reasoning models in distributed GPU environments. By applying advanced techniques in serving architecture, GPU resource management, and intelligent request handling, Dynamo delivers high-performance AI inference for demanding applications. Our team is tackling some of the most challenging problems in distributed AI infrastructure, and we’re looking for engineers who are enthusiastic about building the next generation of scalable AI systems.
As an Applied AI Research Software Engineering intern on the Dynamo project, you will work on some of the most sophisticated and high-impact challenges in distributed inference, including the Dynamo Kubernetes (k8s) serving platform, disaggregated serving, dynamic GPU scheduling, intelligent routing, and distributed KV cache management.
What you'll be doing:
What We Need To See:
Ways To Stand Out From The Crowd:
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!
The hourly rate for our interns is 18 USD - 71 USD. Internship hourly rates are standard and are determined by the position and your location, year in school, degree, and experience.
You will also be eligible for Intern benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.