Sr Software Engineer, AI Tools – On-Device Generative AI Model Optimization

Qualcomm

Company:

Qualcomm Technologies, Inc.

Job Area:

Engineering Group, Engineering Group > Machine Learning Engineering

General Summary:

As a leading technology innovator, Qualcomm pushes the boundaries of what's possible to enable next-generation AI experiences and drive agentic transformation, creating a smarter, connected future for all. As a Qualcomm Machine Learning Engineer, you will develop and implement cutting-edge tools and solutions to enable state-of-the-art AI solutions across various technology verticals.

All Qualcomm employees are expected to actively support diversity on their teams, and in the Company.

This role is open to both San Diego, CA and Raleigh, NC and will be onsite full-time.

Minimum Qualifications:

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or related field and 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
  • OR Master's degree in Computer Science, Engineering, Information Systems, or related field and 1+ year of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
  • OR PhD in Computer Science, Engineering, Information Systems, or related field.

What You'll Do

Model Reauthoring & Architecture Adaptation

  • Reauthor generative AI architectures for efficient execution on Qualcomm AI hardware. This covers LLMs (Llama, Phi, Qwen) and multimodal models (vision-language, speech, diffusion), including custom attention, normalization, positional embedding, and modality-specific components.
  • Translate hardware execution constraints — operator support, memory layout, dispatch behavior — into model-level transformations. These transformations need to preserve accuracy while enabling efficient on-device execution.
  • Build clean extension points so internal teams and external contributors can onboard new architectures without changing core pipeline code.

Inference Optimization for Edge Hardware

  • Integrate inference acceleration techniques into the model preparation pipeline. This includes memory-efficient attention, decode acceleration, and serving-time optimizations.
  • Translate end-customer deployment constraints — target SoC, context length, latency budget, memory envelope — into concrete model preparation strategies.

Custom Model & OEM Enablement

  • Work with research teams to develop reauthoring strategies for custom OEM models and customer-specific use cases. Take research prototypes and turn them into production deployments.

Cross-Functional Collaboration

  • Partner with compiler teams to understand on-target constraints. Decide on the right response: a graph-level optimization or model-level reauthoring.
  • Partner with quantization engineers so architectural decisions compose cleanly with the quantization stack.

Pipeline & Tooling

  • Contribute reauthoring and adaptation stages to a multi-stage model preparation pipeline. Build developer-facing diagnostics that give clear, actionable feedback when models fail to lower or run efficiently.

Minimum Qualifications

  • Bachelor's degree in Computer Science, Engineering, or related field and 4+ years of Software Engineering, ML Engineering, or related experience.
  • OR Master's degree in Computer Science, Engineering, or related field and 3+ years of relevant experience.
  • OR PhD in Computer Science, Engineering, or related field and 2+ years of relevant experience.
  • 2+ years in ML systems, model optimization, or inference engineering. Proficient in Python in large, typed codebases.
  • Strong written and verbal communication. Comfortable operating across compiler, research, and partner-facing teams.

Preferred Qualifications

  • Deep implementation-level knowledge of generative AI architectures across LLMs and multimodal models.
  • Demonstrated experience optimizing inference for edge or resource-constrained deployments, with measurable latency or memory wins to point to.
  • Strong PyTorch internals knowledge — module customization, export flows, tracing. Familiarity with the HuggingFace transformers ecosystem.
  • Familiarity with on-device runtimes and SoC-level constraints (memory bandwidth, compute precision, NPU/DSP execution). Exposure to QAIRT/QNN, ONNXRuntime, LiteRT-LLM or similar is a plus.
  • Working understanding of how quantization interacts with model architecture decisions, even if you're not a quantization specialist.
  • Experience using agentic coding tools such as GitHub Copilot, Cursor, Claude Code, Codeium, or similar AI-assisted development tools to improve coding productivity and problem-solving.

Level of Responsibility

  • Works independently on open-ended optimization challenges. Provides technical guidance and mentorship to teammates.
  • Decisions have broad impact on model accuracy, on-device performance, and the developer experience of teams using the preparation pipeline.
  • Communicates complex model architecture and inference optimization concepts to a range of audiences: hardware engineers, research scientists, compiler engineers, OEM partners, and external developers.
  • Has meaningful influence on the generative AI optimization roadmap, supported model strategy, and cross-team integration priorities.

Qualcomm is an equal opportunity employer. If you are an individual with a disability and need an accommodation during the application/hiring process, rest assured that Qualcomm is committed to providing an accessible process. You may e-mail disability-accomodations@qualcomm.com or call Qualcomm's toll-free number found here. Upon request, Qualcomm will provide reasonable accommodations to support individuals with disabilities to be able to participate in the hiring process. Qualcomm is also committed to making our workplace accessible for individuals with disabilities. (Keep in mind that this email address is used to provide reasonable accommodations for individuals with disabilities. We will not respond here to requests for updates on applications or resume inquiries.)

To all Staffing and Recruiting Agencies: Our Careers Site is only for individuals seeking a job at Qualcomm. Staffing and recruiting agencies and individuals being represented by an agency are not authorized to use this site or to submit profiles, applications or resumes, and any such submissions will be considered unsolicited. Qualcomm does not accept unsolicited resumes or applications from agencies. Please do not forward resumes to our jobs alias, Qualcomm employees or any other company location. Qualcomm is not responsible for any fees related to unsolicited resumes/applications.

EEO Employer: Qualcomm is an equal opportunity employer; all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or any other protected classification.

Qualcomm expects its employees to abide by all applicable policies and procedures, including but not limited to security and other requirements regarding protection of Company confidential information and other confidential and/or proprietary information, to the extent those requirements are permissible under applicable law.

Pay range and Other Compensation & Benefits:

$140,800.00 - $211,200.00

The above pay scale reflects the broad, minimum to maximum, pay scale for this job code for the location for which it has been posted. Even more importantly, please note that salary is only one component of total compensation at Qualcomm. We also offer a competitive annual discretionary bonus program and opportunity for annual RSU grants (employees on sales-incentive plans are not eligible for our annual bonus). In addition, our highly competitive benefits package is designed to support your success at work, at home, and at play. Your recruiter will be happy to discuss all that Qualcomm has to offer – and you can review more details about our US benefits at this link.

If you would like more information about this role, please contact Qualcomm Careers.

Confirmed 21 hours ago. Posted 4 days ago.
