AI Container Platform Engineer

Intel

Job Details:

Job Description: 

Intel is hiring a senior AI Container Platform Engineer to develop and optimize its AI infrastructure. The role involves collaborating with other team members to design and build a highly resilient Kubernetes container platform optimized for AI workloads on Intel accelerator hardware. Successful candidates will have a strong desire to tackle challenges, attention to detail, and a focus on building a highly reliable and scalable container platform. A foundational understanding of software development principles and testing methodologies, along with a high degree of independence, adaptability, drive, and willingness to accept new challenges, is a must. Candidates should also have demonstrated software development skills in a range of languages and strong Linux systems expertise.

Responsibilities:

  • Work with others to design, build, and maintain a Kubernetes container platform on bare-metal, on-premises hardware.
  • Participate in building advanced tooling for deployment, testing, monitoring, logging, administration, auditing, and operations of multiple Kubernetes clusters in distributed data centers.
  • Research and implement solutions related to Kubernetes container RBAC, networking, storage, scheduling, registries, certificate management, and more to build a highly reliable, scalable, secure, and resource-optimized AI container platform.
  • Evaluate and select third-party commercial and open-source components for the AI container platform.

Qualifications:

You must possess the requirements below to be initially considered for this position. Preferred qualifications are in addition to the requirements and are considered a plus factor in identifying top candidates. The experience listed below may be obtained through a combination of schoolwork, classes, research, and relevant previous job or internship experience.

Minimum Qualifications:

The candidate must possess a Bachelor’s degree or Master’s degree in Computer Engineering, Computer Science, Information Systems, or a related field with 8+ years of relevant work experience.

5+ years of experience in the areas below:

  • Python, Golang or another modern programming language
  • Linux-based operating systems such as CentOS, Ubuntu, SUSE, or Rocky
  • Bash shell scripting and Linux command-line acumen

2+ years of experience in the areas below:

  • Working on a software engineering team in a cloud or on-premises data center environment supporting critical services
  • Linux containers and container runtimes (Docker, containerd, CRI-O)
  • Kubernetes
  • IP networking, load balancing, DNS
  • Pod scheduling and node topology management
  • Environment as Code via configuration management tools such as Ansible, Terraform, Salt, Chef, or Puppet
  • Container Network Interface (CNI), Container Storage Interface (CSI), and Kubernetes schedulers
  • Istio and/or other service meshes
  • AI/ML workloads
  • Performance benchmarking
  • Hardware accelerators and specialized devices (GPU, HPU, HPC)
  • Git development workflow
  • Kubespray, kOps, or kubeadm

Preferred Qualifications:

  • Slurm, Volcano, MPI, PyTorch, TensorFlow or other schedulers and AI domain frameworks
  • On-premises data center networking
  • Cloud development or architecture (AWS, GCP, Azure, etc.)
  • Secret vault integration with Kubernetes
  • Identity provider configuration with SSO
  • Ability to communicate detailed technical concepts in a clear and concise manner.

Job Type:

Experienced Hire

Shift:

Shift 1 (United States of America)

Primary Location: 

US, Texas, Austin

Additional Locations:

US, Arizona, Phoenix; US, California, Folsom; US, California, San Diego; US, California, Santa Clara; US, Oregon, Hillsboro

Business group:

The Data Center & Artificial Intelligence Group (DCAI) is at the heart of Intel’s transformation from a PC company to a company that runs the cloud and billions of smart, connected computing devices. The data center is the underpinning for every data-driven service, from artificial intelligence to 5G to high-performance computing, and DCAI delivers the products and technologies (spanning software, processors, storage, I/O, and networking solutions) that fuel cloud, communications, enterprise, and government data centers around the world.

Posting Statement:

All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

Position of Trust

N/A

Benefits:

We offer a total compensation package that ranks among the best in the industry. It consists of competitive pay, stock, and bonuses, as well as benefit programs that include health, retirement, and vacation. Find more information about all of our benefits here: https://www.intel.com/content/www/us/en/jobs/benefits.html

Annual Salary Range for jobs which could be performed in US, California:

$186,552.00-$279,772.00

The salary range depends on a number of factors, including location and experience.

Work Model for this Role

This role will be eligible for our hybrid work model, which allows employees to split their time between working on-site at their assigned Intel site and off-site. In certain circumstances, the work model may change to accommodate business needs.

Confirmed 13 hours ago. Posted 30+ days ago.
