Machine Learning Engineer - Model Training Infrastructure

ByteDance

Responsibilities

The mission of our AML team is to advance the next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, and live & ecom ranking across the company. We also drive substantial impact on the company's core businesses. We are currently looking for a Machine Learning Engineer in Model Training Infrastructure to join our team and help support and advance that mission.

Responsibilities:

  • Responsible for the design and implementation of a global-scale machine learning system for feeds, ads and search ranking models.
  • Responsible for improving the usability and flexibility of the machine learning infrastructure.
  • Responsible for improving the workflow of model training and serving, data pipelines, storage systems, and resource management for multi-tenant machine learning systems.
  • Responsible for designing and developing key components of ML infrastructure and mentoring interns.

Qualifications

Minimum Qualifications

  • At least 5 years of experience in developing and deploying large-scale systems.
  • Proficient in C/C++/CUDA/Python, with solid programming skills.
  • Familiar with deep learning frameworks (TensorFlow/PyTorch).
  • Experience improving core machine learning infrastructure (TensorFlow, PyTorch, and JAX).

Preferred Qualifications

  • Experience contributing to an open-source machine learning framework (TensorFlow/PyTorch).
  • Experience in using/designing open-source machine learning lifecycle management systems: TFX
Confirmed 13 hours ago. Posted 30+ days ago.
