Software Engineer Graduate (Applied Machine Learning - Orchestration) - 2026 Start (PhD)

ByteDance

Responsibilities

About the team:

The mission of our Applied Machine Learning (AML) team is to advance the next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, and live and e-commerce ranking at our company. We also drive substantial impact on the company's core businesses. We are currently looking for new graduate Software Engineers to join our team to support and advance that mission.

Responsibilities:

  • Responsible for the design and implementation of a global-scale machine learning system for feeds, ads and search ranking models.
  • Responsible for improving the machine learning infrastructure's usability, flexibility, and efficiency.
  • Responsible for improving the workflow of model training and serving, data pipelines and resource management for multi-tenancy machine learning systems.
  • Responsible for designing and developing key components of ML infrastructure.

Qualifications

Minimum Qualifications:

  • Currently pursuing a PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline.
  • Proficient in C/C++/Python/Golang, with solid programming skills (e.g., algorithms and data structures).
  • Familiar with deep learning frameworks (TensorFlow/PyTorch).
  • Ability to work independently and complete projects from beginning to end in a timely manner.
  • Good communication and teamwork skills, with the ability to explain technical concepts clearly to teammates.

Preferred Qualifications:

  • Experience contributing to an open-source machine learning framework (TensorFlow/PyTorch); experience improving core machine learning infrastructure.
  • Experience with big data orchestration frameworks (e.g., K8s/Spark/Hadoop/Flink); experience in resource management and task scheduling for large-scale distributed systems; experience building solutions with AWS, GCP, Azure, OCI, AliCloud, or other cloud services. [Scheduling]
  • Strong background in one of the following fields: Hardware-Software Co-Design, High Performance Computing, ML Hardware Acceleration (e.g., GPU/TPU/RDMA) or ML for Systems.
  • Experience developing and deploying large-scale systems (e.g., monitoring, analysis, troubleshooting, and notification systems); strong understanding of code optimization, routine task automation, and failure self-healing; familiarity with IaC technologies such as Terraform/Ansible. [EfficiencyTool]