About the Role

We are looking for an experienced AI/ML technical expert to join our team, focused on the application of multimodal LLMs, unsupervised learning, and clustering algorithms. The ideal candidate will work closely with product, operations, and engineering teams to apply advanced natural language processing, computer vision, and deep learning technologies to solve business challenges and extract actionable insights.

The success of TikTok's data-driven business model relies heavily on the continuous supply of high-quality labeled data, which will grow exponentially as our business scales. However, the current cost of data labeling remains a significant challenge. To address this, the Data Solutions team is designed to understand and process data at scale for all TikTok business needs. Our team uses both quantitative and qualitative data to uncover insights and turn these findings into real products that power TikTok’s exponential growth. Our responsibilities span infrastructure development, content understanding capabilities, and global labeling delivery management.

Responsibilities
- Apply multimodal large language models, natural language processing, and computer vision techniques to design and develop data products, extract insights, and optimize business strategies.
- Develop cutting-edge algorithms and prototypes to solve business problems using the latest advancements in deep learning, machine learning, statistics, and optimization.
- Leverage unsupervised learning and clustering algorithms to identify patterns, trends, and opportunities in large datasets, and propose data-driven business solutions.
- Collaborate with product managers and cross-functional teams to define user stories and success metrics, and manage data projects from ideation to implementation.
- Partner with engineering teams to deploy and scale data models, ensuring smooth integration and performance.
Minimum Qualifications
- Strong background in computer science and a deep understanding of the mathematical fundamentals of statistics, machine learning, and analytics.
- At least 3 years of experience in software development or model/data development, with hands-on experience in applying LLM technologies (e.g., Agent-based LLM, Prompt Tuning, Chain of Thought, Retrieval Augmented Generation, Supervised Fine-Tuning, RLHF) to solve complex business problems.
- Solid experience with unsupervised learning and clustering algorithms, and the ability to extract insights from large datasets, recognize patterns, and develop data models.
- Proficiency in Python and SQL, with experience in ML/DL frameworks like TensorFlow and PyTorch.

Preferred Qualifications
- Expertise in SQL, Hive, Presto, or Spark, with experience handling large-scale datasets.
- A strong understanding of building data pipelines, model development, testing, and deployment.
- Experience in labeling data products and working with multimodal content labels is a plus.
- Strong communication skills in English, with the ability to clearly convey technical concepts to both technical and non-technical stakeholders.
- Intellectual curiosity, strong problem-solving abilities, and excellent quantitative analysis skills with the capacity to deconstruct problems, identify root causes, and propose actionable solutions.