Team Introduction:
The TikTok Data Ecosystem Team plays a critical role in supporting TikTok’s personalized recommendation system, which serves over 1 billion users. We are responsible for building scalable, reliable, and high-performance infrastructure for storing and serving machine learning features, especially user behavior sequences and contextual embeddings used in large-scale recommendation and pretraining models. Our work sits at the intersection of systems and machine learning: ensuring training-serving consistency, low-latency access to temporal features, and scalable ingestion pipelines across online and offline environments. We explore and integrate with various underlying storage engines, including RocksDB, HBase, and time-series databases, depending on the access pattern, feature type, and serving latency required by ML models.

Responsibilities:
- Build and optimize the core infrastructure of TikTok’s feature store, powering both training data pipelines and real-time inference systems.
- Design efficient storage strategies for user behavior sequences, long-range contextual features, and sparse embeddings, ensuring freshness, consistency, and high availability.
- Work with underlying storage engines such as RocksDB, HBase, and time-series databases to support feature retention, versioning, compaction, and fast lookup.
- Collaborate with recommendation algorithm teams to design schemas and access patterns tailored to evolving model needs.
- Integrate online and offline data pipelines to reduce training-serving skew and support continuous training and A/B testing scenarios.
- Investigate techniques such as temporal sampling, embedding quantization, caching, and hybrid tiered storage to improve cost-efficiency and latency.

Minimum Qualifications:
- Currently pursuing a Bachelor’s degree or above in Computer Science, Software Engineering, or a related technical field.
- Solid foundation in distributed systems, data storage, and stream/batch processing architectures.
- Experience in programming with Java, C++, or Python.
- Understanding of key-value stores, LSM-tree architectures, or time-series databases at a system level.
- Eagerness to work on ambiguous, real-world infrastructure problems that impact ML product outcomes.

Preferred Qualifications:
- Graduating in December 2025 or later with intent to return to your program.
- Experience working with RocksDB, HBase, or time-series storage engines like IoTDB, OpenTSDB, or custom LSM-tree variants.
- Familiarity with feature store design, feature lifecycle management, and streaming ingestion pipelines.
- Understanding of recommendation system workflows, such as two-tower models, real-time CTR prediction, or user intent modeling.
- Contributions to open-source storage/ML infra projects or participation in ML system hackathons.