Software Engineer / Researcher, AI-Native Database Systems

ByteDance

Responsibilities

About the Team

Join ByteDance’s database R&D team, where you’ll build and own cutting-edge database products supporting ByteDance’s global infrastructure. Our diverse portfolio includes relational databases, distributed caches, key-value stores, document databases, graph databases, wide-column stores, search engines, and multi-model databases. In this role, you’ll have the opportunity to enhance these services in a cloud-native environment, embracing a culture of intellectual curiosity, self-direction, and problem-solving.

About the Role

We are building the next-generation AI-native database systems—intelligent, multimodal, and designed for the era of large models. Our systems are not just data stores; they’re reasoning engines, retrieval platforms, and real-time memory for AI agents. As a Senior Software Engineer or Researcher, you will be at the forefront of rethinking how databases work when built from the ground up for AI workloads. You’ll help create infrastructure that powers intelligent systems across TikTok, CapCut, and future applications that haven’t been imagined yet.

Responsibilities

  • Architect and implement AI-native databases that seamlessly integrate structured, unstructured, and vectorized data.
  • Design storage engines optimized for embedding ingestion, multimodal retrieval, and real-time AI interaction.
  • Build scalable and distributed vector search systems with low-latency guarantees.
  • Develop AI-augmented query processors that leverage large language models (LLMs) for semantic parsing, intent understanding, and cost estimation.
  • Collaborate on developing retrieval-augmented generation (RAG) infrastructure and LLM agent memory backends.
  • Drive innovations in learned index structures, self-optimizing databases, and AI-integrated transaction systems.
  • Publish and contribute to broader research and open-source communities.

Qualifications

Minimum Qualifications

  • Bachelor’s, Master’s, or Ph.D. in Computer Science or related fields with strong systems or AI research experience.
  • 2+ years of experience in core database systems, large-scale distributed infrastructure, or machine learning systems.
  • Strong coding and system-level design skills in C++ / Rust / Go.
  • Deep expertise in one or more of the following areas:
      • Storage engine architecture (LSM-trees, column stores, HTAP systems)
      • Vector retrieval systems, similarity search, and ANN indexing
      • AI infra or model-serving infrastructure (especially for embeddings / RAG / LLMs)
      • Semantic search, agent systems, or AI-native memory frameworks
  • Ability to collaborate across research, engineering, and product teams to translate ideas into production systems.

Preferred Qualifications

  • Experience with open-source systems such as Faiss, Milvus, DuckDB, ClickHouse, TiKV, and RocksDB.
  • Publications at top-tier conferences (e.g., SIGMOD, VLDB, NeurIPS, MLSys, ICDE).
  • Familiarity with the database and AI integration strategies of GCP, AWS, or Azure.
  • Prior contributions to RAG, memory-augmented models, or self-tuning database components.
Confirmed 8 hours ago. Posted 30+ days ago.
