Big Data Engineer (Libra) - Data Platform

ByteDance


Responsibilities

About the team

Libra is a large-scale, one-stop online A/B testing platform developed by the Data Platform team. Some of its features include:

  • Provides experiment evaluation services for all product lines within the company, covering complex scenarios such as recommendation, algorithms, features, UI, marketing, advertising, operations, social isolation, and causal inference.
  • Provides services across the entire experiment lifecycle, from experiment design and creation through metric computation and statistical analysis to final evaluation and launch.
  • Supports the entire company's businesses in rapid, iterative trial and error: hypothesize boldly, verify carefully.

Responsibilities

  • Operate and maintain the data systems of the experimentation platform.
  • Build PB-scale data warehouses; participate in and take ownership of data warehouse design, modeling, and development.
  • Build ETL data pipelines and the systems that automate them.
  • Build an expert system for metric data processing that combines offline and real-time processing.

Qualifications

Minimum Qualifications

  • Bachelor's degree in Computer Science, a related technical field involving software or systems engineering, or equivalent practical experience.
  • Proficiency with big data frameworks such as Presto, Hive, Spark, Flink, ClickHouse, and Hadoop, and experience with large-scale data processing.
  • Minimum 1 year of experience in Data Engineering.
  • Experience writing code in Java, Scala, SQL, Python or a similar language.
  • Experience with data warehouse implementation methodologies and applying them in real business scenarios.

Preferred Qualifications

  • Knowledge of a variety of strategies for ingesting, modeling, processing, and persisting data, including ETL design, job scheduling, and dimensional modeling.
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems is a plus (Hadoop, M/R, Hive, Spark, Presto, Flume, Kafka, ClickHouse, Flink or comparable solutions).
  • Work or internship experience at internet companies; candidates with big data processing experience are preferred.