Security Data Engineer - USDS

TikTok

Responsibilities

We are seeking a versatile and results-driven Data Engineer to join the USDS Cyber Defense & Engineering team at TikTok. In this role, you’ll collaborate with cross-functional teams across regions to develop innovative solutions that strengthen the security and privacy of our users, helping make TikTok the most trusted platform. The ideal candidate is a strong problem solver with proven technical expertise, business acumen, and a passion for cybersecurity. You thrive in dynamic environments and have a track record of delivering impactful, cross-team projects.

To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities:
- Identify critical analytical problems and apply an engineering mindset to find innovative solutions with data.
- Develop expertise in the USDS data to build and own relevant pipelines/ETLs.
- Own data integrity, availability, transformation logic, and efficient data access to support the growing needs of the organization.
- Identify gaps in existing data, create data product specs, and implement data tracking.
- Incorporate automation wherever possible to improve data pipelines and analyses.
- Build and maintain documentation to ensure data accessibility.
- Be our go-to data expert and have a deep understanding of our data warehouse and data processing layers.
- Be a primary point of contact for ad hoc data requests and reporting.

Qualifications

Minimum Qualifications:
- Bachelor's degree in Statistics, Economics, Computer Science, or another quantitative field.
- 5+ years of experience working with data analytics and data engineering, including experience with data cleaning and preprocessing, data analysis, and dashboard development.
- Proficiency in distributed data processing using Big Data technologies like Spark/Scala, Hadoop/HDFS/AWS/S3, Cassandra, and Kafka.
- Proficiency in ETL, data warehousing, data modeling, data design, SQL, and NoSQL databases.
- Proactive, self-driven, and impact-driven.
- Comfortable with ambiguity; able to thrive with minimal oversight and process.

Preferred Qualifications:
- Proficiency in Python for Data Science: Strong command of Python libraries essential for data manipulation, analysis, and feature engineering, including Pandas and NumPy.
- Familiarity with Machine Learning Concepts & Libraries: Understanding of core machine learning concepts (e.g., supervised and unsupervised learning, model evaluation) and practical experience with libraries like Scikit-learn for building and integrating basic ML workflows into data pipelines.
- Exposure to Deep Learning Frameworks (e.g., PyTorch, TensorFlow): While not necessarily building models from scratch, an understanding of how these frameworks consume and process data is highly beneficial for designing data pipelines that feed into deep learning models (e.g., for anomaly detection, threat intelligence).
- Experience with MLOps Principles & Tools: Familiarity with concepts like feature stores, model versioning, and continuous integration/delivery for ML pipelines. Experience with tools like MLflow, Kubeflow, or similar platforms for managing the ML lifecycle is a strong plus.
- Data Preparation for AI/ML: Proven ability to prepare, clean, and transform large-scale datasets specifically for machine learning model training and inference, ensuring data quality and consistency.

Confirmed 12 hours ago. Posted 5 days ago.
