Team Intro

This is a Site Reliability Engineer role, focusing on the data pipeline reliability for the Video Platform team in USDS. Data SREs monitor data and keep production batch and realtime processing jobs up and running with the highest level of availability, ensuring our users have the freshest, complete and correct data possible.

In order to enhance collaboration and cross-functional partnerships, among other things, at this time, our organization follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities

- Manage day-to-day operations of data service, realtime/batch data pipelines, such as Service Level Agreement management, pipeline deployment, performance tuning and troubleshooting
- Proactively monitor and troubleshoot data pipelines and systems for performance issues, errors, or anomalies
- Create tools, build alarms and dashboards, drive internal process improvements, and automation to monitor and improve data engineering operations
- Improve systems reliability, efficiency, and velocity through scaling, optimization of both resources and data processing workflows, potentially refactoring code or implementing new solutions
- Develop and deploy new reliable and scalable data pipelines and infrastructure components as required by business needs
- Work closely with data engineering and various vertical teams within the Video Architecture platform

Minimum Qualifications

- Bachelor's in Computer Science or a related technical background involving software/system engineering, or equivalent working experience
- Good programming experience with SQL and at least one of the following languages: Java, Python, Go, or Scala
- Experience in data engineering, with a focus on data systems reliability, scalability, performance and capacity management
- Solid experience with big data technologies (e.g., Hadoop, Spark, Flink, YARN) and databases (SQL, NoSQL)
- Knowledge of data pipeline and workflow management tools (e.g., Airflow, Luigi)
- Experience in building data solutions with AWS, Google, Azure and other cloud services is a plus

Preferred Qualifications

- Demonstrated independent thinking capabilities and troubleshooting skills in large scale distributed systems
- Good communication and coordination skills