Tech Lead - Data Reliability Engineering

StoneX

Overview

Job Description

Reports to: Director Data Distribution

Company: StoneX Group Inc.

Position Purpose: As a Technical Lead in Data Reliability Engineering at StoneX, you will guide our data platform and a team of engineers to ensure our datasets are reliable, scalable, and performant. Leveraging your expertise in data engineering, system architecture, and reliability best practices, you will work closely with cross-functional teams to design, implement, and maintain large-scale production systems with high efficiency and availability, using a combination of software and systems engineering practices.

Responsibilities

Primary Accountabilities/Responsibilities:

  • Lead the development and maintenance of automation scripts and tools for deployment, configuration management, and system monitoring.
  • Work closely with data scientists, analysts, and data engineers to understand system requirements and optimize data-related workflows. Establish dataset SLAs, measure them on a day-to-day basis through automation, and work with stakeholders to ensure the data is consumed effectively.
  • Provide technical guidance to the team on troubleshooting issues in collaboration with data stakeholders and consumers.
  • Create and maintain technical and system documentation on how the technology works, using collaboration tools like Confluence. Write clear documentation for new code and systems. Document system designs, presentations, and business requirements for consumption and consideration at the manager level.
  • Establish and maintain monitoring, alerting, and logging systems to proactively identify and address issues affecting data reliability.
  • Apply an SRE/DRE approach: understand dataset SLOs and SLAs, and build tooling to capture metrics and support continuous monitoring.
  • Bring experience building and operating large-scale observability platforms for monitoring and logging, preferably on data platforms.
  • Be familiar with the dimensions of data quality.
  • Conduct root cause analysis for incidents and guide the team in implementing preventative measures.
  • Monitor and evaluate the overall strategic data infrastructure; track system efficiency and reliability; identify and recommend efficiency improvements; and mitigate operational vulnerabilities.
  • Implement monitoring and alerting systems to proactively identify and address issues in the data delivery infrastructure.
  • Continuously optimize data delivery processes for speed, accuracy, and resource efficiency.
  • Respond to and resolve emergent service problems. Design solutions that use automation and self-repair rather than relying on alarms and human intervention.
  • Participate in on-call rotations, guiding the team in responding to and resolving data-related incidents promptly.
  • Document incident response procedures and lead post-mortem analysis.
  • Manage a small team of 2-6 data engineers.

Qualifications

Job Requirements:

  • 7+ years of experience in Data Architecture, Data Management and/or Production Support across large enterprises.
  • 5 years of hands-on Data Engineering or Software Development experience in the capital markets / trading industry.
  • Strong understanding of enterprise architecture patterns, object-oriented and service-oriented principles, design patterns, and industry best practices.
  • Previous experience managing a small team of 2-6 engineers.
  • Experience facilitating discussions and resolving issues across a diverse set of cross-functional business & IT stakeholders.
  • Certification in relevant technologies (e.g., AWS Certified DevOps Engineer, Google Cloud Professional DevOps Engineer).
  • Understanding of data governance, security, and compliance best practices.
  • Experience in a technical leadership role within a data reliability or data engineering team.
  • Familiarity with continuous integration/continuous deployment (CI/CD) pipelines.
  • Proficiency in Python, Spark, and one object-oriented programming language (C# or Java).
  • Exposure to Docker/containers, microservices, distributed systems architecture, Kubernetes, and cloud computing, preferably Azure.
  • Excellent communication skills and the ability to work with the business to extract critical concepts and transform them into technical task items.
  • Ability to work and lead in an Agile methodology environment

Class: Full-time, exempt

Physical requirements/Working conditions:

  • Climate controlled office environment
  • Minimal physical requirements other than occasional light lifting of boxed materials
  • Dynamic, time-sensitive environment
  • Travel as required