Site Reliability Engineer (SRE) - Senior

ECS

ECS is seeking a Site Reliability Engineer (SRE) – Senior to work in our Arlington, VA office. Please Note: This position is contingent upon contract award.

Program Description

ECS is seeking talented professionals to join our successful and growing team in building the next-generation Threat Intelligence Enterprise Service (TIES) solution. The TIES Program is the Cybersecurity and Infrastructure Security Agency’s (CISA) dynamic approach to fulfilling its federally mandated cyber information sharing responsibilities and ensuring real-time automated threat intelligence reaches key security partners. The TIES product is a suite of multiple Commercial Off-the-Shelf (COTS) products, software configuration packages, and custom code that work together as an integrated solution tailored to meet CISA requirements.

We seek driven professionals who excel in a dynamic, fast-paced, and highly collaborative environment, where critical thinking, problem-solving, and a mission-focused approach are essential. A passion for continuous learning, improvement, and cybersecurity is vital.

As a small team committed to radically improving government, every member directly shapes ECS’s direction and success. We take pride in our stewardship, holding deep responsibility for the solutions we develop. Collaboration is at the heart of our work—both within our team and alongside our federal partners.

Role & Responsibilities:

ECS is seeking a Site Reliability Engineer (SRE) - Senior to play a key role in defining and implementing the SRE requirements for the TIES program to ensure the reliability, availability, and performance of our critical production environments.

The Senior SRE will contribute to a culture of continuous improvement by identifying areas for enhancement and driving initiatives to improve system reliability, scalability, and efficiency.

The successful candidate will have demonstrated hands-on experience designing, implementing, and maintaining solutions to ensure that systems, including infrastructure and applications, are resilient, highly available, and performant. The Senior SRE will also play a critical role in defining and measuring the Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for our solution.

The Senior SRE will be responsible for setting up comprehensive logging, monitoring, and alerting solutions using the Elastic stack and other tools as necessary to ensure the continuous performance of services. Additionally, they will respond to incidents, perform root cause analyses, and implement solutions to prevent recurrence. The Senior SRE will work in close collaboration with other SRE team members, developers, testers, infrastructure engineers, DevOps engineers, and other stakeholders to integrate reliability and observability into the software development lifecycle.

