Kestra Financial is seeking a dynamic and experienced Senior Manager to lead our vital Advisor Platform Support team, a cornerstone of our Advisor Platform Engineering organization. This pivotal role combines leadership for both our L2 Application Support and Site Reliability Engineering (SRE) teams, dedicated to ensuring the stability, reliability, and peak performance of our mission-critical, advisor-facing technology platform. You will embody Kestra's core values of "Make it Happen," "One Team," and "Serve," by fostering a high-performing, collaborative culture that delivers exceptional support to our advisors, business support teams, and home office staff.

Reporting to the Head of Advisor Platform Engineering and Architecture, you will leverage your hands-on technical expertise and proven leadership capabilities to drive team growth (scaling from 5 to 8 professionals), operational excellence, and continuous improvement. This position is based at our Encino Trace campus in Southwest Austin, TX, with an on-site schedule Monday through Thursday and optional remote work on Fridays. If you have a passion for building robust systems, mentoring talented engineers, and ensuring our Azure-based technology stack empowers our users, we want to hear from you!

ESSENTIAL DUTIES AND RESPONSIBILITIES: To perform this job successfully, this individual must be able to perform each essential duty satisfactorily:

  • Lead, mentor, and develop a high-performing, combined team of L2 Application Support engineers and Site Reliability Engineers, fostering a culture of collaboration, continuous learning, and accountability.
  • Drive the strategy and execution for platform reliability, scalability, and performance, implementing and championing SRE best practices, including incident response, blameless post-mortem analysis, error budgets, and service reliability monitoring.
  • Oversee day-to-day operations of the Advisor Platform Support teams, ensuring timely resolution of incidents, proactive problem management, adherence to service level objectives (SLOs), and exceptional user support.
  • Establish, track, and report on key performance indicators (KPIs) and SLOs for platform reliability, availability, performance, and customer satisfaction, driving continuous improvement initiatives.
  • Champion operational excellence through automation of toil, robust monitoring strategies, proactive problem resolution, and the development of comprehensive runbooks.
  • Collaborate closely with Business Support Teams, Advisors, Home Office Staff, development, product, and infrastructure teams to understand support needs, ensure new features are supportable, and align on strategic objectives.
  • Help lead the on-call rotation and ensure proper escalation procedures are in place for 24/7 incident management; drive efficient incident management processes that resolve platform issues promptly.
  • Partner with development teams to embed operability, reliability, and supportability best practices into the software development lifecycle (SDLC).
  • Manage team capacity, resource allocation, and hiring plans to support business growth and strategic initiatives.
  • Provide regular, constructive feedback, coaching, and career development opportunities to team members, addressing complex team challenges and fostering professional growth.
  • Synthesize functional insights from support operations and reliability engineering to guide team decision-making and contribute to departmental strategy and technology roadmaps.
  • Advocate for Kestra’s core values, cultivating an environment of trust, mutual respect, psychological safety, open communication, and empathetic listening.
  • Ensure compliance with information security, audit, and disaster recovery requirements relevant to a regulated financial services environment.
  • Manage vendor relationships, support contracts, and SRE tooling budgets as needed.
  • Champion Agile development practices (Scrum/Kanban) and promote effective collaboration across teams, ensuring adherence to CI/CD practices using Azure DevOps and GitHub.

KNOWLEDGE, SKILLS, AND/OR ABILITIES: To perform this job successfully, individuals should have the following skills and abilities:

  • Advanced understanding of Microsoft Azure cloud services, including IaaS, PaaS, and SaaS offerings (e.g., Azure App Service, Function Apps, Container Apps, Azure SQL, Cosmos DB, Azure Service Bus, VNets, Azure Monitor/Log Analytics, API Management, Front Door, Application Gateway, CDN).
  • Strong expertise in monitoring, observability, and telemetry tools and practices (e.g., Azure Application Insights, Log Analytics, KQL, alerting dashboards).
  • Proficiency in scripting and automation languages (e.g., PowerShell, Python, Bash).
  • Deep knowledge of Site Reliability Engineering (SRE) principles and practices (e.g., SLIs/SLOs, error budgets, blameless post-mortems, automation, proactive monitoring).
  • Experience with CI/CD pipelines, Git-based source control (Azure DevOps, GitHub), infrastructure-as-code (IaC) tools, and automated deployment strategies.
  • Strong understanding of application performance monitoring (APM) and troubleshooting techniques.
  • Proven experience with incident management, root cause analysis, and post-mortem processes.
  • Excellent leadership and team development capabilities, with a passion for mentoring and growing technical talent.
  • Exceptional problem-solving, analytical, and troubleshooting skills, with the ability to guide teams in analyzing root causes and implementing effective solutions.
  • Strong project management, prioritization, and organizational skills, with the ability to manage multiple tasks autonomously in a dynamic environment.
  • Excellent communication (written and verbal) and interpersonal skills, with the ability to translate technical concepts for non-technical stakeholders and adapt communication style effectively.
  • Experience with Agile development methodologies (Scrum/Kanban) and working collaboratively in an Agile environment.
  • Knowledge of backend technologies such as C# and .NET (including .NET Framework 4.x and .NET 8+).
  • Understanding of frontend technologies like TypeScript and React. (Desired)
  • Experience with database technologies (e.g., MS SQL Server, Cosmos DB) and data engineering concepts. (Desired)
  • Familiarity with ITIL or similar service management frameworks, particularly incident, problem, and change management. (Desired)
  • Experience with third-party software integration via APIs, SSO, and File Transfer. (Desired)
  • Working knowledge of security best practices and SSO/OAuth flows. (Desired)

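SUPERVISORY RESPONSIBILITIES: Yes. This role leads the combined L2 Application Support and Site Reliability Engineering team (scaling from 5 to 8 professionals).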

EDUCATION AND/OR EXPERIENCE:

  • Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related technical field. (Master’s degree is a plus)
  • 7-10 years of progressive experience in IT operations, platform support, software engineering, or site reliability engineering.
  • Minimum 3-5 years of management experience leading and developing multi-disciplinary technical teams (e.g., Application Support, SRE, Operations).
  • Proven track record of operating and scaling high-availability, customer-facing SaaS platforms, preferably on Microsoft Azure.
  • Demonstrated success implementing SRE principles and Agile/DevOps practices and improving operational processes within technical teams.
  • Experience managing both application support and infrastructure/SRE functions is highly desirable.
  • Experience with Azure DevOps, Git version control, and CI/CD pipelines.
  • Background in financial services or other highly regulated industries. (Desired)
  • Experience with budgeting, vendor management, and capacity planning for cloud operations. (Desired)

CERTIFICATIONS, LICENSES, REGISTRATIONS:

  • Microsoft Azure certifications (e.g., Azure Administrator Associate, Azure DevOps Engineer Expert, Azure Solutions Architect Expert). (Desired)
  • ITIL Foundation or higher certification. (Desired)
  • Site Reliability Engineering (SRE) related certifications. (Desired)
  • Agile/Scrum certifications (e.g., Certified ScrumMaster (CSM), SAFe Agilist). (Desired)
  • Project Management certification (e.g., PMP). (Desired)

PHYSICAL DEMANDS: The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodation may be made to enable individuals with disabilities to perform the essential functions.

  • Ability to sit or stand at a computer workstation for extended periods while using a keyboard, mouse, and multiple monitors.
  • Frequent, repetitive hand-finger motions for typing, writing, and handling small peripherals or cables.
  • Near-vision sufficient to read electronic documents, review code, and distinguish basic on-screen colors (e.g., for UI verification).
  • Clear spoken communication and active listening for in-person and virtual meetings, incident bridges, and phone calls.
  • Ability to walk short distances, navigate a standard office environment, climb one flight of stairs, and stand during whiteboarding or presentations. Sit-stand desks and other ergonomic furniture are available upon request.
  • Ability to lift and move equipment or boxed materials weighing up to 20 lbs (e.g., laptops, small servers, office supplies).
  • Hybrid roles: Primary work performed on-site at the Encino Trace campus (Southwest Austin, TX) Monday–Thursday; optional remote work on Fridays.
  • Fully remote roles: Primary work performed from the employee’s home office within approved locations; reliable high-speed internet and an ergonomically safe workspace are required.
  • Participation in overnight or weekend on-call rotations and critical production releases may require work outside standard business hours.
Confirmed 22 hours ago. Posted 28 days ago.
