Site Reliability Engineer


The Site Reliability Engineer will work full-time supporting the MINDBODY software wherever it relates to operational tasks and needs. This team participates in the 24x7x365 on-call schedule, working with the Network Operations Center and Production Engineering teams. Core tasks include resolving production environment issues, creating tools for the Network Operations Center, and actively analyzing and improving error logs.

THE BASICS

- Previous knowledge and hands-on experience in a Site Reliability environment, including demonstrated knowledge of well-known DevOps technologies/services
- Experience administering Microsoft application servers and web servers
- Understanding of networking, including load balancing
- Solid background in C# (MVC), SQL, ASP.NET (MVC), .NET Web Services, ADO/ADO.NET, IIS, WCF, PowerShell
- Experience with release management tools, including TFS/Visual Studio
- Experience working with an advanced APM, building dashboards that help quickly identify root causes of production issues, and creating reports that show performance and scale trends
- Curiosity to keep evaluating new technologies to strengthen our Site Reliability platform
- Ability to work effectively within a team and with cross-functional technical, operational, and business teams
- Consistently demonstrated ability and commitment to deliver major initiatives from beginning to full deployment in a timely manner and to solve time-critical site issues within defined SLAs
- An eagerness to learn enough about whatever tools or tricks are needed to get the job done

ABOVE & BEYOND

- PowerShell automation experience with JAMS Scheduler
- Hands-on experience with F5 LTM management via API and iRule creation
- Strong knowledge of Microsoft System Center Operations Manager (SCOM)
- Strong knowledge of System Center Configuration Manager (SCCM) and SC Orchestrator
- Related certifications with Microsoft (MCSA, MCSE, MCSD) and F5 (F5-CTS)

THE JOB

This is a representative list of the general duties the position may be asked to perform and is not intended to be all-inclusive.

- Work with the Database Operations/Engineering and System Engineering teams to troubleshoot operational issues from a code point of view
- Diagnose code issues in a robust SaaS production environment
- Find correlations between recent code deployments and production issues
- Analyze error logs to help prioritize open production issues and continuously improve the error log report
- Tune admin reports so that Network Operations Center monitoring alerts on the most important issues and excludes issues that are not important to the business
- Build tools that help the Network Operations Center team monitor the production environment
- Work with the Production Engineering team to maintain and upgrade the operational environment without outages
- Build tools that help maintain the disaster recovery environment
- Design and support robust build, deployment, and configuration management systems
- Create tools and processes to make the development team more efficient
- Identify and prevent scaling problems before they occur
- Employ preventive measures to maintain high-availability servers
- Be the champion of site security
- Participate in the 24x7x365 on-call schedule

NEXT STEPS

- Make sure the palm trees and 77° days call your name - learn more about San Luis Obispo and the central coast of California before applying
- Final on-site interview and relocation coverage available to those already in the US

Confirmed 30+ days ago. Posted 30+ days ago.
