Copart is seeking a Site Reliability Engineer for our Dallas HQ office, specializing in systems and application monitoring and troubleshooting. This position will be part of a 24/7 Global Network Operations team that monitors and provides L1/L2 support to meet the SLA commitments of Copart's Global Data Center and Application infrastructure.

Ideal Candidate:

Team Player -- A candidate who works well in a collaborative team environment. Effective communication skills and a great personality are a must.

Talented -- Your skill set extends beyond core knowledge of Windows, UNIX, or Linux platforms. In addition to being knowledgeable in core systems, experience with VM environments, networking, scripting, automation, and Kubernetes, along with awareness of other technologies, is a plus.

Innovative -- We are always looking for ways to improve our processes and procedures. The ideal candidate should have a natural desire to make things better and not be afraid to speak up when an opportunity for improvement arises.

Essential Duties and Responsibilities:

Perform application deployments using Jenkins and Spinnaker on Prod and Non-Prod Environments.
Coordinate and perform periodic failover testing of Copart's Network/Systems Infrastructure and application environments.
Build/Optimize tools with Python, Ansible and Grafana to monitor/collect key metrics and automate remediation of Infrastructure or application issues.
Perform monthly security patching of Systems OS and applications.
Maintain and optimize the following tools and repositories: Nagios, Netbox, Prometheus, Grafana, Sumo Logic, Selenium, Instana, GitHub, and more.
Interface with internal teams (Product development, DevOps, Network, Systems and DB)
Use internal monitoring tools to analyze and proactively monitor Copart's Global Data Center and Application infrastructure, detecting and quickly resolving issues before they escalate.
Quickly and efficiently communicate issues across several of Copart's domains.
Develop analysis and reporting capabilities; monitor performance and quality control plans to identify improvements.
Document standard operating procedures, diagrams, and training materials for use by the teams.

Requirements:

Progressive knowledge of monitoring protocols such as SNMP, NetFlow, and Syslog.
Intermediate programming and scripting knowledge
Knowledge of different monitoring methodologies, e.g. agent-based and agentless checks.
Troubleshooting knowledge with Linux/Unix/Windows based systems
Experience working with VM management software such as vSphere
Knowledge of monitoring tools such as Nagios, SolarWinds, and Site24x7
Be flexible and be able to handle competing/changing priorities.
Very strong oral and written communication skills
Must be a self-starter with the ability to work well in a team environment
Flexible schedule required
Knowledge of the following areas is a BIG plus:
Dashboard applications such as Grafana
Scripting/Programming/Automation -- Python, Bash, Ansible, StackStorm
Experience working with GitHub, Jenkins, Spinnaker, Docker, Kubernetes
Web development languages, libraries, and frameworks such as Java, JavaScript, AngularJS, and Flask.

Confirmed 7 hours ago. Posted 30+ days ago.

Site Reliability Engineer

Copart