Palantir has flagged the Product Reliability Engineer job as unavailable.

A World-Changing Company

At Palantir, we’re passionate about building software that solves problems. We partner with the most important institutions in the world to transform how they use data and technology. Our software has been used to stop terrorist attacks, discover new medicines, gain an edge in global financial markets, and more. If these types of projects excite you, we'd love for you to join us.

The Role

Palantir systems are deployed at the world’s most critical institutions to help them solve their greatest challenges. Users at customer sites around the world rely on Palantir’s rich feature set, high availability, and performance to pursue their missions. Site Reliability Engineers (SREs) make sure our growing number of customer deployments continue to deliver insights from massive-scale data in real time.

SREs enable our engineers in the field to catch problems before they ever threaten our customers’ workflows. SREs combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job. Our responsibilities include designing systems for new implementations of Palantir, administering co-located servers (including hardware troubleshooting), and maintaining database platforms.

SREs work with our software engineering teams to understand threats to our platform and improve our products' performance and security. We work side by side with Palantir’s implementation teams and our customers' IT departments to understand their businesses' unique problems and to develop innovative solutions. We document our successes and communicate them back to Palantir’s product teams to improve how our hardware, software, and network solutions are deployed, minimizing failure rates and increasing overall system reliability.

Our SRE team is drawn from some of the best in the industry, and we've created a collaborative environment with a focus on mentorship and developing our skills in new technologies.

What We Value

  • 5+ years of experience with Linux system administration (RHEL or CentOS preferred)
  • Confidence in troubleshooting complex systems issues independently using observability tools and stack traces
  • Good scripting ability with Bash, Python, Ruby, or Perl
  • Experience monitoring systems with tools such as Prometheus or Nagios, and writing health checks
  • Moderate experience with TCP/IP networking
  • Practical experience managing databases or search engines, such as Postgres, MySQL, Oracle, Cassandra, or Elasticsearch
  • Ability to work independently with minimal supervision
  • Ability to participate in a 24/7 on-call rotation
  • Unwavering commitment to operational security and best practices

Preferred

  • BS/MS in Computer Science
  • Experience with virtualization using VMware ESX, KVM, Xen, or Docker
  • Experience with system management tools like Puppet or Chef
  • Knowledge of server hardware and/or experience working with Amazon Web Services (AWS)

Requirements

  • U.S. Security Clearance (Secret or above)

Palantir is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. Palantir is committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. Please see the United States Department of Labor's EEO poster and EEO poster supplement for additional information.

Confirmed 21 hours ago. Posted 30+ days ago.