POSTED Jul 31

Site Reliability Engineer - Database Operations

at Palantir, New York, NY

A World-Changing Company

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role

We’re looking for a Site Reliability Engineer who can help our Database Operations team scale, maintain, operate, and modernize the databases behind Palantir’s products. Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job.

We strongly believe in engineering teams being responsible for operating their services in production. In this role, you’ll work closely with engineers to design sensible, scalable systems and to diagnose, resolve, and prevent production issues.

Site Reliability Engineering exposes you to a broad range of products and business use cases, and builds operational and systems skills that are useful across the industry and in most engineering roles. Most of the work has implications for the entire fleet of Palantir environments, so there is ample opportunity to improve performance, stability, and cost at a very large scale.

Technologies We Use

Database Operations provides the majority of the support for Cassandra, Elasticsearch, and Kafka, along with their orchestrating services, to ensure they operate as intended within Kubernetes, across a variety of clouds and on-premises environments, with varying degrees of access.

Core Responsibilities

  • Build expertise on pre-existing systems (their edge cases, failure modes, and life cycles) and on how to improve the long-term reliability and scalability of Palantir’s services.
  • Modify core services and infrastructure to improve stability and performance.
  • Participate in operations, including on-call rotations during business hours and occasional weekends. Troubleshoot and debug availability and latency of Palantir’s databases and their clients.
  • Modernize the fleet by migrating infrastructure, upgrading major versions, and right-sizing to optimize cost and performance.

What We Value

  • Confidence in troubleshooting complex issues independently using observability tools and stack traces.
  • Ability to identify and remove toil.
  • Comfort with and curiosity about large-scale production systems and technologies, such as load balancing, monitoring, distributed systems, or configuration management.
  • Ability to work with a high level of autonomy and responsibility in a rapidly changing environment with dynamic objectives and iteration with users.
  • Demonstrated ability to develop improvements to services.

What We Require

  • Engineering background in Computer Science, Mathematics, Software Engineering, Physics, or a similar field.
  • Familiarity with storage and data processing systems, cloud infrastructure, and other technical tools.
  • Familiarity with monitoring systems, using tools like Prometheus, and with writing health checks.
  • Strong written and verbal communication skills and ability to iterate quickly with teammates, incorporating feedback and holding a high bar for quality.


    Benefits

    Our benefits aim to promote health and wellbeing across all areas of Palantirians’ lives. We work to continuously improve our offerings and listen to our community as we design and update them. The list below details our available benefits and some of the perks that can be enjoyed as an employee of Palantir Technologies.

  • Medical, dental, and vision insurance
  • Life and disability coverage
  • Paid leave for new parents and emergency back-up care for all parents
  • Family planning support, including fertility, adoption, and surrogacy assistance
  • Stipend to help with expenses that come with a new child
  • Commuter benefits
  • Relocation assistance
  • Unlimited paid time off
  • 2 weeks paid time off built into the end of each year

    Salary

    The estimated salary range for this position is $125,000 - $185,000/year. Total compensation for this position may also include Restricted Stock Units, a sign-on bonus, and other potential future incentives. Note that total compensation for this position will be determined by each individual’s relevant qualifications, work experience, skills, and other factors. This estimate excludes the value of any potential sign-on bonus, the value of any benefits offered, and the potential future value of any long-term incentives.

    Life at Palantir

    We want every Palantirian to achieve their best outcomes. That’s why we celebrate individuals’ strengths, skills, and interests, from your first interview to your long-term growth, rather than rely on traditional career ladders. Paying attention to the needs of our community enables us to optimize our opportunities to grow and helps ensure many pathways to success at Palantir. Promoting health and well-being across all areas of Palantirians’ lives is just one of the ways we’re investing in our community. Learn more at Life at Palantir and note that our offerings may vary by region.

    In keeping with Palantir’s values and culture, we believe employees are “better together” and that in-person work affords the opportunity for more creative outcomes. Therefore, we encourage employees to work from our offices to foster connectivity and innovation. Many teams do offer hybrid options (WFH a day or two a week), allowing our employees to strike the right trade-off for their personal productivity. Based on business need, there are a few roles that allow for “Remote” work on an exceptional basis. If you are applying for one of these roles, you must work from the state in which you are employed. If the posting is specified as Onsite, you are required to work from an office.


    Palantir is committed to promoting a culture of diversity, equity, and inclusion and is proud to be an Equal Employment Opportunity and Affirmative Action employer. We believe that all Palantirians share the responsibility of upholding our commitment to these values and encourage candidates from a wide range of backgrounds, perspectives, and lived experiences to join us in solving the world’s hardest problems. Palantir does not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. Palantir is committed to working with and providing reasonable accommodations to qualified individuals with physical and mental disabilities. Please see the United States Department of Labor’s EEO poster, EEO poster supplement and Pay Transparency Notice for additional information.

    Palantir is committed to making the job application process accessible to everyone. If you are living with a disability (visible or not visible) and need to request a reasonable accommodation for any part of the application or hiring process, please reach out and let us know how we can help.