Site Reliability Engineer
1 month ago
A World-Changing Company
Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.
The Role
We’re looking for Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure, across both cloud & on-prem environments. Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job. You’ll be the expert on the environments in which you operate infrastructure, helping partner teams build & configure their software to run reliably within them.
We strongly believe in engineering teams being responsible for the operations of their services in production. In this role, you’ll work closely with engineers to advocate for and participate in sensible, scalable systems design, and share responsibility with them for diagnosing, resolving, and preventing production issues.
Core Responsibilities
- Maintain availability of cloud & physical Linux servers that power the Palantir platform in air-gapped production environments
- Design, deploy, and operate infrastructure to support customer & product requirements via modern orchestration & monitoring platforms
- Collaborate closely with product teams on requirements & SLOs for deploying software into air-gapped environments
- Identify, troubleshoot, and solve network & systems issues
- Write scripts to automate away routine operational tasks
- Confidence in troubleshooting complex systems issues independently using observability tools and stack traces
- Ability to identify and automate highly manual tasks
- Comfort with large-scale production systems and technologies, for example load balancing, monitoring, distributed systems, or configuration management
- Proficiency with programming languages such as Java, C++, Python, JavaScript, or similar languages
- Ability to work with a high level of autonomy and responsibility in a rapidly changing environment with dynamic objectives and iteration with users
- Demonstrated ability to continuously learn and drive ongoing improvements within and across teams
- Active security clearance, or the ability to obtain one, is a plus
- 5+ years of experience with Linux system administration (RHEL or equivalent preferred)
- Experience with cloud-based hosting platforms like AWS, Azure, or GCP and/or experience with hardware-based environments
- Familiarity with monitoring systems using tools like Prometheus and writing health checks
Life at Palantir
We want every Palantirian to achieve their best outcomes. That’s why we celebrate individuals’ strengths, skills, and interests, from your first interview to your long-term growth, rather than rely on traditional career ladders. Paying attention to the needs of our community enables us to optimize our opportunities to grow and helps ensure many pathways to success at Palantir. Promoting health and well-being across all areas of Palantirians’ lives is just one of the ways we’re investing in our community. Learn more at Life at Palantir and note that our offerings may vary by region.
In keeping with Palantir’s values and culture, we believe employees are “better together” and that in-person work affords the opportunity for more creative outcomes. Therefore, we encourage employees to work from our offices to foster connectivity and innovation. Many teams do offer hybrid options (WFH a day or two a week), allowing our employees to strike the right trade-off for their personal productivity. Based on business need, there are a few roles that allow for “Remote” work on an exceptional basis. If you are applying for one of these roles, you must work from the city and/or country in which you are employed. If the posting is specified as Onsite, you are required to work from an office.
Palantir is committed to promoting a culture of diversity, equity, and inclusion. We believe that all Palantirians share the responsibility of upholding our commitment to these values and encourage candidates from a wide range of backgrounds, perspectives, and lived experiences to join us in solving the world’s hardest problems.
Palantir is committed to making the job application process accessible to everyone. If you are living with a disability (visible or not visible) and need to request a reasonable accommodation for any part of the application or hiring process, please reach out and let us know how we can help.