Senior Site Reliability Engineer - Remote

1 month ago


London, United Kingdom ClickHouse Full time

We are committed to providing our customers with reliable and secure services, so we are building out our newly formed Site Reliability Engineering team. As one of the first joiners to the Reliability Engineering team at ClickHouse, you will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance of the cloud infrastructure that runs ClickHouse databases. You will collaborate with teams such as Control Plane, Dataplane, Core, Security, Support, and Operations, and guide them in designing and implementing scalable, secure, highly available, and fault-tolerant distributed systems. You will also own incident management and response, post-mortem analysis (including running blameless post-mortems), and the continuous improvement of our ClickHouse services. You will use your software engineering expertise to develop platforms and tools that improve the operational and engineering efficiency of ClickHouse Cloud. This role is a unique opportunity to make a significant impact on our elastic, limitless-scale, high-performance, serverless ClickHouse Cloud.

What will you do?

  • Collaborate with various engineering teams in ClickHouse to design and implement scalable, secure, and highly available systems for ClickHouse.
  • Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud.
  • Ensure all the infrastructure components in ClickHouse Cloud (including Dataplane, Control Plane and ClickHouse Core) have monitoring and alerting in place to ensure timely detection and resolution of incidents.
  • Enhance and refine incident response processes and post-mortem analysis for any outages in ClickHouse Cloud, including working with the support team to communicate with impacted customers.
  • Continuously improve the reliability and performance of our ClickHouse services.
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities.
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.

About you:

  • Bachelor’s or Master’s degree in Computer Science or a related field.
  • At least 8 years of experience in Site Reliability Engineering or a related field.
  • Previous experience using ClickHouse in production.
  • Hands-on experience with Go and/or Python.
  • Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
  • Excellent understanding of distributed databases and SQL; experience with ClickHouse in particular is a major plus.
  • Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm.
  • Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.
  • You are a strong problem solver and have solid production debugging skills.
  • You are passionate about efficiency, availability, scalability, and data governance.
  • You thrive in a fast-paced environment and see yourself as a partner to the business, with the shared goal of moving it forward.
  • You have a high level of responsibility, ownership, and accountability.
  • Excellent communication and interpersonal skills.

