Principal SRE Engineer

3 weeks ago


London, Greater London, United Kingdom loveholidays Full time

About Us

We are a dynamic online travel agency where technology drives our achievements. Our platform has successfully facilitated countless dream vacations for travelers.

With millions of daily visitors, our diverse range of services processes thousands of requests each second while maintaining optimal search performance. Our observability framework captures extensive logs and metrics, enabling us to maintain high operational standards.

We prioritize innovation through open-source technologies, contributing back to the community through various initiatives.

Key Responsibilities

  • As our inaugural Site Reliability Engineer, you will enhance our SRE methodologies, focusing on incident management, postmortems, service level objectives (SLOs), and error budgets.
  • You will aid in the development of robust, efficient, and scalable systems, collaborating closely with our existing Platform Infrastructure team.
  • Promote the integration of SRE principles across teams to elevate overall reliability.
  • Monitor and improve the reliability metrics of our platform.
  • Facilitate a balance between reliability and feature rollout through effective use of SLOs and error budgets.

Our engineering teams are responsible for the entire lifecycle of services, from initial development to high-load production operations. Your role will be to empower these teams in their operational endeavors.

What You'll Be Working On

  • Establish our SRE function by advocating for best practices and processes in reliability.
  • Identify performance bottlenecks in critical applications using advanced profiling tools.
  • Develop tools or enhance existing applications with a focus on reliability and performance.
  • Ensure our systems can handle significant load increases through strategic improvements.
  • Reduce mean time to detection and recovery by enhancing our observability and alerting mechanisms.
  • Identify system vulnerabilities through rigorous analysis.

Our architecture is service-oriented and hosted on cloud platforms. Engineering teams manage their services' infrastructure using modern tools and practices.

We emphasize observability, continuously refining our monitoring and alerting systems, currently centered around a robust ecosystem. Our service mesh enables comprehensive visibility of all production services.

Performance and scalability are core to our software and infrastructure development, achieved through a blend of computer science principles and cutting-edge cloud technologies.

Our teams are encouraged to select the most suitable tools for their tasks, utilizing a variety of programming languages including Java, Go, Rust, Python, and JavaScript.

Qualifications

  • Strong understanding of SRE principles.
  • Expertise in performance and scalability optimization.
  • Familiarity with HTTP, web services, and RESTful architectures.
  • Experience with containers and cloud environments.
  • Knowledge of testing, reliability, and monitoring practices.
  • Proficiency in Linux systems.
  • Skills in low-level debugging and troubleshooting.

What We Offer

  • Generous company pension contributions.
  • Dedicated training budget for professional development.
  • Discounted travel opportunities for you and your loved ones.
  • Annual leave of 25 days, plus public holidays, with additional days accrued based on service.
  • Flexible annual leave options.
  • Benefits including a cycle-to-work scheme and eye care vouchers.

  • Software Engineer

    1 day ago


    London, Greater London, United Kingdom Hunter Bond Full time £130,000

    Job Title: Software Engineer - SRE. Company: Hunter Bond. Location: Hybrid. Job Type: Full-time. Job Description: We are seeking a highly skilled Software Engineer - SRE to join our elite team at Hunter Bond. As a key member of our organization, you will be responsible for designing, developing, and maintaining our cutting-edge technology infrastructure. Key...

  • Software Engineer

    3 days ago


    London, Greater London, United Kingdom Hunter Bond Full time £130,000

    About the Role: We are seeking a highly skilled Software Engineer - SRE to join our dynamic team at Hunter Bond. As a key member of our organization, you will be responsible for designing and implementing cutting-edge technology solutions that drive business growth and innovation. Key Responsibilities: Design and develop complex mission-critical...

  • SRE Team Lead

    3 weeks ago


    London, Greater London, United Kingdom Harrington Starr Full time

    Job Overview: Lead Site Reliability Engineer - Remote Opportunity. Dynamic Start-up Environment. Salary Range: £95,000 - £105,000. Position Summary: We are excited to present a unique opportunity to join Harrington Starr as they enhance their operational capabilities and prepare for the launch of essential services. We are in search of an experienced Site...

  • SRE Team Lead

    3 weeks ago


    London, Greater London, United Kingdom Harrington Starr Full time

    Job Overview: Senior Site Reliability Engineer - Remote Work Available. Innovative Start-up Environment. Salary Range: £95,000 - £105,000. Position Summary: We are excited to announce an opportunity to join Harrington Starr as we expand our operations and prepare to launch essential services. We are in search of an experienced Site Reliability Engineering (SRE) Lead...


  • London, Greater London, United Kingdom Palta Full time

    Palta is a dynamic technology platform that develops a variety of mobile applications centered around health and wellness, boasting a collective audience of over 60 million active users each month. Our impressive portfolio features successful brands such as Flo, a global leader in female health, Simple, a nutrition and wellness application with more than 15...


  • London, Greater London, United Kingdom Xpertise Recruitment Full time

    Job Overview. Position: Platform Engineering Manager (DevOps / SRE / Cloud). Xpertise Recruitment is seeking a skilled Platform Engineering Manager to spearhead the development of a new team as part of an innovative initiative. This role will involve overseeing DevOps Engineers who are integrated within various teams, playing a crucial role in shaping the DevOps...


  • London, Greater London, United Kingdom Stanford Black Limited Full time £200,000

    About Stanford Black Limited: We are a prestigious multi-strategy hedge fund that invests heavily in technology, enabling us to develop some of the most sought-after software in the finance industry. Job Overview: We are seeking an experienced Site Reliability Engineer with expertise in monitoring and observability to lead our open-source transformation...


  • London, Greater London, United Kingdom Client Server Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer / SRE to join our team at Client Server. As a key member of our technology investment company, you will play a critical role in ensuring the reliability, performance, and availability of our core platforms. About the Role: As a Site Reliability Engineer / SRE, you will be responsible for...

  • SRE Lead

    3 days ago


    London, Greater London, United Kingdom N Consulting Ltd Full time

    Job Summary: N Consulting Ltd is seeking a highly skilled SRE Lead to join our team. As a key member of our cloud operations team, you will be responsible for leading a team of 6-10 people and influencing a larger group. Key Responsibilities: Lead a team of cloud operations engineers and ensure the smooth operation of our cloud infrastructure. Develop and...

  • SRE Lead

    1 day ago


    London, Greater London, United Kingdom N Consulting Ltd Full time

    Job Summary: N Consulting Ltd is seeking a highly skilled SRE Lead to join our team. As a key member of our cloud operations team, you will be responsible for leading a team of 6-10 people and influencing a larger group. Key Responsibilities: Lead a team of cloud operations engineers and ensure the smooth operation of our cloud infrastructure. Develop and...


  • London, Greater London, United Kingdom Cleo AI Full time

    About Cleo AI: We're a fast-growing tech startup that empowers people to build a life beyond their next paycheck. Our beloved AI helps users forge their own path toward financial well-being. As a key member of our engineering team, you'll play a crucial role in shaping our platform and ensuring its success. We're looking for a seasoned technical leader to join...


  • London, Greater London, United Kingdom Department for Work and Pensions Full time

    Position Overview: Are you adept at managing stakeholder relationships effectively? Do you enjoy diagnosing issues and creating automated solutions to prevent future occurrences? If this resonates with you, we would be eager to connect. In the role of Senior Site Reliability Engineer, you will champion the implementation of SRE best practices throughout our cloud...


  • London, Greater London, United Kingdom AMEX Full time

    About the Role: We are seeking an experienced Engineering Director to lead our Site Reliability Engineering & Application Support (SRE & AS) Organization. As a key member of our team, you will be responsible for working on platforms that are low latency, always available, and highly resilient, supporting operations that provide 24x7, 365 days a year...



  • Kubernetes SRE

    3 days ago


    London, Greater London, United Kingdom Sterlings Full time

    Job Description: Cloud Engineer - Site Reliability. Sterlings is seeking a highly skilled Cloud Engineer - Site Reliability to join our team. As a Cloud Engineer - Site Reliability, you will be responsible for ensuring the reliability and stability of our infrastructure through the delivery of tools and processes that improve operational efficiency. Key...

  • Senior Planner

    16 hours ago


    London, Greater London, United Kingdom VolkerWessels Full time

    Job Description: VolkerWessels is a multidisciplinary contractor that delivers innovative engineering solutions across the civil engineering and construction sectors. We are seeking a Senior Planner to work on our SRE framework project with Network Rail. Key Responsibilities: Plan and coordinate multiple projects across the Southern Region, including structures,...