
Site Reliability Engineer, Lead

1 month ago


UK, UK, United Kingdom TekStream Solutions Full time

Our client is a remote-first company with team members across the globe, offering a SaaS-based Learning Management System that powers the world's leading education programs. Our client helps large brands and fast-moving companies increase revenue, improve customer retention, and decrease support costs through external education. The platform includes all the tools an organization needs to create, manage, track, and improve highly personalized learning experiences for customers, partners, and employees.


Successful Candidate:

  • SaaS experience
  • Experienced and able to thrive in a small-medium high-growth environment
  • Invested in upskilling, learning new tech
  • Deeply curious, creative, and innovative
  • Flexible in working hours/ability to collaborate in different time zones


The Lead Site Reliability Engineer has a pivotal role at the forefront of our engineering operations, responsible for guiding the Platform Team toward achieving exceptional standards of reliability, performance, and stability across all our applications. The successful candidate will possess deep expertise in these core areas and will be instrumental in defining and implementing industry-leading practices. As a key leader, this role will not only shape the strategic direction of our platform operations but also establish the benchmarks and processes by which our engineering excellence is measured.


Responsibilities

  • Lead the SRE Team, setting clear goals and priorities in line with business objectives. In collaboration with the department Director, develop and execute strategies that enhance technological capabilities across the company.
  • Ensure all platforms and systems operate smoothly and remain highly available, scalable, and fault-tolerant. Implement best practices for continuous monitoring, preventive maintenance, and rapid response.
  • Continuously assess system performance, identify bottlenecks, and make data-driven recommendations for infrastructure enhancements.
  • Ensure that developers have access to the best tools and platforms to facilitate efficient coding practices and understand the performance of applications.
  • Educate the rest of engineering about best practices for writing performant code and troubleshoot problematic areas
  • Develop and refine incident management protocols. Lead efforts to troubleshoot and resolve high-impact issues, minimizing downtime and preventing future occurrences.
  • Work closely with other engineering teams and departments to understand their needs and ensure platform initiatives support overall company goals.
  • Monitor virtual infrastructure and be part of a 24x7 on-call rotation to respond to alerts


Requirements

  • 8+ years of experience as a software engineer
  • 5+ years of experience working with Ruby on Rails
  • Proven experience leading SRE teams
  • 3+ years of experience working in infrastructure and operations
  • Expertise with SQL databases such as PostgreSQL
  • Experience with cloud computing platforms such as Amazon Web Services and/or Google Cloud
  • Ability to dig into unfamiliar code bases
  • Ability to document solutions and train operational teams on supportability
  • A sense of comfort working in a team-oriented and collaborative environment
  • Can communicate clearly and seek help and support proactively
  • Takes ownership of tasks and leads them to completion


Desired

  • Experience in developing solutions using server automation tools such as Ansible.
  • Experience writing and maintaining CI/CD pipelines and services.


Education

  • Bachelor’s degree in Computer Science or related technical field
