Site Reliability Engineering Lead

4 days ago


London, Greater London, United Kingdom BenevolentAI Full time

About the Role:

BenevolentAI is seeking a highly skilled Senior Site Reliability Engineer to join our team. In this role, you will design and implement software solutions for cloud infrastructure, improve long-term infrastructure availability and reliability, and monitor and respond to incidents affecting the infrastructure, platforms and core engineering services.

You will also build pipelines to automate infrastructure and software deployments, troubleshoot infrastructure, network and software issues, and stay up to date with recent technology trends and tools. Additionally, you will automate repetitive manual processes and procedures, and participate in the on-call rotation to support Benevolent employees in their day-to-day activities.

Responsibilities:
  • Designing and Implementing Software Solutions: Design and implement software solutions for cloud infrastructure in accordance with specifications and engineering best practices.
  • Improving Infrastructure Availability and Reliability: Work towards improving long-term infrastructure availability and reliability through proactive monitoring and maintenance.
  • Incident Response: Monitor the infrastructure, platforms and core engineering services, and handle incident response for them.
  • Automating Deployments: Construct pipelines to automate infrastructure and software deployments.
  • Troubleshooting Issues: Troubleshoot infrastructure, network and software issues.
  • Staying Up-to-Date: Stay up to date with recent technology trends and tools.
  • Automating Processes: Automate repetitive manual processes and procedures.
  • On-Call Rotation: Participate in the on-call rotation to support Benevolent employees in their day-to-day activities.

About You:

We are looking for a talented engineer with a strong background in cloud computing and programming (Python, Java, Go or C++ preferred), plus experience with Kubernetes. You should have hands-on experience with cloud technologies (we work with AWS, but experience with other cloud providers is also a benefit) and be comfortable working with Unix-based operating systems.

In addition, you should have a good understanding of infrastructure-as-code and tools such as Ansible, Terraform and Helm, and experience with cloud networking, cloud operations, automation and workload orchestration. You should also have a basic understanding of network protocols such as TCP and HTTP/S, of load balancing, and of the contexts in which they are used.

We offer a competitive salary range of $120,000 - $180,000 per year, depending on experience, plus a comprehensive benefits package including health insurance, retirement plan, and generous PTO policy.



  • London, Greater London, United Kingdom LinuxRecruit Full time

    Join Our Team. About Us: LinuxRecruit is a leading organization dedicated to delivering innovative solutions and exceptional user experiences. We are committed to fostering a culture of excellence, collaboration, and continuous learning. By joining our team, you will be part of a dynamic and inclusive organization that values your contributions and supports...


  • London, Greater London, United Kingdom Google Full time

    Job Overview: This is an exciting opportunity to join our Site Reliability Engineering (SRE) team at Google Cloud, where you will play a critical role in ensuring the reliability and uptime of our services. As a seasoned technical leader, you will be responsible for leading teams, developing and implementing solutions, and providing expert guidance to drive...


  • London, Greater London, United Kingdom Fourier Full time

    Key Responsibilities: As a Site Reliability Engineer at Fourier, you will be responsible for designing and implementing tools to enhance the reliability and resilience of our production systems. This includes investigating failures, improving system performance, and automating manual processes. Required Skills: Excellent Python scripting skills. Experience with...


  • London, Greater London, United Kingdom BenevolentAI Full time

    Job Overview: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at BenevolentAI. As a key member of our squad, you will play a crucial role in ensuring the reliability and scalability of our cloud infrastructure. The ideal candidate will have a strong background in software development, with experience in implementing cloud...


  • London, Greater London, United Kingdom Selby Jennings Full time

    About Selby Jennings: We're a leading global financial services firm where technologists and investment professionals collaborate to drive innovation and operational excellence. About the Role: As a Site Reliability Engineer, you'll apply your expertise in software and systems engineering to design, build, and maintain our robust infrastructure. You'll reduce...


  • London, Greater London, United Kingdom Rewardgateway Full time

    Engineering, London. Earn a salary of £110,000 - £130,000 per year with Reward Gateway. We are seeking an experienced Site Reliability Engineer to lead our team and drive the transformation of our operational workloads to a Service Reliability Engineering (SRE) approach. The successful candidate will be responsible for establishing and managing our new SRE...


  • London, Greater London, United Kingdom Lorien Full time

    Key Responsibilities: Collaborate with the existing team to deliver a brand-new project. Work on a hybrid model with 1 day a week on-site in London. Develop and maintain reliable and efficient systems. Utilize experience with Java, Python, Splunk, ServiceNow, and MongoDB. Contribute to incident management and application monitoring. Ensure seamless interaction...


  • London, Greater London, United Kingdom Apple Inc. Full time

    Unlock the Future of Cloud Services. At Apple Inc., we're not just building products - we're crafting experiences that our customers love and depend on. Our Apple Services Engineering (ASE) team is responsible for the systems that make these daily experiences possible. If you've used Apple products, you've likely interacted with us. Our iCloud Services SRE...


  • London, Greater London, United Kingdom J Bandy Consulting Full time

    Job Summary: J Bandy Consulting is seeking an experienced Site Reliability Engineer to join our team. The ideal candidate will have a strong background in software engineering and a passion for building scalable and reliable systems. Key Responsibilities: Develop and implement automation tools to improve the efficiency of our systems. Collaborate with...


  • London, Greater London, United Kingdom Remotestar Full time

    Remotestar is seeking a Senior Site Reliability Engineering Manager to join our client's team in the UK. The client is building a B2B marketplace for diamonds, and we need someone to ensure the reliability, scalability, and performance of our infrastructure and services. The ideal candidate will have a strong track record of building and maintaining highly...


  • London, Greater London, United Kingdom GoCardless Full time

    The Role: GoCardless is looking for a Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the infrastructure and systems that support our payment and open banking products. Key Responsibilities: Design and implement scalable and efficient infrastructure solutions. Develop...


  • London, Greater London, United Kingdom Preqin Full time

    About the Role: Preqin is seeking an experienced Site Reliability Engineer to join our team in London. As a Site Reliability Engineer, you will work across Preqin's full suite of services, supporting our clients around the world. You will be responsible for designing, building, and operating our infrastructure, middleware, and CI/CD systems to ensure our teams...


  • London, Greater London, United Kingdom Apple Inc. Full time

    Site Reliability Engineering Manager, Apple. At Apple, we're not just building products - we're crafting experiences our customers love and depend on. Our Apple Services Engineering (ASE) team builds and supports the systems that make many of these daily experiences possible. If you've used Apple products, you've likely interacted with us. Our iCloud Services...


  • London, Greater London, United Kingdom Highfield Professional Solutions Ltd Full time

    Highfield Professional Solutions Ltd is seeking a Site Reliability Engineer to join our team in Central London. The successful candidate will be responsible for managing and maintaining critical engineering systems within our Data Centre, ensuring that they operate efficiently and effectively. This role offers a competitive salary of up to 48,000 per year,...


  • London, Greater London, United Kingdom BenevolentAI Full time

    BenevolentAI. Estimated Salary: £110,000 - £140,000 per annum. Company Overview: BenevolentAI is a leading artificial intelligence company that uses machine learning to accelerate scientific discovery. We are seeking a highly skilled Senior Site Reliability Engineer to join our team and help us maintain the reliability and scalability of our cloud...

  • Site Reliability Lead

    3 weeks ago


    London, Greater London, United Kingdom Kroo Bank Ltd Full time

    Site Reliability Engineer. We're on a mission to build the world's greatest social bank, and we believe that banking needs to change for the better. When money is used correctly, it can transform our daily lives and positively impact the planet. We're a varied team of experienced tech, customer experience, marketing, legal and banking professionals, and we're...


  • London, Greater London, United Kingdom Kinetech Full time

    At Kinetech, we're seeking a talented Site Reliability Engineer to join our team. This role is responsible for ensuring the smooth operation of our software systems, with a focus on scalability, reliability, and performance. Key Responsibilities: Design and implement CI/CD pipelines to automate code integration, testing, and deployment. Automate repetitive...


  • London, Greater London, United Kingdom Victrex Full time

    Senior Reliability Engineer Role. About the Job: We are seeking an experienced Senior Reliability Engineer to lead our asset management strategy and drive improvements in plant performance across all UK plants. Job Summary: The successful candidate will be responsible for developing and implementing systems and procedures that enhance safety, asset availability,...


  • London, Greater London, United Kingdom College of Charleston Full time

    Transformative SRE Leadership Opportunity. Are you a seasoned leader with a passion for strategy, leadership, and engineering excellence? Do you want to make a meaningful impact at a global financial institution? We're seeking a talented Site Reliability Engineering Manager to join our Operations and Technology Chief Information Office Business area. About the...


  • London, Greater London, United Kingdom Citigroup, Inc. Full time

    As a seasoned Engineering Manager with Citigroup, Inc., you will be responsible for driving operational excellence and fostering collaboration within the Site Reliability Engineering (SRE) team. To achieve this, you will create effective reporting mechanisms, manage multiple projects simultaneously, and ensure timely and budget-conscious completion of work....