Site Reliability Engineer for Scalable Infrastructure

4 weeks ago


London, Greater London, United Kingdom ESL FACEIT Group Full time

At ESL FACEIT Group, we're passionate about creating a culture that fosters innovation and community. As a Site Reliability Engineer, you'll play a crucial role in maintaining and improving our monitoring and observability tools, working closely with cross-functional teams to design, maintain, and operate systems at scale.

Key Responsibilities
  • Maintain and improve monitoring and observability tools (Grafana/Prometheus/Thanos/Jaeger)
  • Collaborate with teams to design, maintain, and operate systems at scale
  • Use troubleshooting skills to identify and fix operational issues

With a proven track record as a Site Reliability Engineer, DevOps Engineer, or Software Engineer, you'll bring expertise in building and maintaining scalable infrastructure. Proficiency with at least one major cloud provider (GCP/AWS/Azure) is a must, as are knowledge of incident management and proficiency in a language such as Go, Java, Python, or Rust.

Requirements
  • Proven experience as a Site Reliability Engineer, DevOps Engineer, or Software Engineer
  • Expertise in building and maintaining scalable infrastructure
  • Knowledge of incident management and proficiency with at least one major cloud provider (GCP/AWS/Azure)
  • Proficiency in languages such as Go, Java, Python, or Rust

Experience contributing to open source technologies is a valuable asset.



  • London, Greater London, United Kingdom ESL FACEIT Group Full time

    At ESL FACEIT Group, we strive to create immersive gaming experiences that bring communities together. Our mission is built around the core value of inclusivity, ensuring everyone has an equal chance to participate and thrive in the world of esports. Apart from monitoring our systems' capacity and performance, you will also focus on optimizing existing...


  • London, Greater London, United Kingdom EFG Full time

    About EFG: EFG is a leading company in the esports and gaming industry, dedicated to creating immersive experiences for players and fans. Our mission is to foster a culture of inclusivity and social responsibility, ensuring that everyone has access to the world of gaming. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team....


  • London, Greater London, United Kingdom FactSet Full time

    Job Title: Reliability and Scalability Expert. About FactSet: We're a global leader in providing financial data and software solutions for investment professionals. At FactSet, we strive to empower our employees to innovate through technology. Job Description: This Reliability and Scalability Expert role focuses on ensuring the reliability, scalability, and...


  • London, Greater London, United Kingdom Galaxy entertainment Corporation Limited Full time

    Site Reliability Engineer, Blockchain Infrastructure - London. Galaxy is a digital asset and blockchain leader helping institutions, startups, and individuals access and navigate the crypto economy. We are seeking a skilled Site Reliability Engineer to join our team and ensure the reliability, scalability, and security of our blockchain infrastructure. Key...


  • London, Greater London, United Kingdom ESL FACEIT Group Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at ESL FACEIT Group. As a key member of our infrastructure team, you will be responsible for designing, analyzing, and troubleshooting large-scale distributed systems. As a Site Reliability Engineer, you will work closely with our software engineering teams to deploy and...


  • London, Greater London, United Kingdom STAND 8 Technology Services Full time $75 - $85

    Job Summary: STAND 8 Technology Services is seeking an experienced Site Reliability Engineer to support our systems focused on linear channel delivery and modernization efforts. The ideal candidate will be responsible for maintaining existing systems, working on infrastructure modernization, and supporting the streaming engineering team to ensure smooth...


  • London, Greater London, United Kingdom Mondrian Alpha Recruitment Solutions Full time

    At Mondrian Alpha Recruitment Solutions, we are seeking a highly skilled Site Reliability Engineer to join our team responsible for engineering and supporting the company's critical infrastructure platforms. This team handles the centralized development infrastructure and works alongside engineering teams across the business to ensure the optimal route of...


  • London, Greater London, United Kingdom Lightricks Ltd. Full time

    Cloud Infrastructure Engineer - Scalable Systems. Lightricks Ltd. is a pioneer in innovative technology that bridges the imagination and creation. We are an AI-first company with a mission to build innovative tools for photo and video creation. Job Overview: We are seeking a skilled Cloud Infrastructure Engineer to join our ML Platform team. As a Cloud...


  • London, Greater London, United Kingdom GoCardless Full time

    About the Role: We are seeking an experienced Site Reliability Engineer to join our team at GoCardless. The successful candidate will be responsible for designing, building, and maintaining our global platform, ensuring it is scalable, reliable, and secure. Key Responsibilities: Design and implement infrastructure solutions using AWS, GCP, and Kubernetes. Develop...


  • London, Greater London, United Kingdom Stealth iT Consulting Full time £55,000

    Job Title: Site Reliability Engineer - Cloud Infrastructure Expert. Job Summary: Stealth iT Consulting is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a key role in ensuring the reliability, scalability, and efficiency of our clients' platforms. Key Responsibilities: Define and implement...


  • London, Greater London, United Kingdom Preqin Full time

    Job Title: Senior Site Reliability Engineer. Preqin is seeking a highly skilled Senior Site Reliability Engineer to join our Engineering team. As a key member of our team, you will be responsible for designing, building, and maintaining our infrastructure, middleware, and internal services to ensure high availability, scalability, and performance. Key...


  • London, Greater London, United Kingdom Preqin Full time

    Job Description: Preqin is seeking a highly skilled Site Reliability Engineer to join our Engineering team. As a Site Reliability Engineer, you will be responsible for designing, building, and operating our infrastructure, middleware, and CI/CD systems to ensure our teams have access to the best tools available. Key Responsibilities: Design and operate Preqin's...


  • London, Greater London, United Kingdom Preqin Full time

    About the Role: Preqin is seeking an experienced Site Reliability Engineer to join our team in London. As a Site Reliability Engineer, you will work across Preqin's full suite of services, supporting our clients around the world. You will be responsible for designing, building, and operating our infrastructure, middleware, and CI/CD systems to ensure our teams...


  • London, Greater London, United Kingdom Experian Full time

    About the Role: We're seeking a skilled Site Reliability Engineer to join our Experian Data Quality team in London, working on a hybrid schedule. As a key member of our QA team, you'll ensure the reliability, performance, and scalability of our market-leading data management products, focusing on observability to support incident resolution and drive ongoing...


  • London, Greater London, United Kingdom JLL Full time

    JLL is shaping the future of real estate by combining world-class services, advisory, and technology for its clients. The company is looking for an Observability Engineer to support and administer the Datadog monitoring platform. This role focuses on ensuring the reliability, scalability, and efficiency of Datadog for monitoring and AIOps within the...


  • London, Greater London, United Kingdom LoyaltyLion Full time

    About Us: LoyaltyLion is a data-driven loyalty and engagement platform trusted by thousands of ecommerce brands worldwide. Our mission is to help retailers succeed in the age of Amazon by offering a better customer experience. We've built a strong team and are looking for a Site Reliability Engineer to join us. The Role: We are seeking a Site Reliability Engineer...


  • London, Greater London, United Kingdom Alevio Consulting Full time £750

    Job Title: Senior Site Reliability Engineer - Cloud Expert. About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Alevio Consulting. As a key member of our cloud infrastructure team, you will be responsible for designing, building, and maintaining high-performance, scalable, and reliable services for our...


  • London, Greater London, United Kingdom Preqin Full time

    Role Overview: Preqin is seeking an experienced Site Reliability Manager to join our Engineering team. As a Site Reliability Manager, you will play a crucial role in designing, operating, and supporting our infrastructure, middleware, and internal services. Key Responsibilities: Design and operate scalable and highly available services, while establishing...


  • London, Greater London, United Kingdom loveholidays Full time

    About us: We are a dynamic and rapidly growing online travel agency that places technology at the heart of our success. With millions of people trusting us for their dream holidays, our focus is on delivering exceptional customer experiences through cutting-edge technology. We operate at scale, handling 100+ services and 8k requests per second while maintaining...


  • London, Greater London, United Kingdom GoCardless Full time

    The Role: GoCardless is looking for a Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the infrastructure and systems that support our payment and open banking products. Key Responsibilities: Design and implement scalable and efficient infrastructure solutions. Develop...