Site Reliability Engineer

8 hours ago

Manchester, United Kingdom hackajob Full time

Company Description

At bet365, we're one of the world's leading online gambling companies, revolutionising the industry since 2000. Founded by Denise Coates CBE, we now employ over 9,000 people and serve over 100 million customers in 27 languages. Our focus on In-Play betting has solidified our market-leading position, offering an unmatched experience across 96 sports and 700,000 streaming events. With over 750 concurrent sporting fixtures at peak and more live sports streamed than anyone else in Europe, we handle over 6 billion HTTP requests daily and process more than 2 million bets per hour at peak.

We empower our employees to push boundaries and explore new ideas, cultivating a culture that celebrates and rewards creativity. This gives employees a wealth of opportunities to grow and to make a real impact in the world of online gambling. As a forward-thinking company, we're breaking new ground in software innovation too, redefining what's possible for our customers worldwide.

Job Description

As a Site Reliability Engineer, you will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency.

Using your engineering expertise, you will implement solutions that enhance reliability, including instrumenting services with tools such as OpenTelemetry, improving logging practices and developing features for maintainability. You will also help engineer tools and automation for effective service management.

Collaboration is key: you will work across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user demands and enhance overall service performance.

This role is eligible for inclusion in the Company's hybrid working from home policy.

Qualifications

Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction.
Knowledge of contemporary observability tools, techniques and best practice, including Splunk, New Relic, Grafana and PagerDuty.
Excellent knowledge of programming languages including Python, Golang and JavaScript.
Knowledge and experience of modern software development techniques and lifecycles.
Experience with Infrastructure as Code (IaC) automation and orchestration tools such as Ansible and Terraform.
Prior experience working in a large-scale, 24/7 enterprise where system uptime and stability are of paramount importance to the Business.
Keen interest in industry trends, particularly Platform Engineering.
Proficiency in shell scripting for automation and system management tasks.

Additional Information

Writing and contributing to code that enhances the reliability and observability of services, including telemetry, operational APIs and tooling.
Developing and maintaining tools that facilitate effective management of our systems, ensuring they are operationally efficient and resilient.
Working with automation and orchestration platforms to automate manual activity and reduce toil.
Building sophisticated dashboards using a range of telemetry data and dashboarding technologies like Grafana, Splunk and New Relic.
Maintaining and administering existing monitoring and analytic toolsets.
Mentoring colleagues in the use of new technologies or practices.
Actively participating in live incident resolution and post-mortem analysis, providing effective remediation strategies to improve overall system health and prevent future issues.
Driving initiatives to enhance system reliability and observability, contributing to a culture of continuous improvement.
Collaborating with the central Site Reliability Engineering and Observability teams to establish and uphold standards for reliability and observability, assisting teams in adhering to these practices.
Working with IT Operations, providing and supporting the use of critical tooling to enable increasing levels of value to the Business.

By applying to us you are agreeing to share your Personal Data in accordance with our Recruitment Privacy Notice - https://www.bet365careers.com/privacy-policy

At bet365, we're committed to creating an environment where everyone feels welcome, respected and valued. Where all individuals can grow and develop, regardless of their background. We're Never Ordinary, and we're always striving to be better. If you need any adjustments or accommodations to the recruitment process, at either application or interview, please don't hesitate to reach out.

Seniority level: Entry level
Employment type: Full-time
Job function: Engineering and Information Technology

Site Reliability Engineer

2 weeks ago

Manchester, United Kingdom Searchability Full time

SITE RELIABILITY ENGINEER £40k salary Join a growing, technology-driven business operating at scale within the online gaming and sports sector. Opportunity to shape the SRE strategy. ABOUT THE CLIENT Our client is a fast-growing digital technology company at the forefront of delivering high-availability platforms for the sports and gaming industry. They...
Site Reliability Engineer

1 week ago

Manchester, United Kingdom Manchester Digital Full time

Site Reliability Engineer role at Manchester Digital You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability,...
Site Reliability Engineer

6 days ago

Manchester, United Kingdom bet365 Group Full time

As a Site Reliability Engineer, you will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. Full-time. Closes 28/01/2026. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and...
Site Reliability Engineer

2 weeks ago

Manchester, United Kingdom Anson McCade Full time

Job Description About the Role Are you passionate about building resilient systems and eliminating operational toil through automation? We’re looking for a Site Reliability Engineer (SRE) to join our high-impact team and help shape the future of our digital infrastructure. As an SRE, you’ll blend software engineering with systems engineering to ensure the...
Site Reliability Engineer

2 days ago

Manchester, United Kingdom hackajob Full time

hackajob is collaborating with bet365 to connect them with exceptional tech professionals for this role. Company Description At bet365, we're one of the world's leading online gambling companies, revolutionising the industry since 2000. Founded by Denise Coates CBE, we now employ over 9,000 people and serve over 100 million customers in 27 languages. Our focus...
Site Reliability Engineer

2 weeks ago

Manchester, United Kingdom Caspian One Full time

Job Description We’re building a Centralised SRE team to champion reliability engineering across global technology infrastructure. As a Senior Site Reliability Engineer, you’ll be at the forefront of this transformation, engineering scalable systems, automating operations, and embedding resilience into every layer of the tech stack. This isn’t just about...
Site Reliability Engineer

2 days ago

Manchester, United Kingdom hackajob Full time

hackajob is collaborating with bet365 to connect them with exceptional tech professionals for this role. Company Description At bet365, we're one of the world's leading online gambling companies, revolutionising the industry since 2000. Founded by Denise Coates CBE, we now employ over 9,000 people and serve over 100 million customers in 27 languages. Our...
Site Reliability Engineer

4 weeks ago

Manchester, United Kingdom Caspian One Full time

We’re building a Centralised SRE team to champion reliability engineering across global technology infrastructure. As a Senior Site Reliability Engineer, you’ll be at the forefront of this transformation, engineering scalable systems, automating operations, and embedding resilience into every layer of the tech stack. This isn’t just about keeping the...
Site Reliability Engineer

4 weeks ago

Manchester, United Kingdom Anson McCade Full time

About the Role Are you passionate about building resilient systems and eliminating operational toil through automation? We’re looking for a Site Reliability Engineer (SRE) to join our high-impact team and help shape the future of our digital infrastructure. As an SRE, you’ll blend software engineering with systems engineering to ensure the reliability,...
Site Reliability Engineer

2 weeks ago

Manchester, United Kingdom Caspian One Full time

We’re building a Centralised SRE team to champion reliability engineering across global technology infrastructure. As a Senior Site Reliability Engineer, you’ll be at the forefront of this transformation, engineering scalable systems, automating operations, and embedding resilience into every layer of the tech stack. This isn’t just about keeping the...
