Site Reliability

3 days ago


Leicester, United Kingdom WeAreTechWomen Full time

Overview

Home. There’s no place like it. And there’s no feeling like helping people create the joy of feeling truly at home. At Dunelm, that’s what we do. We’re the UK's number one choice for homewares because we make home life lovelier for our customers. And we’ve crafted a workplace that feels just as welcoming - where you can bring your ideas, be yourself, and feel right at home.

Remaining first-choice for savvy homeware shoppers also involves making use of advanced technology. We have embraced serverless, event-driven architecture and container orchestration, and are moving from a monolithic front end to micro front ends. You’ll join a talented and collaborative group of engineers and architects who care about quality and reliability. Learn more on our Engineering Blog (https://engineering.dunelm.com).

Site Reliability Engineering

Our SRE team is a high-trust, high-impact group of engineers who bring software engineering principles to operational reliability. We are hands-on developers and systems thinkers who build scalable, observable, and resilient platforms. We work closely with other Engineering, Data, Platform and Operations teams to help them build reliable, observable, and cost-effective systems. We lead incident response, improve deployment safety, and guide teams toward sustainable service ownership.

We process large volumes of telemetry data every day and are constantly evolving our approach to cost-efficient observability, adaptive sampling, and meaningful tracing. Observability is not a bolt-on - it is a first-class concern that shapes how we build and support systems across the business.

Ways of Working

This is a hybrid role, with time split between working from home and our London or Leicester offices. We get together as a team for two days every month, but there may be an expectation of other ad-hoc office days where necessary.
Interview Process

Step 1: Introductory video call (around 45 minutes) with the Principal Engineer and Delivery Lead to get to know each other, explain the role, and hear about your experience, goals and approach to work.

Step 2: A 90-minute technical discussion with a few members of the SRE team. You will work through scenario-based questions designed to help you highlight your knowledge, your specific approach, and where you feel improvements could be made.

We Are Committed to Inclusion

If you are not a 100% fit (but very close) to having the essential skills and experience, we would encourage you to still apply for this position. We want everyone to be as comfortable as possible, so if you need any adjustments within the interview process, please let us know as soon as possible.

What you'll be doing

We’re looking for a Site Reliability Engineer, with a strong focus on observability engineering, to help scale OpenTelemetry and drive service health across our technology stack. Your work will directly impact reliability, performance, and customer experience.

Key responsibilities

  • Engineering & Automation: Build automation and reliability tooling using well-architected, maintainable, testable and - most importantly - easy to read and understand code. Contribute to shared libraries, observability components and internal platforms.

  • Observability & OpenTelemetry: Help evolve our observability strategy across systems and services. Drive how we collect, process, sample, and surface trace and metrics data using OpenTelemetry. Focus on high-signal telemetry that enables fast diagnosis, cost efficiency, and meaningful visibility across the stack.

  • SLOs/SLIs & Service Ownership: Help teams truly own their services by helping to define and adopt meaningful SLIs and SLOs. Guide product teams in using observability data to make reliability measurable.

  • Incident Response: Participate in on-call investigations when issues arise. Drive blameless post-incident reviews, help to identify observability gaps, and recommend both mitigating actions that stem losses and permanent fixes that prevent recurrence.

  • IaC & CI/CD: Model infrastructure using tools such as Pulumi, CDK, and Terraform in AWS and other PaaS and SaaS providers. Improve pipelines and support safe deployment strategies (e.g. canary, blue-green).

  • Mentoring & Team Growth: Help support and coach other engineers. Participate in technical discussions and share knowledge through pairing, planning, and documentation.

  • Continuous Learning and Innovation: Stay ahead of emerging practices in observability, resilience, and platform engineering. Lead team proof-of-concepts and introduce new patterns or tools that improve our platform.

  • Strategic Development: Contribute to the prioritisation of the SRE roadmap. Help shape observability tooling, telemetry patterns, and platform-wide approaches to service ownership and reliability.

  • Aligning to Business Goals: Use observability insights to support product goals. Help to ensure SRE priorities align with Dunelm’s wider objectives for quality, performance, and customer experience.

What we'll look for in you

Essential Skills

  • TypeScript or similar strongly typed programming language(s)
  • Ability to write idiomatic, pragmatic, and testable code, with strong, appropriate, automated testing
  • AWS expertise, including serverless services and general networking principles
  • Knowledge of OpenTelemetry tools, specification, APIs etc.
  • Understanding of SRE principles, namely: embracing risk, service level objectives, eliminating toil, monitoring distributed systems, automation and release engineering
  • Linux system administration knowledge - able to use a command line to navigate and troubleshoot a server or container running a Linux OS
  • Configuring and using observability back-end SaaS platforms, such as Datadog, Grafana etc.
  • Infrastructure-as-Code tools, such as Pulumi and Terraform
  • Kubernetes fundamentals (deploying and monitoring workloads)
  • CI/CD pipelines (GitLab or similar) and build/test/deploy automation
  • Participation in incident response, root cause analysis and post-incident reviews
  • Strong problem-solving and investigative mindset, with high attention to detail

Desirable Skills

  • Rust or a similar compiled language (e.g. Go)
  • Instrumenting and running OpenTelemetry in production at scale
  • Distributed tracing and trace sampling
  • Cost optimisation for observability and cloud services
  • Exposure to Google Cloud Platform (GCP)
  • Deep Kubernetes observability (e.g. metrics exporters, service mesh)
  • Familiarity with challenges in the retail sector is a bonus but not expected

Behaviours and Values

At Dunelm, our shared values of Act Like Owners, Keep Listening & Learning, Long-Term Thinking, and Stronger Together serve as the foundation for our success. These values guide us in continuously improving our practices and ensure we dedicate our time to what truly matters. As a Site Reliability Engineer, you will exemplify these key behaviours:

  • Support and build trust with teammates, always assuming positive intent
  • Communicate clearly and share knowledge to build shared understanding
  • Stay curious, ask why, and always look to improve how things work
  • Embrace change, adapt quickly, and take on a variety of challenges
  • Drive innovation by looking for better ways forward and pushing for progress



  • Leicester, United Kingdom Rise Site Solutions Ltd Full time

    A construction services provider in Leicester is seeking a Labourer for an ongoing refurbishment project. The ideal candidate will be reliable and self-motivated, assisting in various tasks including removal of materials and support for trades on site. This entry-level role offers Monday to Friday work from 08:00 to 16:00 on a temporary basis. Ideal for...


  • Leicester, United Kingdom Rise Site Solutions Ltd Full time

    A leading construction firm is looking for general labourers for a long-term refurbishment project in Leicester. The ideal candidates should be reliable, self-motivated, and enthusiastic about working. Responsibilities include assisting in the strip-out stage, removing materials, and supporting trades and the site manager. This part-time position offers...


  • Leicester, United Kingdom Amazon Full time

    Amazon Operations sits at the heart of the Amazon customer experience. We look after everything from the moment a customer clicks buy, to the moment their item is delivered - from desktop to doorstep. Across Europe we have more than 50 Fulfillment Centers, hundreds of Delivery Stations, thousands of machines, and tens of thousands of employees, all working...

  • Labourer

    1 week ago


    Leicester, United Kingdom Rise Site Solutions Full time

    Our client is currently looking for general labourers for an ongoing, long-term refurbishment project in the centre of Leicester. They are looking for candidates who are reliable, self-motivated and keen to work. Tasks include helping with the strip-out stage, removing materials, filling the skip, assisting trades and helping the site...

  • Labourer (No CSCS)

    5 days ago


    Leicester, United Kingdom Rise Site Solutions Ltd Full time

    Our client is currently looking for general labourers for an ongoing long‑term refurbishment project in the centre of Leicester. Candidates should be reliable, self‑motivated and keen to work. Responsibilities include helping with the strip‑out stage, removing materials, filling...

  • Labourer (No CSCS)

    17 hours ago


    Leicester, United Kingdom Rise Site Solutions Ltd Full time

    Our client is currently looking for general labourers for an ongoing, long-term refurbishment project in the centre of Leicester. They are looking for candidates who are reliable, self-motivated and keen to work. Tasks include helping with the strip-out stage, removing materials, filling the skip, assisting trades and helping the site manager....


  • Leicester, United Kingdom Rise Site Solutions Ltd Full time

    A construction services company is seeking general labourers for a long-term refurbishment project in Leicester. Candidates should be reliable and self-motivated with a willingness to assist in various tasks such as removing materials and supporting trades and the site manager. The position offers Monday to Friday work, from 8:00 to 16:00 daily. This is a...

  • Labourer

    5 days ago


    Leicester, United Kingdom Rise Site Solutions Ltd Full time

    Our client is currently looking for general labourers for an ongoing and long‑term refurbishment project in the centre of Leicester. We seek candidates who are reliable, self‑motivated and eager to work. Responsibilities include helping with the strip‑out stage, removing materials, filling the skip, assisting trades, and supporting the site...

  • Labourer

    3 days ago


    Leicester, Leicester, United Kingdom Rise Site Solutions Ltd Full time

    Our client is currently looking for general labourers for an ongoing, long-term refurbishment project in the centre of Leicester. They are looking for candidates who are reliable, self-motivated and keen to work. Tasks include helping with the strip-out stage, removing materials, filling the skip, assisting trades and helping the site manager. Working...


  • Leicester, United Kingdom Amazon Full time

    Our Reliability Maintenance Engineering (RME) team is central to Amazon's commitment to innovation. As Amazon evolves and adapts, this team makes sure that the tools and technologies we use do as well. As a Senior RME Technician, you'll help us stay one step ahead, adopting the latest technologies and identifying new and efficient ways of...