Site Reliability Engineer

2 weeks ago


United Kingdom · Travelodge Hotels Limited · Full time

Job Description

Travelodge’s mission is to be the UK’s favourite hotel for value.

With more than a million visits every week to our website and more than eighteen million customers a year, the use of technology is critical to both our customer offer and our low-cost operations.

The mission within IT is to ensure innovative technology drives the business forward, through the development of the company’s customer-facing and internal technology systems.

The job in a nutshell

As a Site Reliability Engineer in our IT Digital and Data Operations team, you will be passionate about maintaining and improving software that solves problems. The role will form a bridge between development and operations by applying a software engineering mindset to system administration topics. Your time will be split between operations duties and enhancing systems, software, monitoring and processes that help increase site reliability, availability and performance.

Working closely with internal IT teams, business stakeholders and third-party suppliers, your primary responsibility will be to ensure system performance is optimised, with an eye toward pushing our capabilities forward by innovating to continually improve our technical environments.

What you’ll be doing

● Being an advocate for DevOps methodologies and ways of working with the ability to apply them to existing and new integrations across our applications within our web stack.

● Collaborating with developers at the design stage, to ensure services released to production are fit for purpose and deliver near zero defects.

● Providing architectural governance from an operations perspective from the planning stage of changes and releases, including creating documentation and evaluating architectural decisions.

● Administering the production and pre-production environments, including CMSs, using monitoring tools and application functionality and availability checks.

● Making proficient use of deployment technologies such as Jenkins.

● Troubleshooting and administering the Linux OS (preferably RHEL) and providing log analysis.

● Building software and systems to manage platform infrastructure and applications and resolve vulnerabilities.

● Improving reliability, quality, and time-to-market of our suite of software applications.

● Measuring and optimising system performance, with an eye toward pushing our capabilities forward, getting ahead of growth and capacity needs, and innovating to continually improve.

● Providing 2nd / 3rd level operational and engineering support for multiple software applications and systems.

● Gathering and analysing metrics from both operating systems and applications to assist in performance tuning and fault finding.

● Partnering with the digital development teams and Product Owners to improve services through rigorous testing and release procedures.

● Proactively deploying automation for regularly repeated tasks and identifying new automation opportunities.

● Actively engaging with future digital roadmaps, to innovate and ensure no legacy tech debt builds up.

● Supporting the continuous improvement of internal IT processes and ways of working.

● Working on day-to-day incidents with the digital operations team, championing the resolution of tickets with pace and tenacity, and using preventative maintenance and proactive techniques to drive down incident tickets.

● Running major incident calls and assisting with the resolution of major incidents within the platforms, following through to root cause analysis and remediation.

● Staying up-to-date with the IT industry methodologies and emerging trends.

● Leading on establishing and implementing shifts in culture to support adoption of new processes and ways of working across teams.

● Ensuring technology and processes are running optimally and this is reflected in the availability of all systems and tools.

● Reducing or even eliminating toil in order to maximise the time spent on engineering and innovation.

● Providing direct support and acting as a second in command to the IT Senior Digital Platform Manager, covering absences and leave, to manage and support the Digital Operations teams.

What we’ll expect from you

To succeed in this role, you will be a ‘hands-on’ Engineer with a proven track record of improving and maintaining enterprise-scale ecommerce/digital systems and associated applications on-premises and in the cloud, with experience of multiple digital implementations. You will have broad technical knowledge and be comfortable working with pace and agility to ensure the required outcomes are achieved.

You must have a strong understanding of systems integration and application lifecycle management, but we are not expecting expert knowledge of absolutely every technology. It is important that you can articulate what you know well and recognise when further understanding is required. You will be an active self-starter who can gather information and make appropriate decisions in a timely and organised manner.

This role will require you to participate in an on-call rota and to manage the team out of hours during web releases and platform upgrades.

The ability to work with a variety of teams and technologies is required. As our Digital SRE you will have a good understanding of IT operations, support and software engineering in order to be successful.

Essential

● Professional qualifications or demonstrable training and practical experience that relate to the function of an SRE

● Proficiency with Red Hat Linux distributions

● Administration experience of at least one cloud platform (AWS or Azure)

● Proven experience of working with CI/CD pipelines

● Setup and monitoring of end-to-end systems using enterprise monitoring and reporting tools, such as New Relic, Splunk, Pingdom and Zabbix

● Experience with distributed storage technologies like NFS, HDFS, Ceph, S3 as well as dynamic resource management frameworks (Kubernetes)

● A proactive approach to spotting and resolving problems, areas for improvement, and performance bottlenecks

● Working knowledge of Bash and/or KSH, Nginx and PHP.

● Knowledge of relational and non-relational databases and experience implementing best practice approaches.

● Experience with the following technologies: Akamai WAF, GitHub

Desirable

● Understanding of Python, along with one or more other high-level languages such as Java, Ruby or JavaScript

● Understanding of Apache and/or Tomcat

● BSc and/or MSc Computer Science/Business Computing or equivalent experience

● Experience as an SRE within retail/hospitality or similar

● Certifications in cloud computing with either AWS or Azure

Travelodge Traits

At Travelodge, we believe that behaviours are just as important as the activities you carry out. The ones we look for in every colleague are:

I care about people

● I treat everyone in a way I would like to be treated

● I am easy to work with

● I have a can-do attitude

● I care about the impact my work has on others

I pay attention to detail

● I do the little things that make a difference to our customers

● I work to brand standards

● I treat Travelodge time, equipment and stock as if it were my own

I drive for results

● I hit targets in my role and work at the right pace

● I take ownership of problems and try to fix them fast

● I look for ways to avoid future problems

● I look for ways to promote Travelodge

What you can expect from us

Culture

At Travelodge, we are warm, straightforward and optimistic. We have a big footprint in the UK, but still a small company feel, and you can expect quality and value to be built into everything we do. You’ll have the support of a close network of colleagues and managers, and every day is different here. We want you to bring your personality to work and we love our diversity.

Reward and recognition

It’s not just our customers we want to wake up with a smile on their face. As well as a competitive salary, being part of our hotel support centre means great holiday entitlements, pension contribution deals, being part of our bonus scheme, and a Thanks Card giving generous room and food discounts as well as friends and family rates.

Career and development

We want you to develop further with us at Travelodge and we’ll provide you with a development plan to help you reach your goals. You can expect to have a full induction and training relevant to your role. We advertise all our vacancies internally, so you’ll have the opportunity to really develop your career with Travelodge.

