Site Reliability Engineer

1 month ago


Nottingham, United Kingdom Thrive Learning Limited Full time

As a Site Reliability Engineer within the SRE team, you’ll be focused on monitoring and supporting the applications hosted in AWS environments for platforms and tools utilised by our customers.
The SRE team specialises in giving delivery squads visibility of the performance of their services in production and support to investigate and contain potential problems.
Unlike traditional development roles, this position won't have you building features. Instead, you'll dive deep into troubleshooting issues, implementing automation solutions, containing bugs and putting proactive measures in place to uphold our system's integrity and performance.
You’ll have freedom to help research and recommend solutions for hosting applications at scale. You’ll be fundamental in incident response, troubleshooting and containing issues.

Key responsibilities

Debug Node.js applications and contribute to their optimisation and performance tuning.
Configure and manage environments and services on AWS.
Enhance tools and processes for monitoring scalable applications on AWS.
Maintain high availability through proactive measures.
Troubleshoot and resolve complex technical issues.
Document Standard Operating Procedures (SOPs).
Automate SOPs and runbooks.
Respond to issues outside working hours as per the on-call rota.

Basic Qualifications

Experience implementing environments for web-based microservices.
Experience of supporting MongoDB based web applications.
Experience of engineering, architecting, or supporting AWS solutions.
Familiarity with cloud virtualisation tools such as ECS and/or Docker containers.
Experience working with automated deployment systems (e.g. CloudFormation, CodeBuild).
Familiarity with a monitoring tool, e.g. New Relic, Datadog, Prometheus or Grafana.
Experience in automation of workloads using a scripting language like Python or JavaScript.
Strong problem-solving skills and the ability to troubleshoot complex issues.
Good understanding of incident response best practices, post-incident reviews, and continuous improvement.
Ability and willingness to proactively improve ways of working and processes.
Desire to continually grow, develop and improve.
Experience debugging Node.js applications.

Useful Skills

Understanding of REST, GraphQL and asynchronous messaging.
Experience of using Git for version control.
Experience of Continuous Integration and Deployment is advantageous.
Familiarity with core SRE principles encompassing areas such as monitoring, alerting, error budgets, fault analysis, and other prevalent concepts in the realm of reliability engineering.
Excellent written and verbal communication skills.
Familiarity with IT compliance and risk management requirements (e.g. security, privacy, GDPR).

What we will offer you

It’s no secret that fast-growing SaaS businesses are one of the most exciting places to start your career, and Thrive is no exception. Our team thinks differently to other businesses and we offer our employees something different, too. We’re all about trust, autonomy and doing the right thing, and we’re proud of the benefits we offer to our team. You’ll receive:

Unlimited annual holiday. You did read that right.
Flexible working hours
Modern and lively offices with a fantastic culture
Unbeatable THRIVE social events.
Health Plan
Employee Assistance Program
Referral Scheme



  • Nottingham, United Kingdom Burns Sheehan Full time

    Senior Site Reliability Engineer | £100,000 Package | Fully Remote (UK) Are you ready to be part of the forefront of financial technology innovation? Burns Sheehan have partnered up with a trailblazer within the FinTech/RegTech space and are on the lookout for a dynamic Senior Site Reliability Engineer to join their innovative ranks. About Them: They are...


  • Nottingham, United Kingdom Scalers Full time

    SaaS payments (Series D) | Remote (UK) | 3 stage process | AWS, Terraform, Docker, Go, Kubernetes, Grafana. Scalers is helping a leading, global SaaS payments company to hire a Site Reliability Engineer to join their team on a permanent basis. The SRE team is a driving force of improving and automating how our client's Product teams develop software at...


  • Nottingham, United Kingdom Capital One Financial Corporation Full time

    Principal Site Reliability Engineer - Services. Location: Nottingham, Eng. Time type: Full time. Posted 2 days ago. Job requisition ID: R181801. Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. About this role ...


  • Nottingham, United Kingdom Capital One Financial Corporation Full time

    Senior Site Reliability Engineer - Front End. Location: Nottingham, Eng. Time type: Full time. Posted 2 days ago. Job requisition ID: R181804. Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Capital One's mission is...


  • Nottingham, United Kingdom Capital One (Europe) plc Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Senior Site Reliability Engineer - Front End. Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an effective enabler of...


  • Nottingham, United Kingdom Capital One (Europe) plc Full time

    Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an effective enabler of Capital One’s ambitions. We are keen to add a Principal Site Reliability Engineer to our Nottingham based team whose...


  • Nottingham, United Kingdom Capital One Financial Corporation Full time

    Principal Site Reliability Engineer - Front End. Location: Nottingham, Eng. Time type: Full time. Posted 2 days ago. Job requisition ID: R181803. Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Capital One's...


  • Nottingham, United Kingdom Capital One (Europe) plc Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Principal Site Reliability Engineer - Services. About this role: Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be...


  • Nottingham, United Kingdom Capital One Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Principal Site Reliability Engineer - Front End. Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an effective enabler...


  • Nottingham, United Kingdom Capital One Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Senior Site Reliability Engineer - Front End. Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an effective enabler of...


  • Nottingham, United Kingdom Capital One (Europe) plc Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Principal Site Reliability Engineer - Front End. Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an effective...


  • Nottingham, United Kingdom Capital One (Europe) plc Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Senior Site Reliability Engineer - Front End. Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an effective...


  • Nottingham, United Kingdom Capital One Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Principal Site Reliability Engineer - Back End. About this role: Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an...

  • Reliability Engineer

    1 month ago


    Nottingham, United Kingdom Musk Process Services Full time

    Musk Process Services is part of the Edwin James Group, a leading independently owned property and infrastructure support services company delivering integrated building and facilities management services. Our team of over 1000 people works for and with a diverse Blue Chip client base across diverse sectors including Manufacturing, Food, Government and...


  • Nottingham, United Kingdom Capital One (Europe) plc Full time

    Nottingham Trent House (95002), United Kingdom, Nottingham, Nottinghamshire. Principal Site Reliability Engineer - Back End. About this role: Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be...

  • Reliability Engineer

    3 weeks ago


    Nottingham, United Kingdom Musk Process Services Full time

    Musk Process Services is part of the Edwin James Group, a leading independently owned property and infrastructure support services company delivering integrated building and facilities management services. Our team of over 1000 people works for and with a diverse Blue Chip client base across diverse sectors including Manufacturing, Food, Government and...

  • Reliability Engineer

    3 weeks ago


    Nottingham, United Kingdom Musk Process Services Full time

    Job Description: Musk Process Services is part of the Edwin James Group, a leading independently owned property and infrastructure support services company delivering integrated building and facilities management services. Our team of over 1000 people works for and with a diverse Blue Chip client base across diverse sectors including Manufacturing, Food,...