Site Reliability Engineer

4 hours ago


Greater London, United Kingdom J Bandy Consulting Full time

We are hiring for a next-generation telecoms software company who are seeking a Network Autonomy Engineer to join their expanding team.

Primary Function of the Position

Reporting to the Site Reliability Engineer Team Lead, the Site Reliability Engineer will be responsible for ensuring the reliability, scalability and performance of our systems.

Responsibilities

  • Develop the Site Reliability Engineering culture across the team by applying best practices, approaches and code.
  • Apply automation, and propose and implement software, for any tasks or parts of the system where it would deliver benefit.
  • Monitor application performance, identifying and implementing improvements to application performance and stability.
  • Collaborate on the design and implementation of the desired pipelines and processes for deployment to the production environment.
  • Work closely with the Platform and Software domains to ensure continuous improvement of performance and stability whilst adhering to standards.
  • Undertake ad-hoc projects and other activities as required.

Key Accountabilities and Activities

  • Drive evolution of the DevOps / GitOps toolchain, promoting improvements to streamline the software delivery process and showing improvements through metrics.
  • Accountable for halting or stopping a project / product if the solution is not technically acceptable.
  • Responsible for producing and maintaining documentation relating to application design, integration processes, testing procedures, and deployment approach, as well as collaborating with teams to create operational runbooks and playbooks.
  • Integration with Domains, including:
      • Collaborating with Domains to plan, design, test and maintain the application.
      • Designing patterns for any component or structure under SRE responsibility.
      • Implementing components such as Monitoring and Logging.
      • Managing the runbook preparations of Domains.
  • Liaise with and support other teams on work items, including:
      • Developing, refining, and tuning integrations between application elements.
      • Collaborating with stakeholders in the Enterprise, Solution and Development teams to produce and maintain standards and guidelines.
      • Knowledge sharing and education of team members across the organisation.
      • Acting as first point of contact for the Problem management and Process Outcomes team.
  • Build and guide successful SRE efforts, including:
      • Analysing and resolving technical and application issues.
      • Researching and evaluating software products.
      • Evaluating risks and defects, analysing specifications, and customising applications for specific customer needs.
      • Identifying complex and manual processes and working to simplify and automate them.
      • Continuously reviewing capabilities and roles critical to evolving DevOps and quality assurance practices, and being responsible for the acquisition, development, and maturity of these.
      • Minimising outages by continuous improvement.
  • Undertake ad-hoc projects and other activities as required.

Experience and Skills

Essential

  • Experience and demonstrable knowledge of SRE best practices
  • Expert in Git and GitOps
  • Expert in logging and monitoring solutions (Prometheus, Grafana etc.)
  • Demonstrable knowledge of Cloud
  • Expert knowledge of Kubernetes
  • Proficient ability to communicate in English (Written and Verbal)
  • Understanding of non-functional testing
  • Significant DevOps experience

Desirable

  • Proven ability to work independently and collaboratively in a fast-paced technical environment
  • Demonstrable knowledge of the telecommunications industry and technologies
  • Proven experience and ability to provide support to direct reports
  • Golang skills and experience



  • Greater London, United Kingdom Arrows Full time

    Site Reliability Engineer | Contract | London | Up to £600/day Inside IR35 | Hybrid - Up to £650 per day (Inside IR35) - 2 days per week onsite in London I'm working with a leading media and technology...


  • Greater London, United Kingdom Trades Workforce Solutions Full time

    Site Reliability Engineer (SC Cleared) Duration: 12 Months Rate: £675 per day Location: London or Manchester & remote (hybrid working) IR35 Status: Inside Start: ASAP Role Overview: A Site Reliability Engineer (SC Cleared) is required for our government department to be part of a multidisciplinary team developing and supporting the client's data hub which...


  • Greater London, United Kingdom Charles Simon Associates Ltd Full time

    Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote THIS IS AN AZURE FOCUSED ROLE, IF YOU APPLY AND DO NOT WORK EITHER SOLELY OR MAINLY ON AZURE YOU WILL NOT BE CONSIDERED. Location: Remote (occasional travel to Nottinghamshire HQ) Salary: Up to £95,000 per annum...


  • Greater London, United Kingdom TP ICAP Full time

    The TP ICAP Group is a world leading provider of market infrastructure. Our purpose is to provide clients with access to global financial and commodities markets, improving price discovery, liquidity, and distribution of data, through responsible and innovative solutions. Through our people and...


  • Greater London, United Kingdom Stratospherec Ltd Full time

    Overview Senior DevOps Engineer / Senior Site Reliability Engineer Fully Remote working for candidates based in the UK – Salary to £90k + Benefits We are looking for a Senior DevOps Engineer that has strong C# code knowledge combined with strong knowledge of DevOps tools like Kubernetes (EKS or AKS) and Azure or AWS Cloud platforms. We are looking for a...


  • City of London, Greater London, United Kingdom Amelco Limited Full time

    Role: Site Reliability Engineer Type: Full-time permanent role Location: Hybrid/ Shoreditch, London 3 days per week About Us Amelco Ltd are a leading gaming and gambling solution software provider with a strong presence in the USA, UK, and Europe. Through partnerships with global gaming companies, we build cutting-edge technical platforms across sportsbooks,...


  • Greater London, United Kingdom Group Full time

    Site Reliability Engineer Optum is a global organisation that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by...


  • Farringdon, Greater London, United Kingdom Charles Simon Associates Ltd Full time

    Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote THIS IS AN AZURE FOCUSED ROLE, IF YOU APPLY AND DO NOT WORK EITHER SOLELY OR MAINLY ON AZURE YOU WILL NOT BE CONSIDERED. Location: Remote (occasional travel to Nottinghamshire HQ) Salary: Up to £95,000 per annum...


  • London, United Kingdom Prism Digital Full time

    **Site Reliability Engineer | DevOps, Kubernetes | Prestigious Retailer** A prestigious fashion retailer, well known and respected within the industry, is looking to expand their engineering team with a talented Site Reliability Engineer. Our client, a well-funded scale-up, has offices across the globe, and they are entering an exciting and innovative period...


  • London, United Kingdom Searchability (UK) Ltd Full time

    KEY POINTS * Up to £70,000 salary * Hybrid working with three days a week onsite in Greater Manchester * Modern SRE environment with cloud-native tooling (AWS, Kubernetes, Terraform) * High-availability digital platforms and performance-critical workloads ABOUT THE CLIENT We're supporting a well-established UK organisation recognised for operating large-scale,...