Site Reliability Engineering

4 weeks ago


London, United Kingdom Apple Inc. Full time
Site Reliability Engineering (SRE) Manager, iCloud

People at Apple don’t just build products — they craft experiences our customers love and depend on. Apple Services Engineering (ASE) builds and supports the systems that make many of these daily experiences possible. If you’ve used Apple products, you’ve likely interacted with us. iCloud Services SRE teams are responsible for the systems and services that directly support those customers and their experiences. We focus on availability and automation of key services that run iCloud every minute of every day all around the world.

Key Qualifications

  • Experience with large scale distributed systems, especially ML infrastructure and services including LLMs, Generative AI, and transformers
  • Demonstrable success leading engineering teams - ideally SRE or Production Engineering
  • Knowledge of core operating system principles, networking fundamentals, and systems management
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts
  • Experience with hiring and leading engineers
  • Professional experience in an engineering leadership position
Description

We're looking for a hardworking and passionate person to join this amazing team. You will be an accomplished builder and leader of teams looking to tackle your next challenge. You know SRE and you know what it will take to run services at Apple scale with a high degree of operational perfection. This role will position you to help shape the future of how we build and run our services on a global scale. You will have the technical skills to go deep and retain the ability to focus on higher-level business and product goals. We hire high quality leaders and engineers with a diverse set of experiences and skill sets for positions at Apple. Our customers count on us to provide extraordinary availability, scalability, and security for services. If you’d like to positively influence millions of customers’ experience of Apple, this is the job for you.

As a Site Reliability Engineering Manager, responsibilities include:

  • Lead SRE teams responsible for reliability and performance of on-prem and cloud-based services
  • Lead and grow the engineers on your team
  • Manage staging and production environments with the goal of maximizing availability
  • Promote observability of systems for monitoring, alerting, and metrics reporting
  • Advocate best practices of reliability engineering

Education & Experience

Bachelor's or Master's degree in computer science or equivalent field.


  • London, United Kingdom Capgemini Engineering Full time

    At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and...


  • London, United Kingdom Reed Full time

    **SRE | SITE RELIABILITY ENGINEER | DEVOPS | AWS | AMAZON WEB SERVICES | CLOUDFORMATION | KINESIS | CODEPIPELINE | FARGATE | BATCH | PYTHON | GOLANG | DJANGO | REACT | UK | FULLY REMOTE** **Site Reliability Engineer - £80k** A renowned SEO business is looking for a Senior Site Reliability Engineer to build and improve a rapidly evolving infrastructure...


  • London, United Kingdom Lorien Full time

    Site Reliability Engineer Location: London (hybrid remote working) **Salary**: Up to £100,000 + Very Generous Benefits Package One of the fastest growing ecommerce organisations requires a Site Reliability Engineer to help be the glue between the company's Dev, QA and Product teams - enabling the smooth Continuous Build and Integration of new instances of...


  • London, United Kingdom Lorien Full time

    Site Reliability Engineer Location: London (hybrid remote working) **Salary**: Up to £100,000 + Very Generous Benefits Package One of the fastest growing software development organisations requires a Site Reliability Engineer to help be the glue between the company's Dev, QA and Product teams - enabling the smooth Continuous Build and Integration of new...


  • London, United Kingdom Explore Group Full time

    **Lead Site reliability engineer - Fully remote - No sponsorship offered** Role: Site Reliability engineer Location: Fully remote **Salary**: Up to £115,000 **Responsibilities**: - Design, build, and maintain scalable and highly available infrastructure on AWS - Implement automation and continuous integration/delivery pipelines using tools such as...


  • London, Greater London, United Kingdom Austin Werner Ltd Full time

    Site Reliability Engineer - Global Media/Publishing business. We are seeking a Site Reliability Engineer for a globally leading Publishing business based in London. My client has built their internal IT environment from the ground up, so it is bespoke to the business, with cutting edge and innovative tech where the systems are responsible for all infrastructure...


  • London, Greater London, United Kingdom Bayside Solutions Full time £91,400 - £108,000

    Site Reliability Engineer Contract Location: London, England - Hybrid Role We seek a Site Reliability Engineer to join our team and play a crucial role in ensuring our applications and services' reliability, availability, and performance. This role requires a strong background in application support, monitoring, and cloud technologies, focusing on AWS,...


  • London, United Kingdom Understanding Recruitment Full time

    Site Reliability Engineer I am seeking a Site Reliability Engineer for one of the world's fastest growing social media platforms, with over 900 million daily users. The SRE group come from diverse technical backgrounds, Reliability,...


  • London, United Kingdom Understanding Recruitment Full time

    Site Reliability Engineer I am seeking a Site Reliability Engineer for one of the world's fastest growing social media platforms, with over 900 million daily users. The SRE group come from diverse technical backgrounds, Reliability, Software Engineering and Security Engineering, and have a broad remit ensuring high availability and performance, and...


  • London, United Kingdom Understanding Recruitment Full time

    Site Reliability Engineer I am seeking a Site Reliability Engineer for one of the world's fastest growing social media platforms, with over 900 million daily users. The SRE group come from diverse technical backgrounds, Reliability, Software Engineering and Security Engineering, and have a broad remit ensuring high availability and performance, and currently...


  • London, Greater London, United Kingdom Workingmums Full time £75,000

    Site Reliability Engineer. Remote / Glasgow. Salary to GBP75,000 + Bonus. Immediate Start. Fantastic new opportunity to the market to join our Glasgow-based tech-for-good client, specialising in digital solutions and who have a huge global reach. Due to increased success in their space and demand for their services, they are now recruiting for a Site Reliability...


  • London, Greater London, United Kingdom Explore Group Full time

    Lead Site reliability engineer - Fully remote - No sponsorship offered Role: Site Reliability engineer Location: Fully remote Salary: Up to £115,000 Responsibilities: - Design, build, and maintain scalable and highly available infrastructure on AWS - Implement automation and continuous integration/delivery pipelines using tools such as Jenkins, Terraform, and...


  • London, United Kingdom N Consulting Ltd Full time

    Job title: Site Reliability Engineer Work Mode: 3 days office Mandatory Location: 5 Broadgate, London EC2M 2QS, United Kingdom Contract Duration: 12 months We’re looking for a Site Reliability Engineer to: · determine the reliability of our digital products, technology services, and the infrastructure that underpins them · minimize the risk and impact of...


  • London, United Kingdom CONTECHS Full time

    Site Reliability Engineer (IT Infrastructure) 10-month initial contract Onsite (Manchester) £33ph (Inside IR35) About the company: I am currently recruiting on behalf of a Luxury Automotive OEM, based in Manchester, seeking Site Reliability Engineers to join their team. Job Description: As Site Reliability Engineer, your main responsibilities are: Software...


  • London, Greater London, United Kingdom CONTECHS Full time

    Site Reliability Engineer (IT Infrastructure) 10-month initial contract Onsite (Manchester) £33ph (Inside IR35) About the company: I am currently recruiting on behalf of a Luxury Automotive OEM, based in Manchester, seeking Site Reliability Engineers to join their team. Job Description: As Site Reliability Engineer, your main responsibilities are: Software...