Reliability Engineering Expert for Cloud Services

2 weeks ago


London, Greater London, United Kingdom Apple Inc. Full time

At Apple, we're always looking for talented individuals who can help us create innovative solutions for our customers. As a Senior Monitoring Site Reliability Engineer, you'll play a critical role in ensuring the reliability, scalability, and security of our production environments.

About the Team

Our team is highly collaborative, working closely with partner teams to deliver the best results for Apple. For each engineering challenge we face, we aim for the best solution while balancing the need to deliver efficiently. Good ideas are valued and rewarded.

Key Responsibilities

  1. Operate, monitor, and troubleshoot all aspects of our production and non-production environments.
  2. Pioneer and implement next-generation telemetry systems for AIS services.
  3. Establish alert-handling procedures and runbooks, and collaborate with our global security team.
  4. Automate deployment and orchestration of services into the cloud environment.
  5. Actively participate in capacity planning and disaster recovery exercises.
  6. Interact with and support partner teams across the enterprise.

Requirements

  • Professional experience in Site Reliability Engineering, DevOps, or a related field.
  • Experience working with cloud compute environments like OpenStack, AWS, GCP, or Azure.
  • Experience with infrastructure as code (IaC), configuration management, CI/CD, and automation, e.g., Terraform, Pulumi, CloudFormation, Ansible, Chef, Puppet, Jenkins.
  • Strong programming skills: Python and/or Go.

Preferred Qualifications

  • Proficiency in implementing and coordinating telemetry using monitoring and observability tools like Splunk, Grafana, Prometheus, or similar.
  • Extensive experience administering and troubleshooting Linux systems, including standard Linux utilities.
  • Troubleshooting and debugging experience.
  • Experience with shell scripting and system administration.
  • Experience measuring, analyzing, and optimizing performance.
  • Experience operating with Scrum/Agile development methodologies.
  • Strong understanding of concurrency, parallelism, and distributed system concepts.
  • Passion for high-quality code, tests, documentation, and production services.
  • Participation in an on-call rotation.
  • Experience building and operating container orchestration systems (Docker, Kubernetes, Vagrant) and microservices.
  • Bachelor's Degree in Computer Science or equivalent experience.

Estimated Salary: $145,000 - $195,000 per year



  • London, Greater London, United Kingdom Oracle Full time

    Key Responsibilities: As a Reliability Expert for Cloud Services, you will be responsible for automating complex problems related to infrastructure cloud services and building automation to prevent problem recurrence. You will also need to ensure the seamless integration of all components of our Recovery Cloud Services and continuously push to define and...


  • London, Greater London, United Kingdom Spectrum IT Recruitment Full time £75,000 - £85,000

    We are seeking a Cloud Reliability and Scalability Expert to join our team at Spectrum IT Recruitment. As a key member of the engineering team, you will be responsible for streamlining software delivery pipelines, enhancing reliability, performance, and scalability of systems. This role is ideal for those with a passion for cloud computing and a desire to...


  • London, Greater London, United Kingdom X4 Engineering Full time £900

    Job Title: Cloud Architect Expert. X4 Engineering is a global leader in the energy sector, driving innovation at the forefront of the energy transition. We're seeking an expert to join our team on a 12-month contract (with likely extension) for this exciting opportunity. The successful candidate will design and deliver all aspects of solution architecture...


  • London, Greater London, United Kingdom Arrows Full time

    Arrows seeks a Cloud Reliability Engineer with expertise in modern DevOps practices. The ideal candidate will have extensive experience with Kubernetes, Azure Container Apps, and Azure networking. To ensure the reliability and efficiency of our systems, we require a highly skilled engineer with strong coding and scripting skills. We are looking for an expert who...


  • London, Greater London, United Kingdom Cloud Decisions Full time £65,000

    Company Overview: A leading global SaaS provider, Cloud Decisions, is seeking a highly skilled Cloud Solutions Architect to design and implement secure, scalable cloud solutions on Microsoft Azure. This purpose-led software provider empowers learners, educators, and caregivers around the globe with accessible, interactive tools. Salary: £80,000 - £110,000 per...


  • London, Greater London, United Kingdom Wipro Full time

    Job Title: Platform and Service Reliability Expert. Company Overview: Wipro Limited is a global technology leader that provides innovative solutions to its clients' complex digital transformation needs. We strive to create a diverse and inclusive workplace culture. Salary: $125,000 per year. Job Description: We are seeking an experienced Platform and Service...


  • London, Greater London, United Kingdom Alibaba Cloud Full time

    About the Role: The Cloud Solutions Expert will play a key part in driving business growth and customer success for Alibaba Cloud. This role involves developing solutions, promoting public cloud services, and empowering sales teams and partners. Key Responsibilities: Develop targeted solutions to meet customer needs, including product selection strategy, POC...


  • London, Greater London, United Kingdom Qurated Network Full time

    The Qurated Network is looking for an exceptional Cloud Reliability Engineering Manager to lead our efforts in ensuring the reliability, scalability, and security of our cloud-based payments platform. In this critical role, you will be responsible for designing, implementing, and maintaining robust cloud architectures that meet the needs of our...


  • London, Greater London, United Kingdom Kroo Bank Ltd Full time

    We are a cutting-edge fintech company, Kroo Bank Ltd, on a mission to revolutionize the banking industry. Our vision is to create a platform that makes financial services accessible to all. We believe in the importance of reliability and efficiency in our operations, which is why we are seeking an experienced Cloud Reliability Engineer to join our team. Job...


  • London, Greater London, United Kingdom Quorso Full time

    About the Role: We are seeking a highly skilled Cloud Reliability Engineer to join our Engineering team at Quorso. As a key member of the team, you will focus on improving the stability and security aspects of our technical stack. Your expertise in monitoring and logging integrations, alerting capabilities, and performance-related issues will be instrumental...

  • Cloud Engineer

    1 week ago


    London, Greater London, United Kingdom Cloud Decisions Full time

    Experience the art of work-life balance as a Cloud Engineer at Cloud Decisions. We offer a hybrid working position with 4 days on, 4 days off, allowing you to recharge and focus on delivering exceptional results. This role involves investigating device outages, carrying out remediation, and reviewing failed Windows updates. You will also diagnose performance...

  • Hybrid Cloud Engineer

    2 weeks ago


    London, Greater London, United Kingdom UNITING CLOUD Limited Full time

    Job Summary: UNITING CLOUD Limited is seeking a Cloud and Server Infrastructure Engineer to support our existing enterprise business with their on-prem infrastructure and be involved in the migration to a multi-cloud environment. About the Role: We are looking for an expert in Windows Server, O365 and VMware with a desire to learn AWS and Azure. You will have...


  • London, Greater London, United Kingdom Regal Cloud Full time £80,000

    Regal Cloud is seeking a highly experienced Cloud Architect to lead our cloud engineering team. We are looking for an expert in cloud architecture with a strong track record of delivering complex cloud-based solutions. As a key member of our team, you will be responsible for designing and implementing scalable, secure, and efficient cloud architectures that...


  • London, Greater London, United Kingdom Apple Inc. Full time

    Overview: At Apple Inc., we're looking for a highly skilled Senior Monitoring Site Reliability Engineer to join our team. The ideal candidate will have a strong background in reliability engineering, software development, privacy, and information security, with a desire to work in hyper-scale environments. Key Responsibilities: Monitor and operate all aspects of...


  • London, Greater London, United Kingdom Apple Inc. Full time

    We are seeking an experienced Reliability Engineer to join our team at Apple Inc. in a challenging role that combines engineering, software development, and problem-solving skills. About the Role: This is an exciting opportunity to work on high-availability systems, scalable architecture, and monitoring tools to ensure seamless operations of our cloud...


  • London, Greater London, United Kingdom Cloud Decisions Full time £65,000

    Lead Engineer Opportunity: We are seeking an experienced Azure Cloud Professional to join our team as a Lead Engineer. As a key member of our Cloud Decisions team, you will play a critical role in designing and implementing secure, scalable cloud solutions on Microsoft Azure. Key Responsibilities: Design and deploy secure, high-performance solutions on Azure....


  • London, Greater London, United Kingdom Cloud Decisions Full time £65,000

    Transform Education with Cloud Solutions: At Cloud Decisions, we're committed to empowering learners, educators, and caregivers through inclusive education. As a Cloud Solutions Architect, you'll play a critical role in designing and implementing secure, scalable cloud solutions on Microsoft Azure. Your Key Responsibilities: Designing and deploying...

  • IT Support Engineer

    2 weeks ago


    London, Greater London, United Kingdom Cloud Decisions Full time £32,000

    Transform Your Career with Cloud Decisions: We are seeking an experienced Cloud Support Engineer to join our team and help us deliver exceptional support services to our clients. As a Cloud Support Engineer, you will be responsible for managing the configuration of end user devices, resolving escalated 1st line support calls and service requests, and occasional...


  • London, Greater London, United Kingdom Cloud Decisions Full time £65,000

    Job Description: We are seeking a highly skilled Cloud Solutions Architect to design and implement secure, scalable cloud solutions on Microsoft Azure. This is an exciting opportunity to join our team at Cloud Decisions as a Senior Cloud Professional, working on a leading global SaaS provider's cloud platform. As part of this purpose-led software provider,...

  • Cloud Engineer

    5 days ago


    London, Greater London, United Kingdom Revolution Technology Full time £90,000

    Job Title: Cloud Engineer - AWS Expert. We are Revolution Technology, a leading FinTech company, and we are seeking a skilled Cloud Engineer to join our team. As a Senior DevOps Engineer, you will be responsible for managing our AWS infrastructure, ensuring high performance, security, and reliability. We are looking for a candidate with 5+ years of experience...