Observability Site Reliability Engineer

1 day ago


London, United Kingdom | Apple Inc. | Full time

People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it.

The Apple Service Engineering (ASE) team builds and provides systems and infrastructure that fuel Apple’s services (such as iCloud, iTunes, Siri, and Maps). We are the foundation on which Apple’s software developers build the products that our customers love. We are looking for passionate and talented Site Reliability Engineers to continue our focus on providing our customers the highest quality Apple Services experience. Our services have to scale globally, stay highly available, and “just work.” If you love designing, engineering, and running systems and infrastructure that will help millions of customers, then this is the place for you.

The Observability SRE organization is specifically tasked with enabling other teams to better understand their infrastructure and services by providing world-class observability capabilities.

Description

Apple Services Engineering infrastructure is BIG. Operating at our scale, across multiple geographically dispersed data centers and serving hundreds of millions of users, presents unique challenges. As an SRE at Apple, you'll need to solve these problems using data, teamwork, and your own expertise. SREs at Apple own the full infrastructure stack: from device driver performance debugging to content delivery network traffic management, our responsibilities are both broad and deep. ASE runs the majority of its systems on Linux. We run a mix of open source, vendor-licensed, and internally developed tools to perform functions such as system configuration management, provisioning, software deployment, logging, and monitoring. You'll learn these tools and have opportunities to improve them. Our team is collaborative; we work closely with the development teams we support to deliver the best results for Apple. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.

Minimum Qualifications

  • Strong understanding of the Linux operating system
  • Good understanding of the TCP/IP suite of networking protocols
  • Ability to design, author, and release code in languages like Go or Python
  • Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Puppet, Chef, Ansible)
  • Familiarity with microservices architecture and container orchestration with Kubernetes

Preferred Qualifications

  • Excellent troubleshooting and problem-solving skills
  • Bare metal management experience
  • Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks
  • Acute drive to automate manual operations and to improve them through repeated iteration
  • Experience with scale testing, disaster recovery, and capacity planning
  • Strong sense of ownership and integrity demonstrated through clear communication and collaboration
  • Experience in managing and scaling distributed systems in a public, private, or hybrid cloud environment
  • Experience with the Prometheus ecosystem
  • Good understanding of infrastructure observability principles

Education & Experience

BS/MS in Computer Science or equivalent (5+ years of software development or production operations experience in a large-scale environment)

