Observability Site Reliability Engineer
4 weeks ago
People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it.
The Apple Service Engineering (ASE) team builds and provides systems and infrastructure that fuel Apple’s services (such as iCloud, iTunes, Siri, and Maps). We are the foundation on which Apple’s software developers build the products that our customers love. We are looking for passionate and talented Site Reliability Engineers to continue our focus on providing our customers the highest-quality Apple Services experience. Our services have to scale globally, stay highly available, and “just work.” If you love designing, engineering, and running systems and infrastructure that will help millions of customers, then this is the place for you.
The Observability SRE organization is specifically tasked with enabling other teams to better understand their infrastructure and services, providing world-class observability capabilities.
Key Qualifications
- Strong sense of ownership and integrity demonstrated through clear communication and collaboration
- Experience in managing and scaling distributed systems in a public, private, or hybrid cloud environment
- Experience with the Prometheus ecosystem
- The ability to design, author, and release code in languages like Go or Python
- Acute drive to automate manual operations and to improve them through repeated iteration
- Understanding of the Linux Operating System, standard networking protocols, and components
- Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Puppet, Chef, Ansible, and Spinnaker)
- Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks
- Excellent troubleshooting and problem solving skills
- Experience with scale testing, disaster recovery, and capacity planning
- Familiarity with microservices architecture and container orchestration with Kubernetes
Apple Services Engineering infrastructure is BIG. Operating at our scale, across multiple geographically dispersed data centers and serving hundreds of millions of users, presents unique challenges. As an SRE at Apple, you'll need to solve these problems using data, teamwork, and your own expertise. SREs at Apple own the full infrastructure stack, from device driver performance debugging to content delivery network traffic management; our responsibilities are both broad and deep.
ASE runs the majority of its systems on Linux. We run a mix of open-source, vendor-licensed, and internally developed tools to perform functions such as system configuration management, provisioning, software deployment, logging, and monitoring. You'll learn these tools and have opportunities to improve them. Our team is collaborative; we work closely with the development teams we support to deliver the best results for Apple. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.
Education & Experience
BS/MS in Computer Science or equivalent experience (5+ years of software development or production operations experience in a large-scale environment)
-
Observability Site Reliability Engineer
2 months ago
London, United Kingdom | Apple Inc. | Full time
People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service Engineering (ASE) team...
-
Observability Site Reliability Engineer
3 weeks ago
London, United Kingdom | Apple Inc. | Full time
People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service Engineering (ASE) team...
-
Observability Site Reliability Engineer
1 week ago
London, United Kingdom | Apple Inc. | Full time
People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service Engineering (ASE) team...
-
Observability Site Reliability Engineer
1 week ago
London, United Kingdom | Apple Inc. | Full time
People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service Engineering (ASE) team...
-
Observability Site Reliability Engineer
11 hours ago
London, United Kingdom | Apple | Full time
Summary: People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service...
-
Observability Site Reliability Engineer
2 days ago
London, United Kingdom | Apple | Full time
Summary: People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service...
-
Observability Site Reliability Engineer
20 hours ago
London, United Kingdom | Apple | Full time
Summary: People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service...
-
Observability Site Reliability Engineer
1 week ago
London, United Kingdom | Apple Inc. | Full time
People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service Engineering (ASE) team...
-
Site Reliability Engineer
5 days ago
London, United Kingdom | Cisco | Full time
Who We Are: The name ThousandEyes was born from two big ideas: the power to see things not ordinarily possible and the ability to collect insights from a multitude of vantage points. As organizations rely more on cloud services and the Internet, the network has become a “black box” outside of their control. ThousandEyes gives organizations...
-
Site Reliability Engineer
1 day ago
London, United Kingdom | Cisco Systems, Inc. | Full time
Site Reliability Engineer - Observability Location: Offsite, London, United Kingdom Area of Interest Job Type Professional Software Development Job Id 1424270
Who We Are: The name ThousandEyes was born from two big ideas: the power to see things not ordinarily possible and the ability to collect insights from a multitude of vantage points. As...
-
Site Reliability Engineer
14 hours ago
London, United Kingdom | Cisco Systems, Inc. | Full time
Site Reliability Engineer - Observability Location: Offsite, London, United Kingdom Area of Interest Job Type Professional Software Development Job Id 1424270
Who We Are: The name ThousandEyes was born from two big ideas: the power to see things not ordinarily possible and the ability to collect insights from a multitude of vantage points. As...
-
Senior Site Reliability Engineer
2 weeks ago
London, United Kingdom | Formula Recruitment | Full time
Senior Site Reliability Engineer. Salary up to £120,000. Fully remote. Permanent, full time. We are partnered with a leading Web3 and blockchain start-up that aims to disrupt the crypto ecosystem, moving away from a chain-centric worldview towards an account-centric worldview. They are currently looking for a Senior Site Reliability Engineer to...
-
Senior Site Reliability Engineer
1 week ago
London, United Kingdom | Formula Recruitment | Full time
Senior Site Reliability Engineer. Salary up to £120,000. Fully remote. Permanent, full time. We are partnered with a leading Web3 and blockchain start-up that aims to disrupt the crypto ecosystem, moving away from a chain-centric worldview towards an account-centric worldview. They are currently looking for a Senior Site Reliability Engineer to...
-
Site Reliability Engineer
3 weeks ago
London, United Kingdom | Experian | Full time
Job Description. Work that matters – what you’ll be doing: We’re looking for a highly skilled and motivated Site Reliability Engineer (SRE) to join our Experian Data Quality team. As an SRE, you will be responsible for ensuring the reliability, performance, and scalability of our market-leading suite of data management products, with an...
-
AWS Site Reliability Engineer
4 weeks ago
London, United Kingdom Techruiter Full timeSite Reliability Engineer (SRE) - LLM and Machine Learning London/Remote Roles we're searching for now: – Software Engineering / We are a pioneering technology company specialising in cutting-edge Language Models (LLM) and Machine Learning solutions. We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team and ensure the...
-
AWS Site Reliability Engineer
4 weeks ago
London, United Kingdom Techruiter Full timeSite Reliability Engineer (SRE) - LLM and Machine Learning London/Remote Roles we're searching for now: – Software Engineering / We are a pioneering technology company specialising in cutting-edge Language Models (LLM) and Machine Learning solutions. We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team and ensure the...
-
Site Reliability Engineer Openstack
1 week ago
London, United Kingdom | Client Server Ltd. | Full time
Site Reliability Engineer / SRE (OpenStack Kubernetes Terraform), Remote / London, to £90k. Would you like to work on a digital transformation project and use a number of different modern technologies and tools? You could be joining a hugely successful software consultancy that offers remote working and requires no travel to client...
-
Site Reliability Engineer with Python
2 weeks ago
London, United Kingdom | Galaxy entertainment Corporation Limited | Full time
Galaxy is seeking a Site Reliability Engineer to build Observability and Infrastructure as code to help accelerate the development of innovative software systems for Galaxy Digital. We’re looking for dynamic, highly motivated people with automation and configuration management experience who want to join an exciting, rapidly growing team! Build and...
-
Site Reliability Engineer with Python
3 weeks ago
London, United Kingdom | Galaxy entertainment Corporation Limited | Full time
Galaxy is seeking a Site Reliability Engineer to build Observability and Infrastructure as code to help accelerate the development of innovative software systems for Galaxy Digital. We’re looking for dynamic, highly motivated people with automation and configuration management experience who want to join an exciting, rapidly growing team! Build and...
-
Senior Site Reliability Engineer
3 weeks ago
London, United Kingdom | loveholidays | Full time
About Us: We are a rapidly growing online travel agency with technology at the heart of our success. In 2022, we sent millions of people on their dream holiday. With a million visitors a day, our 100+ services handle 8k requests per second, while maintaining p95 search latency of 150ms. Our observability captures and processes 1TB of logs a day and 350k...