Site Reliability Engineer

1 month ago


London, United Kingdom Rockset, Inc. Full time

At Rockset, we’ve built the real-time analytics database for the world's data applications. Our team and technology come from a rich heritage, rooted in the experience of building massive scale data systems at the world’s leading companies, and we created Rockset to make those kinds of powerful data platforms available to real-time application developers everywhere. We are creating a world where developers can go from complex data sets to fast, interactive applications and analysis effortlessly.
We’re a fast-growing company that values curiosity, diversity, and open-mindedness. As a site reliability engineer, you will be responsible for the automation, stability, security, configuration, monitoring, alerting, and capacity planning of Rockset's network, systems, and infrastructure. You will also build tools that help the rest of the engineering team be more productive, including the ones that Rockset engineers use to deploy and manage their services. The on-call pager is shared by most of the engineering team, not just SRE.
Our infrastructure is completely hosted in Amazon Web Services. We use a variety of home grown, open source, and commercial tools, including Kubernetes, Docker, Kafka, Zookeeper, Prometheus, Grafana, Salt, Terraform, Phacility, and Buildkite. You should expect to collaborate with all other engineering teams to develop solutions that meet reliability, security, and business requirements.
Passionate about distributed systems, database technologies, and highly scalable services
Poised under fire and willing to share an on-call rotation with the rest of the team
Bachelor's or Master's degree in Computer Science or a related field, or relevant work experience
Experience building and operating public-facing 24x7 web applications at scale
Experience working with cloud infrastructure and patterns (AWS preferred)
Strong programming skills in a scripting language (Python, Ruby, Bash)
Experience with Terraform, Salt, Chef, Packer, or similar configuration management tools
Experience with Grafana, Prometheus, Datadog, or similar monitoring tools
OUR COMMITMENT TO DIVERSITY
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.



  • London, United Kingdom Reed Full time

    **SRE | SITE RELIABILITY ENGINEER | DEVOPS | AWS | AMAZON WEB SERVICES | CLOUDFORMATION | KINESIS | CODEPIPELINE | FARGATE | BATCH | PYTHON | GOLANG | DJANGO | REACT | UK | FULLY REMOTE** **Site Reliability Engineer - £80k** A renowned SEO business is looking for a Senior Site Reliability Engineer to build and improve a rapidly evolving infrastructure...


  • London, United Kingdom Lorien Full time

    Site Reliability Engineer Location: London (hybrid remote working) **Salary**: Up to £100,000 + Very Generous Benefits Package One of the fastest growing software development organisations requires a Site Reliability Engineer to help be the glue between the company's Dev, QA and Product teams - enabling the smooth Continuous Build and Integration of new...


  • London, United Kingdom Lorien Full time

    Site Reliability Engineer Location: London (hybrid remote working) **Salary**: Up to £100,000 + Very Generous Benefits Package One of the fastest growing ecommerce organisations requires a Site Reliability Engineer to help be the glue between the company's Dev, QA and Product teams - enabling the smooth Continuous Build and Integration of new instances of...


  • London, United Kingdom Explore Group Full time

    **Lead Site reliability engineer - Fully remote - No sponsorship offered** Role: Site Reliability engineer Location: Fully remote **Salary**: Up to £115,000 **Responsibilities**: - Design, build, and maintain scalable and highly available infrastructure on AWS - Implement automation and continuous integration/delivery pipelines using tools such as...


  • London, Greater London, United Kingdom Austin Werner Ltd Full time

    Site Reliability Engineer - Global Media/Publishing business. We are seeking a Site Reliability Engineer for a globally leading Publishing business based in London. My client has built their internal IT environment from the ground up, so it is bespoke to the business, with cutting edge and innovative tech where the systems are responsible for all infrastructure...


  • London, Greater London, United Kingdom Bayside Solutions Full time £91,400 - £108,000

    Site Reliability Engineer Contract Location: London, England - Hybrid Role We seek a Site Reliability Engineer to join our team and play a crucial role in ensuring our applications and services' reliability, availability, and performance. This role requires a strong background in application support, monitoring, and cloud technologies, focusing on AWS,...


  • London, United Kingdom Understanding Recruitment Full time

    Site Reliability Engineer I am seeking a Site Reliability Engineer for one of the world's fastest growing social media platforms, with over 900 million daily users. The SRE group come from diverse technical backgrounds, Reliability, Software Engineering and Security Engineering, and have a broad remit ensuring high availability and performance, and...


  • London, United Kingdom Understanding Recruitment Full time

    Site Reliability Engineer I am seeking a Site Reliability Engineer for one of the world's fastest growing social media platforms, with over 900 million daily users. The SRE group come from diverse technical backgrounds, Reliability, Software Engineering and Security Engineering, and have a broad remit ensuring high availability and performance, and currently...


  • London, United Kingdom Understanding Recruitment Full time

    Site Reliability Engineer I am seeking a Site Reliability Engineer for one of the world's fastest growing social media platforms, with over 900 million daily users. The SRE group come from diverse technical backgrounds, Reliability,...


  • London, Greater London, United Kingdom Workingmums Full time £75,000

    Site Reliability Engineer - Remote / Glasgow - Salary to GBP75,000 + Bonus - Immediate Start. Fantastic new opportunity to join our Glasgow-based tech-for-good client, which specialises in digital solutions and has a huge global reach. Due to increased success in their space and demand for their services, they are now recruiting for a Site Reliability...


  • London, Greater London, United Kingdom Explore Group Full time

    Lead Site Reliability Engineer - Fully remote - No sponsorship offered. Role: Site Reliability Engineer. Location: Fully remote. Salary: Up to £115,000. Responsibilities: Design, build, and maintain scalable and highly available infrastructure on AWS; implement automation and continuous integration/delivery pipelines using tools such as Jenkins, Terraform, and...


  • London, United Kingdom N Consulting Ltd Full time

    Job title: Site Reliability Engineer. Work Mode: 3 days office mandatory. Location: 5 Broadgate, London EC2M 2QS, United Kingdom. Contract Duration: 12 months. We’re looking for a Site Reliability Engineer to: · determine the reliability of our digital products, technology services, and the infrastructure that underpins them · minimize the risk and impact of...


  • London, Greater London, United Kingdom CONTECHS Full time

    Site Reliability Engineer (IT Infrastructure) - 10-month initial contract - Onsite (Manchester) - £33ph (Inside IR35). About the company: I am currently recruiting on behalf of a Luxury Automotive OEM, based in Manchester, seeking Site Reliability Engineers to join their team. Job Description: As Site Reliability Engineer, your main responsibilities are: Software...


  • London, United Kingdom in Newbury Full time

    We're looking for a Site Reliability Engineer to join our Experian Data Quality team where you will be working on cutting edge products within our Aperture suite (Data Studio and Data Governance). This role has aspects of both reliability engineering (SRE) and test engineering (SDET). It is ideally suited to someone looking to take on some aspects of a...

