Platform Engineer
4 weeks ago
Platform Engineer – HPC, AI and ML
Up to £80,000 plus benefits
Onsite – Kensington, London

Company and Role
This is an opportunity to join a global technology and AI solutions provider delivering some of the most advanced computing platforms in the world. You will play a leading role in the design, build and long-term support of a next-generation AI and Machine Learning platform, built on cutting-edge High Performance Computing (HPC) infrastructure for one of the UK’s most prestigious research environments.

As a Platform Engineer – HPC, AI and ML, you will be responsible for building and optimising a high-performance platform purpose-built for AI, ML, LLM and Generative AI workloads. You will lead on architecture, deployment and performance tuning using technologies such as Kubernetes, NVIDIA Run:AI, Ubuntu, Weka NeuralMesh and HGX B200 GPU nodes.

Once live, you will take ownership of the platform’s operation and evolution, ensuring it delivers consistent world-class performance for advanced research workloads. It is a rare opportunity to build a complex HPC environment from the ground up and then own it, ensuring it continues to power the next generation of AI-driven innovation.

Why This Role Stands Out
• Be part of one of the UK’s most advanced AI and HPC platform projects
• Build and then support a world-class infrastructure enabling AI, ML, LLM and Generative AI research
• Collaborate with global technology leaders including NVIDIA, HPE, Canonical and Weka
• Onsite role in Kensington, London within a pioneering research and innovation environment
• Salary up to £80,000 with excellent opportunities for growth in HPC and AI infrastructure engineering

What You’ll Be Doing
• Designing, deploying and configuring a complete AI and ML operations platform within a large-scale HPC environment
• Installing and optimising Ubuntu (Canonical) across GPU and non-GPU compute nodes
• Implementing and managing Kubernetes for container orchestration and performance at scale
• Installing and configuring NVIDIA GPU Operator, Network Operator and the Run:AI orchestration platform
• Integrating Run:AI with Kubernetes clusters to deliver scalable GPU utilisation
• Supporting deployment of HGX B200 GPU nodes (96 NVIDIA B200 GPUs) and associated infrastructure
• Managing Weka NeuralMesh distributed AI storage for high-speed data access and resilience
• Implementing CI/CD and MLOps pipelines using Argo Workflows, Jenkins and GitHub
• Monitoring platform performance using Zabbix, Prometheus and Grafana
• Integrating SAN and InfiniBand networking to achieve high throughput and reliability
• Creating detailed documentation and performing knowledge transfer to operations teams
• Providing ongoing platform support, patching, troubleshooting and continuous improvement

What You’ll Bring
• Proven experience designing, deploying and supporting HPC or large-scale compute environments for AI and ML workloads
• Strong understanding of Ubuntu server administration, networking and performance tuning
• Hands-on experience with Kubernetes and GPU-enabled workloads
• Practical knowledge of NVIDIA GPU technologies, particularly GPU Operator and Run:AI
• Familiarity with distributed storage and AI data systems such as Weka NeuralMesh
• Experience with CI/CD and MLOps pipelines using Argo, Jenkins or GitHub
• Knowledge of HPC networking including SAN and InfiniBand integration
• Strong troubleshooting and documentation skills with a collaborative mindset

Desirable Experience
• Certifications in Kubernetes, NVIDIA or HPC infrastructure technologies
• Experience in research, academic or scientific computing environments
• Understanding of AI and ML workflows, neural network training and large language models
• Familiarity with HPE, NVIDIA, Aarna and Digital Realty platforms

If you are passionate about building and operating large-scale computing environments and want to play a key role in delivering one of the UK’s most advanced HPC and AI platforms, this is your opportunity to shape the future of research and machine learning infrastructure.
-
Platform Engineer
3 days ago
London, Greater London, United Kingdom Carbon3 - Building the UK's AI Solution Platform Full time
£40,000 - £80,000 per year
We are building the UK's next generation AI platform, powered by renewable energy, rooted in sovereign capability, and designed to give enterprises and innovators the compute they need.
AI Platform Operations
Support Engineer / Cluster Administrator to provide Level 1 and Level 2 support for AI platform. This role will be customer facing, involve technical...
-
Platform Engineer
3 days ago
London Area, United Kingdom Carbon3 - The UK's AI Solution Platform Full time
£24,000 - £90,000 per year
We are building the UK's next generation AI platform, powered by renewable energy, rooted in sovereign capability, and designed to give enterprises and innovators the compute they need.
AI Platform Operations Manager
Support Engineer / Cluster Administrator to provide Level 1 and Level 2 support for AI platform. This role will be customer facing, involve...
-
Platform Engineer
4 weeks ago
City of London, United Kingdom Burns Sheehan Full time
Senior Software / Platform Engineer (LLM / AI Interest) | Brand new AI Platform
💰 £120,000 - £130,000
📍 London - 4 days a week in office (Paddington)
🛠️ Python, GCP, Kubernetes, Terraform, CI/CD, AI / LLM interest
Here at Burns Sheehan, we’re working with a leading UK tech business in the payments and financial services space - processing billions...
-
Platform Engineer
4 weeks ago
City of London, United Kingdom La Fosse Full time
Platform Engineer (AWS / OpenShift)
Location: London, Bristol, or Edinburgh (Hybrid – 3 days per week in office)
Salary: £70,000 – £85,000 DOE + bonus + 15% pension
Join a well-established engineering team responsible for one of the most critical container platforms in the organisation - Red Hat OpenShift running on AWS. You’ll work within a...
-
Platform Engineer
8 hours ago
City of London, United Kingdom La Fosse Associates Full time
Job Benefits: bonus
Platform Engineer (AWS / OpenShift)
Location: London, Bristol, or Edinburgh (Hybrid – 3 days per week in office)
Salary: £70,000 – £85,000 DOE + bonus + 15% pension
Join a well-established engineering team responsible for one of the most critical container platforms in the organisation – Red Hat OpenShift running on AWS. You’ll work...
-
Platform Engineer
4 weeks ago
City of London, United Kingdom Burns Sheehan Full time
Senior Platform Engineer - Developer Platform / Internal Tooling
£90,000 - £100,000 + Equity
London - Hybrid (Very flexible, weekly, fortnightly or even x1 day per Month)
AWS, Terraform, Python/Go, CI/CD, Backstage
Here at Burns Sheehan, we’re working with a high-growth FinTech that uses data and technology to modernise one of the biggest but most...
-
Platform Engineer
4 weeks ago
City of London, United Kingdom Burns Sheehan Full time
Senior Platform Engineer - Developer Platform / Internal Tooling
💰 £90,000 - £100,000 + Equity
📍 London - Hybrid (Very flexible, weekly, fortnightly or even x1 day per Month)
🛠️ AWS, Terraform, Python/Go, CI/CD, Backstage
Here at Burns Sheehan, we’re working with a high-growth FinTech that uses data and technology to modernise one of the biggest...
-
Platform Engineer
3 weeks ago
London (City of London), United Kingdom La Fosse Full time
Platform Engineer (AWS / OpenShift)
Location: London, Bristol, or Edinburgh (Hybrid – 3 days per week in office)
Salary: £70,000 – £85,000 DOE + bonus + 15% pension
Join a well-established engineering team responsible for one of the most critical container platforms in the organisation - Red Hat OpenShift running on AWS. You’ll work within a...
-
Platform Engineer
9 hours ago
City of London, United Kingdom incident.io Full time
Platform Engineer
Base pay range: £110K - £200K. This range is provided by incident.io. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.
About Incident.io
incident.io is the leading...