Platform Engineer

4 weeks ago


City of London, Greater London, United Kingdom Cloud People Full time

Platform Engineer – HPC, AI and ML
Up to £80,000 plus benefits
Onsite – Kensington, London

Company and Role

This is an opportunity to join a global technology and AI solutions provider delivering some of the most advanced computing platforms in the world. You will play a leading role in the design, build and long-term support of a next-generation AI and Machine Learning platform, built on cutting-edge High Performance Computing (HPC) infrastructure for one of the UK’s most prestigious research environments.

As a Platform Engineer – HPC, AI and ML, you will be responsible for building and optimising a high-performance platform purpose-built for AI, ML, LLM and Generative AI workloads. You will lead on architecture, deployment and performance tuning using technologies such as Kubernetes, NVIDIA Run:AI, Ubuntu, Weka NeuralMesh and HGX B200 GPU nodes. Once the platform is live, you will take ownership of its operation and evolution, ensuring it delivers consistent world-class performance for advanced research workloads. It is a rare opportunity to build a complex HPC environment from the ground up and then own it, ensuring it continues to power the next generation of AI-driven innovation.

Why This Role Stands Out

• Be part of one of the UK’s most advanced AI and HPC platform projects
• Build and then support a world-class infrastructure enabling AI, ML, LLM and Generative AI research
• Collaborate with global technology leaders including NVIDIA, HPE, Canonical and Weka
• Onsite role in Kensington, London within a pioneering research and innovation environment
• Salary up to £80,000 with excellent opportunities for growth in HPC and AI infrastructure engineering

What You’ll Be Doing

• Designing, deploying and configuring a complete AI and ML operations platform within a large-scale HPC environment
• Installing and optimising Ubuntu (Canonical) across GPU and non-GPU compute nodes
• Implementing and managing Kubernetes for container orchestration and performance at scale
• Installing and configuring the NVIDIA GPU Operator, Network Operator and Run:AI orchestration platform
• Integrating Run:AI with Kubernetes clusters to deliver scalable GPU utilisation
• Supporting deployment of HGX B200 GPU nodes (96 NVIDIA B200 GPUs) and associated infrastructure
• Managing Weka NeuralMesh distributed AI storage for high-speed data access and resilience
• Implementing CI/CD and MLOps pipelines using Argo Workflows, Jenkins and GitHub
• Monitoring platform performance using Zabbix, Prometheus and Grafana
• Integrating SAN and InfiniBand networking to achieve high throughput and reliability
• Creating detailed documentation and performing knowledge transfer to operations teams
• Providing ongoing platform support, patching, troubleshooting and continuous improvement

What You’ll Bring

• Proven experience designing, deploying and supporting HPC or large-scale compute environments for AI and ML workloads
• Strong understanding of Ubuntu server administration, networking and performance tuning
• Hands-on experience with Kubernetes and GPU-enabled workloads
• Practical knowledge of NVIDIA GPU technologies, particularly GPU Operator and Run:AI
• Familiarity with distributed storage and AI data systems such as Weka NeuralMesh
• Experience with CI/CD and MLOps pipelines using Argo, Jenkins or GitHub
• Knowledge of HPC networking including SAN and InfiniBand integration
• Strong troubleshooting and documentation skills with a collaborative mindset

Desirable Experience

• Certifications in Kubernetes, NVIDIA or HPC infrastructure technologies
• Experience in research, academic or scientific computing environments
• Understanding of AI and ML workflows, neural network training and large language models
• Familiarity with HPE, NVIDIA, Aarna and Digital Realty platforms

If you are passionate about building and operating large-scale computing environments and want to play a key role in delivering one of the UK’s most advanced HPC and AI platforms, this is your opportunity to shape the future of research and machine learning infrastructure.


  • Platform Engineer

    4 weeks ago


    City of London, Greater London, United Kingdom La Fosse Full time

    Platform Engineer (AWS / OpenShift) Location: London, Bristol, or Edinburgh (Hybrid – 3 days per week in office) Salary: £70,000 – £85,000 DOE + bonus + 15% pension Join a well-established engineering team responsible for one of the most critical container platforms in the organisation - Red Hat OpenShift running on AWS. You’ll work within a...

  • Platform Engineer

    4 weeks ago


    City of London, Greater London, United Kingdom Burns Sheehan Full time

    Senior Platform Engineer - Developer Platform / Internal Tooling 💰 £90,000 - £100,000 + Equity 📍 London - Hybrid (very flexible: weekly, fortnightly or even 1 day per month) 🛠️ AWS, Terraform, Python/Go, CI/CD, Backstage Here at Burns Sheehan, we’re working with a high-growth FinTech that uses data and technology to modernise one of the...

  • Platform Engineer

    3 days ago


    London, Greater London, United Kingdom Carbon3 - Building the UK's AI Solution Platform Full time £40,000 - £80,000 per year

    We are building the UK's next generation AI platform, powered by renewable energy, rooted in sovereign capability, and designed to give enterprises and innovators the compute they need. AI Platform Operations: Support Engineer / Cluster Administrator to provide Level 1 and Level 2 support for the AI platform. This role will be customer facing, involve technical...

  • Platform Engineer

    3 days ago


    London Area, United Kingdom Carbon3 - The UK's AI Solution Platform Full time £24,000 - £90,000 per year

    We are building the UK's next generation AI platform, powered by renewable energy, rooted in sovereign capability, and designed to give enterprises and innovators the compute they need. AI Platform Operations Manager: Support Engineer / Cluster Administrator to provide Level 1 and Level 2 support for the AI platform. This role will be customer facing, involve...

  • Platform Engineer

    4 weeks ago


    City of London, Greater London, United Kingdom Harrington Starr Full time

    Platform Engineer - PERMANENT Location : London Industry : Financial Services Salary : £115,000 + benefits + bonus Hybrid : In office 3 days per week If you’ve built and managed AWS environments at scale, this could be the perfect next step for you. Join a high-performing Platform Engineering team at a global brokerage based in London. You’ll need: 5+...

  • Platform Engineer

    4 weeks ago


    City of London, Greater London, United Kingdom Financial Services Full time

    Platform Engineer required by a blue-chip company in the Financial Services sector. Wonderful opportunity for a second or third jobber to gain exposure to a multitude of great technologies. They are heading increasingly towards containerisation/Kubernetes. Kafka is the key platform you'll be building. Required Experience: Kafka DevOps/Platform/SRE specialisation...

  • Platform Engineer

    3 weeks ago


    City of London, Greater London, United Kingdom Oliver Bernard Full time

    Platform Engineer | 2/3 days p/week in London | £65-75K | AWS, Security, Kubernetes, Terraform, Go Overview: This role is part of a small, collaborative Cloud Infrastructure team supporting the core platforms that power a modern financial-services business. The team owns cloud, container, and secrets-management foundations, ensuring they’re secure,...

  • Platform Engineer

    3 weeks ago


    London (City of London), United Kingdom La Fosse Full time

    Platform Engineer (AWS / OpenShift) Location: London, Bristol, or Edinburgh (Hybrid – 3 days per week in office) Salary: £70,000 – £85,000 DOE + bonus + 15% pension Join a well-established engineering team responsible for one of the most critical container platforms in the organisation - Red Hat OpenShift running on AWS. You’ll work within a...

  • Platform Engineer

    1 week ago


    London Area, United Kingdom Carbon3 - Building the UK's AI Solution Platform Full time £40,000 - £80,000 per year

    We are building the UK's next generation AI platform, powered by renewable energy, rooted in sovereign capability, and designed to give enterprises and innovators the compute they need. AI Platform Operations Manager: Support Engineer / Cluster Administrator to provide Level 1 and Level 2 support for the AI platform. This role will be customer facing, involve...