DevOps Engineer – Reinforcement Learning Platforms
6 days ago
We are seeking an experienced DevOps Engineer to help build and scale a web-based platform for reinforcement learning (RL) training and RLOps. You will design, implement, and maintain the cloud infrastructure, CI/CD pipelines, and deployment systems that support large-scale RL workloads.
Responsibilities
• Design and manage scalable cloud infrastructure for high-performance RL training and distributed environments
• Build and optimise CI/CD pipelines for open-source and enterprise components
• Implement containerisation and orchestration using Docker and Kubernetes
• Develop Infrastructure as Code solutions (Terraform, CloudFormation, Pulumi)
• Implement monitoring, logging, and alerting for distributed ML systems
• Collaborate with ML teams on resource optimisation and cost efficiency
• Apply security best practices, manage access controls, and ensure compliance
• Automate operational tasks: backups, disaster recovery, maintenance
• Support GPU clusters and distributed compute resources for RL workloads
• Maintain availability and performance of production ML systems
Requirements
• Degree in Computer Science/Engineering or 3+ years of DevOps/infrastructure experience
• Strong background with AWS, GCP, or Azure, including ML/AI workloads
• Proficiency with Docker, Kubernetes, and ML-focused orchestration
• Experience with Terraform/CloudFormation/Pulumi and configuration management
• Solid understanding of CI/CD tools (GitHub Actions, GitLab CI, Jenkins)
• Knowledge of monitoring/observability tools (Prometheus, Grafana, OpenObserve)
• Experience with GPU infrastructure and distributed ML compute frameworks
• Familiarity with MLOps tools and model lifecycle management
• Strong scripting skills (Python, Bash)
• Understanding of cloud networking, security, and database fundamentals
• Experience with HPC environments or schedulers is a plus
• Strong problem-solving and communication skills
Compensation & Benefits
• Competitive salary and stock options
• 30 days' holiday plus bank holidays
• Flexible and remote working options
• Enhanced parental leave
• £500 annual learning and development budget
• Pension scheme
• Regular socials and quarterly gatherings
• Bike-to-Work scheme
-
Machine Learning Engineer
7 days ago
London, United Kingdom FBI &TMT Full time
I am recruiting on behalf of a leading client in the technology sector who is seeking a highly skilled and experienced Machine Learning Engineer with a strong background in Reinforcement Learning. This role will contribute to the continued development of Arena, the company's web-based platform for reinforcement learning training and RLOps, as well as its...
-
DevOps Engineer
1 week ago
London, United Kingdom FBI &TMT Full time
DevOps Engineer - Reinforcement Learning Platforms
We are seeking an experienced DevOps Engineer to help build and scale a web-based platform for reinforcement learning (RL) training and RLOps. You will design, implement, and maintain the cloud infrastructure, CI/CD pipelines, and deployment systems that support large-scale RL workloads. Check out the role...
-
Machine Learning Engineer
1 week ago
London, United Kingdom FBI &TMT Full time
I am recruiting on behalf of a leading client in the technology sector who is seeking a highly skilled and experienced Machine Learning Engineer with a strong background in Reinforcement Learning. This role will contribute to the continued development of Arena, the company's web-based platform for reinforcement learning training and RLOps, as well as...
-
Machine Learning Engineer
3 days ago
City Of London, United Kingdom AgileRL Ltd Full time
Machine Learning Engineer (Reinforcement Learning)
We are seeking a talented and experienced Machine Learning Engineer with a background in Reinforcement Learning to join our team. This engineer will contribute to the further development of Arena, a web-based software platform for reinforcement learning training and RLOps, and our open-source reinforcement...
-
Senior Reinforcement Learning expert
1 week ago
London Area, United Kingdom Barrington James Full time
I am hiring a Senior Robotic & AI Engineer to drive the development of intelligent controllers for real-world robotic systems. This is a hands-on, highly technical role: you’ll design, build, and maintain advanced learning pipelines that combine imitation learning, reinforcement learning, and language or vision-conditioned models. You will play a pivotal...
-
Reinforcement learning Engineer
1 week ago
London E, United Kingdom Go Arrow Full time £60,000 - £100,000 per year
We are looking for a Reinforcement Learning (RL) Engineer to join our AI research and development team. In this role, you will design, implement, and optimize reinforcement learning algorithms to solve complex, real-world problems. You'll collaborate closely with data scientists, machine learning engineers, and software developers to bring intelligent...
-
Senior Reinforcement Learning expert
4 days ago
London Area, United Kingdom Barrington James Full time
I am hiring a Senior Robotic & AI Engineer to drive the development of intelligent controllers for real-world robotic systems. This is a hands-on, highly technical role: you'll design, build, and maintain advanced learning pipelines that combine imitation learning, reinforcement learning, and language or vision-conditioned models. You will play a pivotal role in...
-
DevOps Platform Engineer
7 days ago
Greater London, United Kingdom NatWest Group Full time
Overview: Join us as a DevOps Platform Engineer. This is an excellent opportunity to contribute to building our DevOps engineering capability, culture and mindsets within the bank. Promoting technical and cultural change, you’ll be accelerating learning journeys and the progressive adoption of our DevOps centre of excellence technical practices and techniques. As...
-
DevOps Platform Engineer
1 week ago
London, Greater London, United Kingdom RBS Full time £80,000 - £120,000 per year
Join us as a DevOps Platform Engineer. This is an excellent opportunity to contribute to building our DevOps engineering capability, culture and mindsets within the bank. Promoting technical and cultural change, you'll be accelerating learning journeys and the progressive adoption of our DevOps centre of excellence technical practices and techniques. As you build...
-
Reinforcement Learning
2 weeks ago
London Area, United Kingdom Humanoid Full time
Reinforcement Learning (RL) Engineer, Manipulation
Humanoid is the first AI and robotics company in the UK, creating the world’s most advanced, reliable, commercially scalable, and safe humanoid robots. Our first humanoid robot HMND 01 is a next-gen labour automation unit, providing highly efficient services across various use cases, starting with...