Evals Research Scientist

1 week ago

City Of London, United Kingdom Apollo Research Full time

Evals Research Scientist / Engineer at Apollo Research

Application Deadline: We’re accepting applications until 31 October 2025. Applications are considered on a rolling basis and may take multiple weeks for a response.

About The Opportunity

We’re looking for Research Scientists and Research Engineers who are excited to work on safety evaluations, the science of scheming, or control/monitoring for frontier models.

Responsibilities

Work with frontier labs like OpenAI, Anthropic, and Google DeepMind by running pre-deployment evaluations and collaborating closely on mitigations. See e.g. our work on anti-scheming, OpenAI’s o1-preview system card, and Anthropic’s Opus 4 and Sonnet 4 system card.

Build evaluations for scheming-related properties (such as deceptive reasoning, sabotage, and deception tendencies). See our conceptual work on scheming, e.g. evaluation-based safety cases for scheming or how scheming could arise.

Work on the "science of scheming," e.g. by studying model organisms or real-world examples of scheming in detail. Our goal is to develop a much better theoretical understanding of why models scheme and which components of training and deployment cause it.

Work on automating the entire evals pipeline. We aim to automate substantial parts of evals ideation, generation, running and analysis.

Design and evaluate AI control protocols. Since agents have longer and longer time-horizons, we’re shifting more effort to deployment-time monitoring and other control methods.

Note: We are not hiring for interpretability roles.

Key Requirements

We don’t require a formal background or industry experience and welcome self-taught candidates.

Experience in empirical research related to scheming, AI control and evaluations, and a scientific mindset: You have designed and executed experiments. You can identify alternative explanations for findings and test alternative hypotheses to avoid overinterpreting results. This experience can come from academia, industry, or independent research.

Track record of excellent scientific writing and communication: You can understand and communicate complex technical concepts to our target audience and synthesize scientific results into coherent narratives.

Comprehensive experience in Large Language Model (LLM) steering and the supporting Data Science and Data Engineering skills. LLM steering can take many different forms, such as: a) prompting, b) LM agents and scaffolding, c) fluent LLM usage and integration into your own workflows, d) experience with supervised fine-tuning, e) experience with RL on LLMs.

Software engineering skills: Our entire stack uses Python. We’re looking for candidates with strong software engineering experience.

(Bonus) We have recently switched to Inspect as our primary evals framework, and we value experience with it.

Depending on your preferred role and how these characteristics weigh up, we can offer either an RS or RE role.

Logistics

Start date: Target 2–3 months after first interview.
Time allocation: Full-time.
Location: London office, in-person (partial remote considered case-by-case).
Work visa sponsorship available for UK.

Benefits

Salary: 100k – 200k GBP (~135k – 270k USD).
Flexible work hours and schedule.
Unlimited vacation and sick leave.
Lunch, dinner and snacks provided on workdays.
Paid work trips and conferences.
Annual professional development budget: $1,000 USD.

About Apollo Research

Apollo Research focuses on risks from Loss of Control, especially deceptive alignment/scheming. We develop detection, science, and mitigation strategies for scheming and work closely with frontier AI companies.

About the Team

The current evals team includes Mikita Balesni, Jérémy Scheurer, Alex Meinke, Rusheb Shah, Bronson Schoen, Andrei Matveiakin, Felix Höfstätter, Axel Højmark, Nix Goldowsky‑Dill, Teun van der Weij, and Alex Lloyd. Marius Hobbhahn manages the team.

Equality Statement

Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.

How to Apply

Complete the application form with your CV. Cover letter optional. Share relevant work samples.

Interview Process

Multi-stage: screening interview, take-home test (~2.5 hrs), 3 technical interviews, final interview with CEO Marius. Technical interviews are aligned with job tasks; there are no general coding tests.

Privacy Statement

We protect your data. We use AI-powered tools for screening, but all decisions are made by humans. Contact [email protected] with privacy concerns.

Evals Research Scientist

4 days ago

London, Greater London, United Kingdom Apollo Research Full time

Application deadline: We're currently considering applications on a rolling basis. It can take multiple weeks until we respond, even if you are a great fit. ABOUT THE OPPORTUNITY We're looking for Research Scientists and Research Engineers who are excited to work on safety evaluations, the science of scheming, or control/monitoring for frontier...
Evals Research Scientist

7 days ago

City Of London, United Kingdom COL Limited Full time

Application deadline: We're accepting applications until 31 October 2025. We encourage early submissions and will start interviews in early October. ABOUT THE OPPORTUNITY We’re looking for Research Scientists and Research Engineers who are excited to work on safety evaluations, the science of scheming, or control/monitoring for frontier models. YOU WILL HAVE...
Research Scientist

1 week ago

City Of London, United Kingdom Huawei Technologies Research & Development (UK) Ltd Full time

Job Summary The Reinforcement Learning Team at the Huawei London Research Centre is seeking a highly skilled and research-driven Machine Learning Scientist to join our team. This role focuses on advancing the state-of-the-art in reinforcement learning, Bayesian optimisation, AI agents, large language models (LLMs), and/or vision-language models (VLMs). You...
AI Safety

1 week ago

City Of London, United Kingdom COL Limited Full time

A leading research organization in London seeks Research Scientists and Engineers to conduct safety evaluations and collaborate with top-tier AI labs like OpenAI and Google DeepMind. Ideal candidates should have a strong foundation in empirical research, excellent communication skills, and software engineering experience in Python. The role is full-time,...
Research Scientist

1 week ago

City Of London, United Kingdom Huawei Technologies Research & Development (UK) Ltd Full time

Overview Research Scientist / Engineer in NLP (Contractor) at Huawei Technologies Research & Development (UK) Ltd. Join to apply for the Research Scientist / Engineer in NLP (Contractor) role at Huawei Technologies Research & Development (UK) Ltd. Job Summary We are looking for Research Engineers with experience in Natural Language Processing, Machine Learning,...
Data Scientist London

1 day ago

Greater London, United Kingdom Data Scientist Full time

Job Description Data Scientist - Financial Services - London (hybrid / remote) Overview Are you passionate about leveraging data to drive impactful decisions? Do you thrive in a collaborative and innovative environment? We are seeking a dynamic and skilled Data Scientist to join our forward-thinking team. This is your opportunity to work with cutting‑edge...
Research Scientist

1 week ago

City Of London, United Kingdom Huawei Technologies Research & Development (UK) Ltd Full time

Join to apply for the Research Scientist - AI Agent role at Huawei Technologies Research & Development (UK) Ltd About Huawei Research And Development UK Limited Founded in 1987, Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. We have 207,000 employees and operate in over 170 countries...
AI Research Scientist

1 week ago

City Of London, United Kingdom The 38 Labs Group Full time

We are looking for a highly motivated AI Research Scientist to join our growing research team. This role is ideal for individuals passionate about solving real-world problems using advanced machine learning, deep learning, and AI techniques. You will work alongside a team of experts to design, implement, and publish state-of-the-art AI models with direct...
Research Scientist

1 week ago

City Of London, United Kingdom UiPath Full time

Life at UiPath The people at UiPath believe in the transformative power of automation to change how the world works. We’re committed to creating category-leading enterprise software that unleashes that power. To make that happen, we need people who are curious, self‑propelled, generous, and genuine. People who love being part of a fast‑moving,...
Research Fellow

2 weeks ago

London Area, United Kingdom Pivotal Research Full time

The Pivotal Research Fellowship is a 9-week program for promising researchers to produce impactful research and accelerate their careers in AI safety and AI governance. Fellows conduct research, work with experienced mentors, participate in workshops & private Q&A sessions with domain experts, and build strong networks within the AI safety research community...
