About

Profile

Hi, I'm John, a researcher and engineer with broad ML and scientific interests. I specialise in applying graph neural networks (GNNs) to systems of atoms, and in particular the development of machine-learned force fields (MLFFs).
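To make the MLFF idea concrete: a force field maps atomic positions to a total energy, and forces follow as the negative gradient of that energy. Below is a minimal PyTorch sketch of just this mechanism; ToyEnergyModel is a hypothetical stand-in, since real MLFFs (including GNN-based ones) act on symmetry-invariant descriptions of each atom's environment rather than raw coordinates.

```python
import torch
import torch.nn as nn

class ToyEnergyModel(nn.Module):
    """Hypothetical stand-in for an MLFF: positions -> total energy."""

    def __init__(self):
        super().__init__()
        self.per_atom_mlp = nn.Sequential(
            nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1)
        )

    def forward(self, positions: torch.Tensor) -> torch.Tensor:
        # sum per-atom contributions into a single scalar energy
        return self.per_atom_mlp(positions).sum()

model = ToyEnergyModel()
positions = torch.randn(8, 3, requires_grad=True)  # 8 atoms in 3D

energy = model(positions)
# forces are the negative gradient of the energy w.r.t. positions
forces = -torch.autograd.grad(energy, positions)[0]
print(energy.item(), forces.shape)  # scalar energy; forces of shape (8, 3)
```

Computing forces this way (rather than predicting them directly) guarantees they are consistent with the energy surface, which is why most MLFF frameworks differentiate through the model.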

I'm currently interning at Valence Labs in London, where I'm working on generative modelling for drug discovery. Previously, I was an ML Research Intern in Microsoft Research's AI for Science team in Amsterdam, where I investigated multi-fidelity training for MLFFs.

During my PhD (2021-2026) at the University of Oxford, supervised by Prof. Volker Deringer, my research has focused on using synthetic data to pre-train MLFFs, and on using this approach to distill foundation models for targeted domains. I've also developed a new, open-source framework for training MLFFs: graph-pes.

You can find me on LinkedIn, X / Twitter, GitHub, and Scholar.

Code

I have extensive professional experience developing software in Python (together with PyTorch, NumPy, etc.), Java, and C++. I am an eager proponent of open science, and ensure that all the code and data supporting my research are freely available on GitHub.

Packages that I have written include:

Aside from code with an obvious ML research focus, I also enjoy playing around with code for fun. Examples include:

Publications

Below, I (try to) group my publications into common themes:

Multi-fidelity training

This theme covers the research I undertook while interning with Microsoft Research's AI for Science team. We performed a rigorous set of experiments to better understand the mechanisms by which popular multi-fidelity training methods (pre-train/fine-tune and multi-head) lead to improved model performance. Among many findings, we discovered that models learn fidelity-specific internal representations: while these are somewhat transferable to other fidelities, care must be taken to ensure that this transfer does not adversely affect model performance.
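As a hedged sketch (my own naming, not the paper's code), the snippet below shows the multi-head idea in PyTorch: a shared backbone learns one representation, while per-fidelity readout heads map it to that fidelity's energy labels. The pre-train/fine-tune alternative is noted in the closing comment.

```python
import torch
import torch.nn as nn

class MultiHeadMLFF(nn.Module):
    """Hypothetical multi-head setup: shared backbone, one head per fidelity."""

    def __init__(self, n_features: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(3, n_features), nn.SiLU(),
            nn.Linear(n_features, n_features), nn.SiLU(),
        )
        # fidelity-specific readouts over the (hopefully shared) representation
        self.heads = nn.ModuleDict({
            "low": nn.Linear(n_features, 1),   # e.g. abundant, cheap labels
            "high": nn.Linear(n_features, 1),  # e.g. scarce, accurate labels
        })

    def forward(self, positions: torch.Tensor, fidelity: str) -> torch.Tensor:
        features = self.backbone(positions)  # shared across fidelities
        return self.heads[fidelity](features).sum()  # total energy

model = MultiHeadMLFF()
positions = torch.randn(8, 3)
e_low = model(positions, "low")    # supervise with the cheap fidelity...
e_high = model(positions, "high")  # ...and with the accurate fidelity

# pre-train/fine-tune alternative: train a single-head model on the low
# fidelity, then continue training the same weights on the high fidelity
```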

Understanding multi-fidelity training of machine-learned force-fields

J. Gardner, H. Schulz, J. Helie, L. Sun, G. Simm
arXiv 2025
Pre-print

Synthetic data

The main thrust of my PhD research to date: in this trio of papers, I show that (i) synthetic data are useful in atomistic ML, (ii) pre-training on synthetic data can lead to more accurate and robust MLFFs, and (iii) synthetic data can be used to distill foundation models for targeted domains in an architecture-agnostic manner.
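To make (iii) concrete, here is a minimal, hedged sketch of data-based distillation; the two MLPs are hypothetical stand-ins for a foundation-model teacher and a smaller student. Because the student only ever sees (structure, label) pairs produced by the teacher, nothing ties the two architectures together, which is what makes the approach architecture-agnostic.

```python
import torch
import torch.nn as nn

# hypothetical stand-ins: any energy model with the same call signature works
teacher = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 1)).eval()
student = nn.Sequential(nn.Linear(3, 8), nn.SiLU(), nn.Linear(8, 1))

# 1. generate synthetic configurations for the target domain (random here)
configs = torch.randn(256, 8, 3)  # 256 structures of 8 atoms each

# 2. label them with the teacher: no new quantum-mechanical calculations needed
with torch.no_grad():
    labels = teacher(configs).sum(dim=(1, 2))  # one energy per structure

# 3. fit the student to the teacher's labels
optimiser = torch.optim.Adam(student.parameters(), lr=1e-3)
for _ in range(100):
    predictions = student(configs).sum(dim=(1, 2))
    loss = nn.functional.mse_loss(predictions, labels)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```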

Synthetic data enable experiments in atomistic machine learning

J. Gardner, Z. Faure Beaulieu and V. Deringer
Digital Discovery 2023
Paper, Pre-print, and Code

Synthetic pre-training for neural-network interatomic potentials

J. Gardner, K. Baker and V. Deringer
Machine Learning: Science and Technology 2024
Paper, Pre-print, and Code

Distillation of atomistic foundation models across architectures and chemical domains

J. Gardner, D. Thomas du Toit, C. Ben Mahmoud, Z. Faure Beaulieu, V. Juraskova, L. Rosset, F. Duarte, F. Martelli, C. Pickard and V. Deringer
arXiv 2025
Pre-print, and Code

MLFF validation

We presented early work showing how important validation beyond error metrics is, and showcased several extended validation methods; a toy illustration of the idea follows the reference below.

How to validate machine-learned interatomic potentials

J. Morrow, J. Gardner and V. Deringer
Journal of Chemical Physics 2023
Paper, Pre-print, and Code
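As a hedged illustration of validation beyond error metrics (not the protocol from the paper), the sketch below uses ASE's built-in EMT potential as a stand-in for a trained MLFF: besides checking error numbers on a held-out test set, one runs a short MD trajectory and monitors a basic physical sanity check.

```python
from ase import units
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.md.langevin import Langevin

# EMT stands in for a trained MLFF; the point is the protocol, not the potential
atoms = bulk("Cu", cubic=True) * (3, 3, 3)
atoms.calc = EMT()

# a low force RMSE on a test set says little about behaviour in simulation,
# so additionally run short MD and track a simple physical sanity check
dyn = Langevin(atoms, timestep=1 * units.fs, temperature_K=300, friction=0.002)
for step in range(20):
    dyn.run(25)  # 25 fs between checks
    distances = atoms.get_all_distances(mic=True)
    d_min = distances[distances > 0].min()
    # collapsing interatomic distances are a red flag an error metric can miss
    assert d_min > 1.5, f"unphysical geometry after step {step}: {d_min:.2f} Å"
```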

The importance of data

In these two papers, we show how important the data you train your MLFFs on are, and present a new package, autoplex, for automatically creating high-quality datasets via random structure searching (RSS).
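As a minimal, hedged sketch of the random-structure-searching idea behind such dataset generation (not autoplex's actual procedure; the species, cell-size heuristic, and thresholds are my own illustrative choices):

```python
import numpy as np
from ase import Atoms

def random_structure(symbol="Si", n_atoms=8, volume_per_atom=20.0,
                     min_distance=1.8, rng=None):
    """Generate one random periodic structure (an illustrative RSS step)."""
    rng = rng or np.random.default_rng()
    cell_length = (n_atoms * volume_per_atom) ** (1 / 3)
    while True:  # rejection-sample until no two atoms are unphysically close
        positions = rng.uniform(0, cell_length, size=(n_atoms, 3))
        atoms = Atoms(symbol * n_atoms, positions=positions,
                      cell=[cell_length] * 3, pbc=True)
        distances = atoms.get_all_distances(mic=True)
        if distances[distances > 0].min() > min_distance:
            return atoms

# in a real pipeline, each structure would then be relaxed and labelled
# with a reference method before entering the training set
dataset = [random_structure(rng=np.random.default_rng(seed)) for seed in range(10)]
```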

Data as the next challenge in atomistic machine learning

C. Ben Mahmoud, J. Gardner and V. Deringer
Nature Computational Science 2024
Paper

An automated framework for exploring and learning potential-energy surfaces

Y. Liu, J. Morrow, C. Ertural, N. Fragapane, J. Gardner, A. Naik, Y. Zhou, J. George and V. Deringer
arXiv 2024
Pre-print

Miscellaneous

A few other works that don't fit into such an obvious theme.

Assessing zero-shot generalisation behaviour in graph-neural-network interatomic potentials

C. Ben Mahmoud, Z. El-Machachi, K. A. Gierczak, J. Gardner and V. Deringer
arXiv 2025
Pre-print

Coarse-grained versus fully atomistic machine learning for zeolitic imidazolate frameworks

Z. Faure Beaulieu, T. Nicholas, J. Gardner, A. Goodwin and V. Deringer
Chemical Communications 2023
Paper, and Pre-print

Using spectroscopy to probe relaxation, decoherence, and localization of photoexcited states in π-conjugated polymers

W. Barford, J. Gardner, and J. R. Mannouch
Faraday Discussions 2020
Paper

Experience

In reverse chronological order, I have worked at:

Education

From 2015 to 2019, I studied for a Masters in Chemistry at the University of Oxford, receiving First Class Honours and placing 4th out of 180 in my year. For my Part II year, I was supervised by Prof. William Barford, and produced a thesis entitled:

From 2020 to 2022, I studied for a Masters in Computer Science at the University of York, receiving a Distinction. My thesis was entitled:

Since 2021, I have been studying for a PhD in ML for Chemistry at the University of Oxford, supervised by Prof. Volker Deringer. I intend to graduate in 2026.

During my education, I have been awarded the following prizes:

Hobbies

In my spare time, I'm a voracious reader, particularly of science fiction. Some of my favourite books include The Metamorphosis of Prime Intellect and Harry Potter and the Methods of Rationality. I'm also partial to literature from the mid-1800s, such as The Count of Monte Cristo and Crime and Punishment. Aside from reading books, I also frequent several blogs and message boards, including Less Wrong and (as should be obvious from the style of this site) gwern.net.

When I'm not reading or working, I can be found outside running, cycling, swimming, hiking, or (as of very recently) bouldering.
