About

Hi, I'm John, a researcher and engineer with broad ML and scientific interests.
I specialise in applying graph neural networks (GNNs) to systems of atoms, and in particular in developing machine-learned force fields (MLFFs).
I'm currently interning at Valence Labs in London, where I'm working on generative modelling
for drug discovery.
Previously, I was an ML Research Intern in Microsoft Research's AI for Science team in
Amsterdam, where I investigated multi-fidelity training for MLFFs.
During my PhD (2021-2026), supervised by Prof. Volker Deringer at the University of Oxford, my research has focused on using synthetic data to (pre-)train MLFFs, and on using this approach to distill foundation models for targeted domains.
I've also developed a new, open-source framework for training MLFFs: graph-pes.
You can find me on LinkedIn, X / Twitter, GitHub, and Scholar.
Code
I have extensive professional experience developing software in Python (together with PyTorch, NumPy, etc.), Java and C++. I am an eager proponent of open science, and ensure that all code and data supporting my research are freely available on GitHub.
Packages that I have written include:
- graph-pes: a framework for defining, training and using graph-based models of the potential energy surface (i.e. MLFFs). Features I'm especially proud of include distributed training, a pair-style for LAMMPS, fine-tuning capabilities for many foundation models, and independent re-implementations of many popular MLFF architectures, such as MACE, TensorNet, NequIP, and PaiNN.
- load-atoms: a package for downloading, inspecting and manipulating large datasets of atomic structures in a vectorised and efficient manner (a minimal usage sketch follows this list).
- augment-atoms: a tool for augmenting datasets of atomic configurations via a model-driven, GPU-accelerated, rattle-relax-repeat procedure.
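As a small illustration of the workflow load-atoms is built around, here is a minimal sketch (the dataset name "QM7" is only an example, and the exact return type is best checked against the package's documentation):

```python
from load_atoms import load_dataset

# download (and locally cache) a labelled dataset of atomic structures;
# "QM7" is just an example name - see the load-atoms catalogue for what's available
structures = load_dataset("QM7")

# the result behaves like a sequence of ase.Atoms objects
print(len(structures))
print(structures[0])
```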
Aside from code with an obvious ML research focus, I also enjoy playing around with code for fun. Examples include:
- digital-experiments: a lightweight wrapper to keep track of input, output, code and metadata for digital experiments.
- data2objects: a language and parser to configure arbitrary Python objects from YAML files.
- locache: a single-file, zero-dependency utility package for caching the results of deterministic, pure function calls to disk (see the sketch after this list).
- this webpage!
- make-a-gif: a (very simple) tool for creating GIFs using matplotlib (including silly ones like this).
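And a rough sketch of the kind of usage locache is designed for, assuming its persist decorator (the exact API and options live in the repository):

```python
from locache import persist


@persist  # cache results to disk, keyed on the function's arguments
def expensive_calculation(n: int) -> float:
    # stand-in for a slow, deterministic, pure computation
    return sum(i**0.5 for i in range(n))


expensive_calculation(10_000_000)  # computed and written to the cache
expensive_calculation(10_000_000)  # served straight from disk
```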
Publications
Below, I (try to) group my publications into common themes:
Multi-fidelity training
The research I undertook while interning with Microsoft Research's AI for Science team. We performed a rigorous set of experiments to better understand the mechanisms by which popular multi-fidelity training methods (pre-train/fine-tune and multi-head) lead to improved model performance. Among many findings, we discovered that models learn fidelity-specific internal representations: while these are somewhat transferable to other fidelities, care must be taken to ensure that this transfer does not adversely affect model performance.
J. Gardner, H. Schulz, J. Helie, L. Sun, G. Simm
arXiv 2025
Pre-print
Synthetic data
The main thrust of my PhD research to date: in this trio of papers, I show that (i) synthetic data are useful in atomistic ML, (ii) pre-training on synthetic data can lead to more accurate and robust MLFFs, and (iii) synthetic data can be used to distill foundation models for targeted domains in an architecture-agnostic manner.
MLFF validation
We presented early work showing how important validation beyond error metrics is, and showcased several extended validation methods.
The importance of data
In these two papers, we show how important the data you train your MLFFs on are, and present a new package, autoplex, for automatically creating high-quality datasets via random structure searching (RSS).
C. Ben Mahmoud, J. Gardner and V. Deringer
Nature Computational Science 2024
Paper
Y. Liu, J. Morrow, C. Ertural, N. Fragapane, J. Gardner, A. Naik, Y. Zhou, J. George and V. Deringer
arXiv 2024
Pre-print
Miscellaneous
A few other works that don't fit into such an obvious theme.
C. Ben Mahmoud, Z. El-Machachi, K. A. Gierczak, J. Gardner and V. Deringer
arXiv 2025
Pre-print
Z. Faure Beaulieu, T. Nicholas, J. Gardner, A. Goodwin and V. Deringer
Chemical Communications 2023
Paper and Pre-print
W. Barford, J. Gardner, and J. R. Mannouch
Faraday Discussions 2020
Paper
Experience
In reverse chronological order, I have worked at:
- Valence Labs (2025) as an ML Research Intern under Prudencio Tossou.
- Microsoft Research (2024) as an ML Research Intern under Gregor Simm. My research focused on the multi-fidelity training of MLFFs.
- Syvl.earth (2023-2024) as an ML Specialist Consultant. My project involved creating ML models to measure baseline farmland health and changes therein due to climate change.
- Ocado Technology (2019-2021) as a Software and ML Engineer. Among other responsibilities, I developed real-time coordination and control systems for robotic swarms, trained computer vision models for autonomous motion planning, and used RL to optimise automated warehouse throughput and energy efficiency.
- The Phoenix Partnership (2018) as an ML Summer Intern. My project involved developing NLP models to pre-emptively detect ovarian cancer from doctors' notes.
Education
From 2015-2019, I studied for a Master's in Chemistry at the University of Oxford, receiving First Class Honours, and placing 4/180 overall in the year.
For my Part II year, I was supervised by Prof. William Barford, and produced a thesis entitled:
From 2020-2022, I studied for a Master's in Computer Science at the University of York, receiving a Distinction. My thesis was entitled:
Since 2021, I have been studying for a PhD in ML for Chemistry at the University of Oxford, supervised by Prof. Volker Deringer.
I intend to graduate in 2026.
During my education, I have been awarded the following prizes:
- First place, Oxford Quant Challenge 2023, G-Research Hackathon
- The EPA Cephalosporin Scholarship, Linacre College (2021-2025)
- Runner-up prize for Physical and Theoretical Masters' theses, Department of Chemistry (2019)
- The Downs Prize, Jesus College (2018)
- The Gibbs Prize, University of Oxford (2018)
- Open Scholarship, Jesus College (2016-2019)
- The Woodward Prize for Chemistry, Jesus College (4x: 2015-2019)
Hobbies
In my spare time, I'm a voracious reader, particularly of science fiction. Some of my favourite books include The Metamorphosis of Prime Intellect and Harry Potter and the Methods of Rationality.
I'm also partial to literature from the mid-1800s, such as The Count of Monte Cristo and Crime and Punishment.
Aside from reading books, I also frequent several blogs and message boards, including LessWrong and (as should be obvious from the style of this site) gwern.net.
When I'm not reading or working, I can be found outside running, cycling, swimming, hiking, or (as of very recently) bouldering.


