About

Profile

Hi, I'm John, a researcher and engineer with broad ML and scientific interests. I specialise in applying graph neural networks (GNNs) to systems of atoms, and in particular the development of machine-learned force fields (MLFFs).
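To make the MLFF idea concrete: a force field maps atomic positions to a total energy, and forces follow as the negative gradient of that energy. Below is a minimal PyTorch sketch of just this mechanism; ToyEnergyModel is a hypothetical stand-in, since real MLFFs (including GNN-based ones) act on symmetry-invariant descriptions of each atom's environment rather than raw coordinates.

```python
import torch
import torch.nn as nn

class ToyEnergyModel(nn.Module):
    """Hypothetical stand-in for an MLFF: positions -> total energy."""

    def __init__(self):
        super().__init__()
        self.per_atom_mlp = nn.Sequential(
            nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1)
        )

    def forward(self, positions: torch.Tensor) -> torch.Tensor:
        # sum per-atom contributions into a single scalar energy
        return self.per_atom_mlp(positions).sum()

model = ToyEnergyModel()
positions = torch.randn(8, 3, requires_grad=True)  # 8 atoms in 3D

energy = model(positions)
# forces are the negative gradient of the energy w.r.t. positions
forces = -torch.autograd.grad(energy, positions)[0]
print(energy.item(), forces.shape)  # scalar energy; forces of shape (8, 3)
```

Computing forces this way (rather than predicting them directly) guarantees they are consistent with the energy surface, which is why most MLFF frameworks differentiate through the model.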

I'm currently interning at Valence Labs in London, where I'm working on generative modelling for drug discovery. Previously, I was an ML Research Intern in Microsoft Research's AI for Science team in Amsterdam, where I investigated multi-fidelity training for MLFFs.

During my PhD (2021-2026) at the University of Oxford, supervised by Prof. Volker Deringer, my research has focused on using synthetic data to pre-train MLFFs, and on using this approach to distill foundation models for targeted domains. I've also developed a new, open-source framework for training MLFFs: graph-pes.

You can find me on LinkedIn, X / Twitter, GitHub, and Scholar.

Code

I have extensive professional experience developing software in Python (together with PyTorch, NumPy, etc.), Java, and C++. I am an eager proponent of open science, and ensure that all the code and data supporting my research are freely available on GitHub.

Packages that I have written include:

Aside from code with an obvious ML research focus, I also enjoy playing around with code for fun. Examples include:

Publications

Below, I (try to) group my publications into common themes:

Multi-fidelity training

This theme covers the research I undertook while interning with Microsoft Research's AI for Science team. We performed a rigorous set of experiments to better understand the mechanisms by which popular multi-fidelity training methods (pre-train/fine-tune and multi-head) lead to improved model performance. Among many findings, we discovered that models learn fidelity-specific internal representations: while these are somewhat transferable to other fidelities, care must be taken to ensure that this transfer does not adversely affect model performance.
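As a hedged sketch (my own naming, not the paper's code), the snippet below shows the multi-head idea in PyTorch: a shared backbone learns one representation, while per-fidelity readout heads map it to that fidelity's energy labels. The pre-train/fine-tune alternative is noted in the closing comment.

```python
import torch
import torch.nn as nn

class MultiHeadMLFF(nn.Module):
    """Hypothetical multi-head setup: shared backbone, one head per fidelity."""

    def __init__(self, n_features: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(3, n_features), nn.SiLU(),
            nn.Linear(n_features, n_features), nn.SiLU(),
        )
        # fidelity-specific readouts over the (hopefully shared) representation
        self.heads = nn.ModuleDict({
            "low": nn.Linear(n_features, 1),   # e.g. abundant, cheap labels
            "high": nn.Linear(n_features, 1),  # e.g. scarce, accurate labels
        })

    def forward(self, positions: torch.Tensor, fidelity: str) -> torch.Tensor:
        features = self.backbone(positions)  # shared across fidelities
        return self.heads[fidelity](features).sum()  # total energy

model = MultiHeadMLFF()
positions = torch.randn(8, 3)
e_low = model(positions, "low")    # supervise with the cheap fidelity...
e_high = model(positions, "high")  # ...and with the accurate fidelity

# pre-train/fine-tune alternative: train a single-head model on the low
# fidelity, then continue training the same weights on the high fidelity
```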

Understanding multi-fidelity training of machine-learned force-fields

J. Gardner, H. Schulz, J. Helie, L. Sun, G. Simm
arXiv 2025
Pre-print

Synthetic data

The main thrust of my PhD research to date: in this trio of papers, I show that (i) synthetic data are useful in atomistic ML, (ii) pre-training on synthetic data can lead to more accurate and robust MLFFs, and (iii) synthetic data can be used to distill foundation models for targeted domains in an architecture-agnostic manner.
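To make (iii) concrete, here is a minimal, hedged sketch of data-based distillation; the two MLPs are hypothetical stand-ins for a foundation-model teacher and a smaller student. Because the student only ever sees (structure, label) pairs produced by the teacher, nothing ties the two architectures together, which is what makes the approach architecture-agnostic.

```python
import torch
import torch.nn as nn

# hypothetical stand-ins: any energy model with the same call signature works
teacher = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 1)).eval()
student = nn.Sequential(nn.Linear(3, 8), nn.SiLU(), nn.Linear(8, 1))

# 1. generate synthetic configurations for the target domain (random here)
configs = torch.randn(256, 8, 3)  # 256 structures of 8 atoms each

# 2. label them with the teacher: no new quantum-mechanical calculations needed
with torch.no_grad():
    labels = teacher(configs).sum(dim=(1, 2))  # one energy per structure

# 3. fit the student to the teacher's labels
optimiser = torch.optim.Adam(student.parameters(), lr=1e-3)
for _ in range(100):
    predictions = student(configs).sum(dim=(1, 2))
    loss = nn.functional.mse_loss(predictions, labels)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```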

Synthetic data enable experiments in atomistic machine learning

J. Gardner, Z. Faure Beaulieu and V. Deringer
Digital Discovery 2023
Paper, Pre-print, and Code

Synthetic pre-training for neural-network interatomic potentials

J. Gardner, K. Baker and V. Deringer
Machine Learning: Science and Technology 2024
Paper, Pre-print, and Code

Distillation of atomistic foundation models across architectures and chemical domains

J. Gardner, D. Thomas du Toit, C. Ben Mahmoud, Z. Faure Beaulieu, V. Juraskova, L. Rosset, F. Duarte, F. Martelli, C. Pickard and V. Deringer
arXiv 2025
Pre-print, and Code

MLFF validation

We presented early work showing how important validation beyond error metrics is, and showcased several extended validation methods; a toy illustration of the idea follows the reference below.

How to validate machine-learned interatomic potentials

J. Morrow, J. Gardner and V. Deringer
Journal of Chemical Physics 2023
Paper, Pre-print, and Code
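As a hedged illustration of validation beyond error metrics (not the protocol from the paper), the sketch below uses ASE's built-in EMT potential as a stand-in for a trained MLFF: besides checking error numbers on a held-out test set, one runs a short MD trajectory and monitors a basic physical sanity check.

```python
from ase import units
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.md.langevin import Langevin

# EMT stands in for a trained MLFF; the point is the protocol, not the potential
atoms = bulk("Cu", cubic=True) * (3, 3, 3)
atoms.calc = EMT()

# a low force RMSE on a test set says little about behaviour in simulation,
# so additionally run short MD and track a simple physical sanity check
dyn = Langevin(atoms, timestep=1 * units.fs, temperature_K=300, friction=0.002)
for step in range(20):
    dyn.run(25)  # 25 fs between checks
    distances = atoms.get_all_distances(mic=True)
    d_min = distances[distances > 0].min()
    # collapsing interatomic distances are a red flag an error metric can miss
    assert d_min > 1.5, f"unphysical geometry after step {step}: {d_min:.2f} Å"
```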

The importance of data

In these two papers, we show how important the data you train your MLFFs on are, and present a new package, autoplex, for automatically creating high-quality datasets via random structure searching (RSS).
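As a minimal, hedged sketch of the random-structure-searching idea behind such dataset generation (not autoplex's actual procedure; the species, cell-size heuristic, and thresholds are my own illustrative choices):

```python
import numpy as np
from ase import Atoms

def random_structure(symbol="Si", n_atoms=8, volume_per_atom=20.0,
                     min_distance=1.8, rng=None):
    """Generate one random periodic structure (an illustrative RSS step)."""
    rng = rng or np.random.default_rng()
    cell_length = (n_atoms * volume_per_atom) ** (1 / 3)
    while True:  # rejection-sample until no two atoms are unphysically close
        positions = rng.uniform(0, cell_length, size=(n_atoms, 3))
        atoms = Atoms(symbol * n_atoms, positions=positions,
                      cell=[cell_length] * 3, pbc=True)
        distances = atoms.get_all_distances(mic=True)
        if distances[distances > 0].min() > min_distance:
            return atoms

# in a real pipeline, each structure would then be relaxed and labelled
# with a reference method before entering the training set
dataset = [random_structure(rng=np.random.default_rng(seed)) for seed in range(10)]
```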

Data as the next challenge in atomistic machine learning

C. Ben Mahmoud, J. Gardner and V. Deringer
Nature Computational Science 2024
Paper

An automated framework for exploring and learning potential-energy surfaces

Y. Liu, J. Morrow, C. Ertural, N. Fragapane, J. Gardner, A. Naik, Y. Zhou, J. George and V. Deringer
arXiv 2024
Pre-print

Miscellaneous

A few other works that don't fit into such an obvious theme.

Assessing zero-shot generalisation behaviour in graph-neural-network interatomic potentials

C. Ben Mahmoud, Z. El-Machachi, K. A. Gierczak, J. Gardner and V. Deringer
arXiv 2025
Pre-print

Coarse-grained versus fully atomistic machine learning for zeolitic imidazolate frameworks

Z. Faure Beaulieu, T. Nicholas, J. Gardner, A. Goodwin and V. Deringer
Chemical Communications 2023
Paper, and Pre-print

Using spectroscopy to probe relaxation, decoherence, and localization of photoexcited states in π-conjugated polymers

W. Barford, J. Gardner, and J. R. Mannouch
Faraday Discussions 2020
Paper

Experience

In reverse chronological order, I have worked at:

Education

From 2015 to 2019, I studied for a Masters in Chemistry at the University of Oxford, receiving First Class Honours and placing 4th out of 180 in my year. For my Part II year, I was supervised by Prof. William Barford, and produced a thesis entitled:

From 2020 to 2022, I studied for a Masters in Computer Science at the University of York, receiving a Distinction. My thesis was entitled:

Since 2021, I have been studying for a PhD in ML for Chemistry at the University of Oxford, supervised by Prof. Volker Deringer. I intend to graduate in 2026.

During my education, I have been awarded the following prizes:

Hobbies

In my spare time, I'm a voracious reader, particularly of science fiction. Some of my favourite books include The Metamorphosis of Prime Intellect and Harry Potter and the Methods of Rationality. I'm also partial to literature from the mid-1800s, such as The Count of Monte Cristo and Crime and Punishment. Aside from reading books, I also frequent several blogs and message boards, including Less Wrong and (as should be obvious from the style of this site) gwern.net.

When I'm not reading or working, I can be found outside running, cycling, swimming, hiking, or (as of very recently) bouldering.
