John Gardner

I’m a DPhil student in the Inorganic Chemistry Laboratory at the University of Oxford. My research is supervised by Prof. Volker Deringer and funded by an EPA Cephalosporin Scholarship, an EPSRC DTP award and directly by the Department of Chemistry, University of Oxford.

I am broadly interested in (modern) machine learning and its application to materials science. My recent research interests include the investigation of synthetic data labels as a general pre-training task for atomistic deep learning, and the development of more rigorous testing frameworks for validating machine-learned interatomic potentials. Previously, I have also worked on interpretability methods for computer vision models, and on the computational simulation of photo-dynamics in organic semiconductors.

Before starting my DPhil, I studied for an MChem in Chemistry at the University of Oxford, ranking 4th in my year. During the summer of my 3rd year, I worked as an intern at TPP, developing NLP-based ML models for accelerated disease diagnosis. After my undergraduate degree, I worked for 2 years in an industrial ML research group, predominantly on computer vision models for autonomous navigation and Monte Carlo tree search for warehouse optimization. Alongside this, I completed an MSc in Computer Science and Data Analytics at the University of York, achieving a Distinction.

📖 Publications

Synthetic pre-training for neural-network interatomic potentials

John L.A. Gardner, Kathryn T. Baker and Volker L. Deringer

🔖Paper — 📝Pre-print — 🤖Code

Machine learning (ML) based interatomic potentials have transformed the field of atomistic materials modelling. However, ML potentials depend critically on the quality and quantity of quantum-mechanical reference data with which they are trained, and therefore developing datasets and training pipelines is becoming an increasingly central challenge. Leveraging the idea of “synthetic” (artificial) data that is common in other areas of ML research, we here show that synthetic atomistic data, themselves obtained at scale with an existing ML potential, constitute a useful pre-training task for neural-network interatomic potential models. Once pre-trained with a large synthetic dataset, these models can be fine-tuned on a much smaller, quantum-mechanical one, improving numerical accuracy and stability in computational practice. We demonstrate feasibility for a series of equivariant graph-neural-network potentials for carbon, and we carry out initial experiments to test the limits of the approach.
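As a rough illustration of the recipe described above (and emphatically not the code used in the paper), the workflow boils down to two training stages. The toy MLP, random stand-in data and hyper-parameters below are placeholder assumptions:

```python
import torch
from torch import nn

# Toy sketch of the two-stage recipe -- not the paper's actual code.
# A small MLP over fixed-size descriptors stands in for an equivariant
# graph-network potential, and random tensors stand in for real datasets.

model = nn.Sequential(nn.Linear(32, 64), nn.SiLU(), nn.Linear(64, 1))

def make_loader(n_structures: int):
    # placeholder data: 32-dimensional descriptors with scalar energy labels
    x, y = torch.randn(n_structures, 32), torch.randn(n_structures, 1)
    dataset = torch.utils.data.TensorDataset(x, y)
    return torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True)

def train(model, loader, epochs: int, lr: float):
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = nn.functional.mse_loss(model(x), y)
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()

# 1. pre-train on a large, cheaply generated, synthetically labelled dataset
train(model, make_loader(50_000), epochs=5, lr=1e-3)

# 2. fine-tune on a much smaller quantum-mechanical dataset
train(model, make_loader(500), epochs=50, lr=1e-4)
```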

Synthetic data enable experiments in atomistic machine learning

John L.A. Gardner, Zoé Faure Beaulieu and Volker L. Deringer

🔖Paper — 📝Pre-print — 🤖Code — 💿Data

Machine-learning models are increasingly used to predict properties of atoms in chemical systems. There have been major advances in developing descriptors and regression frameworks for this task, typically starting from (relatively) small sets of quantum-mechanical reference data. Larger datasets of this kind are becoming available, but remain expensive to generate. Here we demonstrate the use of a large dataset that we have “synthetically” labelled with per-atom energies from an existing ML potential model. The cheapness of this process, compared to the quantum-mechanical ground truth, allows us to generate millions of datapoints, in turn enabling rapid experimentation with atomistic ML models from the small- to the large-data regime. This approach allows us here to compare regression frameworks in depth, and to explore visualisation based on learned representations. We also show that learning synthetic data labels can be a useful pre-training task for subsequent fine-tuning on small datasets. In the future, we expect that our open-sourced dataset, and similar ones, will be useful in rapidly exploring deep-learning models in the limit of abundant chemical data.
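The labelling step itself is cheap enough to run over millions of structures. A hypothetical sketch using ASE, with its toy EMT potential standing in for a trained ML potential and copper standing in for the actual chemistry of the paper:

```python
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.io import write

# Hypothetical sketch of "synthetic" labelling -- not the pipeline from the
# paper. A cheap existing model (here ASE's toy EMT potential) assigns
# per-atom energy labels to many structures, avoiding expensive
# quantum-mechanical calculations.

labelled = []
for i in range(100):
    atoms = bulk("Cu", cubic=True) * (2, 2, 2)   # 32-atom copper cell
    atoms.rattle(stdev=0.05, seed=i)             # perturb to diversify the data
    atoms.calc = EMT()
    # store the cheap, "synthetic" per-atom energies as training labels
    atoms.arrays["synthetic_energies"] = atoms.get_potential_energies()
    labelled.append(atoms)

write("synthetically-labelled.xyz", labelled)
```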

Coarse-grained versus fully atomistic machine learning for zeolitic imidazolate frameworks

Zoé Faure Beaulieu, Thomas C. Nicholas, John L. A. Gardner, Andrew L. Goodwin, Volker L. Deringer

🔖Paper — 📝Pre-print — 🤖Code — 💿Data

Zeolitic imidazolate frameworks are widely thought of as being analogous to inorganic AB₂ phases. We test the validity of this assumption by comparing simplified and fully atomistic machine-learning models for local environments in ZIFs. Our work addresses the central question to what extent chemical information can be “coarse-grained” in hybrid framework materials.

How to validate machine-learned interatomic potentials

Joe D. Morrow, John L.A. Gardner and Volker L. Deringer

🔖Paper — 📝Pre-print — 🤖Code

Machine learning (ML) approaches enable large-scale atomistic simulations with near-quantum-mechanical accuracy. With the growing availability of these methods there arises a need for careful validation, particularly for physically agnostic models, that is, for potentials which extract the nature of atomic interactions from reference data. Here, we review the basic principles behind ML potentials and their validation for atomic-scale materials modeling. We discuss best practice in defining error metrics based on numerical performance as well as physically guided validation. We give specific recommendations that we hope will be useful for the wider community, including those researchers who intend to use ML potentials for materials “off the shelf”.
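On the numerical side of such a validation, the headline metrics are typically energy and force errors on held-out reference data; physically guided tests (for example, stability in molecular dynamics) then probe behaviour beyond the test set. A hypothetical sketch of the former, with random arrays standing in for real predictions:

```python
import numpy as np

# Illustrative (hypothetical) numerical validation of an ML potential against
# held-out quantum-mechanical reference data. e_ref/e_pred are per-atom
# energies (eV/atom) and f_ref/f_pred are force components (eV/Å).

def rmse(a, b):
    return np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def mae(a, b):
    return np.mean(np.abs(np.asarray(a) - np.asarray(b)))

# placeholder arrays standing in for real test-set predictions
rng = np.random.default_rng(0)
e_ref, f_ref = rng.normal(size=100), rng.normal(size=(100, 50, 3))
e_pred = e_ref + rng.normal(scale=0.005, size=e_ref.shape)
f_pred = f_ref + rng.normal(scale=0.05, size=f_ref.shape)

print(f"energy RMSE: {rmse(e_pred, e_ref) * 1000:.1f} meV/atom")
print(f"force  MAE : {mae(f_pred, f_ref):.3f} eV/Å")
```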

Using spectroscopy to probe relaxation, decoherence, and localization of photoexcited states in π-conjugated polymers

William Barford, John L.A. Gardner, and Jonathan R. Mannouch

🔖Paper — 📋PDF

We use the coarse-grained Frenkel–Holstein model to simulate the relaxation, decoherence, and localization of photoexcited states in conformationally disordered π-conjugated polymers. The dynamics are computed via wave-packet propagation using matrix product states and the time evolution block decimation method. The ultrafast (i.e., t < 10 fs) coupling of an exciton to C–C bond vibrations creates an exciton–polaron. The relatively short (ca. 10 monomers) exciton-phonon correlation length causes ultrafast exciton-site decoherence, which is observable on conformationally disordered chains as fluorescence depolarization. Dissipative coupling to the environment (modelled via quantum jumps) causes the localization of quasi-extended exciton states (QEESs) onto local exciton ground states (LEGSs, i.e., chromophores). This is observable as lifetime broadening of the 0–0 transition (and vibronic satellites) of the QEES in two-dimensional electronic coherence spectroscopy. However, as this process is incoherent, neither population increases of the LEGSs nor coherences with LEGSs are observable.
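While the paper's simulations use matrix product states and TEBD, the basic phenomenon of disorder-induced localisation can be illustrated with a much simpler toy: a 1-D Frenkel-type tight-binding Hamiltonian with disordered couplings, whose band-edge eigenstates localise onto short chain segments (the "chromophores", or LEGSs). The model, parameters and units below are purely illustrative assumptions:

```python
import numpy as np

# Toy illustration (not the paper's MPS/TEBD machinery): a 1-D Frenkel-type
# Hamiltonian with disordered nearest-neighbour couplings, mimicking
# conformational disorder along the polymer chain.

rng = np.random.default_rng(1)
n = 100                                   # number of monomers
J = -1.0 + 0.3 * rng.normal(size=n - 1)   # disordered couplings (arb. units)

H = np.diag(J, k=1) + np.diag(J, k=-1)
energies, states = np.linalg.eigh(H)

# participation ratio ~ number of monomers an eigenstate spreads over
pr = 1 / np.sum(states ** 4, axis=0)
print(f"lowest-energy state spans ~{pr[0]:.0f} monomers of a {n}-monomer chain")
```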

🎤 Presentations

🤖 Code

I enjoy coding, both for my research and for fun. My (current) favourite languages are Python and Julia, but I also have professional experience with C, Java, JavaScript and TypeScript.

As part of my DPhil, I have developed and open-sourced the following pip-installable packages, both to accelerate my own research and, hopefully, to help others too:

load-atoms

“A package for loading atomistic datasets. (Large Open Access Datasets for ATOmistic Materials Science).”
docs repo pypi
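A minimal usage sketch, assuming the `load_dataset` entry point and the `"QM7"` dataset identifier (see the docs for the current API and the full list of datasets):

```python
from load_atoms import load_dataset

# assumes the load_dataset entry point and the "QM7" dataset name;
# see the package documentation for the available datasets
dataset = load_dataset("QM7")

print(len(dataset), "structures")
print(dataset[0])  # each entry is an ase.Atoms-like object
```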

locache

“A single-file utility library for caching the results of deterministic and pure function calls to disk”
repo pypi
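A minimal usage sketch, assuming a `persist` decorator as the main entry point (see the repo README for the actual API):

```python
from locache import persist  # assumed entry point; see the README

@persist
def expensive_calculation(n: int) -> float:
    # any slow, deterministic and pure computation
    return sum(i ** 0.5 for i in range(n))

expensive_calculation(10_000_000)  # computed once and cached to disk
expensive_calculation(10_000_000)  # subsequent calls read from the cache
```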

digital-experiments

“A lightweight python library for keeping track of, and optimizing, configuration and results for digital experiments.”
docs repo pypi
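A minimal usage sketch, assuming an `experiment` decorator as the main entry point (see the docs for the actual API):

```python
from digital_experiments import experiment  # assumed entry point; see the docs

@experiment
def accuracy(learning_rate: float, width: int) -> float:
    # stand-in for a real training run; each call's configuration
    # (its arguments) and result are recorded to disk by the decorator
    return 1 - abs(learning_rate - 0.01) - 1 / width

accuracy(0.01, width=64)
accuracy(0.05, width=128)
```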

🎓 Other Works