Potential Fitting

C-GAP-17

The complete dataset and labels used to train and test the C-GAP-17 interatomic potential for amorphous carbon. This dataset was built in an iterative manner, and contains 4,530 structures, covering a wide range of densities, temperatures and degrees of dis/order. More detail can be found in the paper’s supplementary information.

C-GAP-20U

The complete dataset used for training the C-GAP-20U interatomic potential for carbon. Suitably converged labels were obtained with revised DFT settings; see CAM.840-6.

GO-MACE-23

The complete dataset ("iter-12-filtered") used for training and testing the GO-MACE-23 interatomic potential. This dataset covers a wide range of configurations relevant to graphene oxide, with several different stoichiometries and many different functional groups. DFT labels were generated using CASTEP with the PBE functional. The original data were obtained from Zenodo. Reference energies for isolated atoms are {"C": -148.6811580026, "H": -12.53432584235, "O": -431.6357255604}.
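The isolated-atom reference energies are typically used to shift total energies onto an atomization-energy scale before fitting or analysis. The following is a minimal sketch of that subtraction, assuming the reference energies share the units of the dataset's total-energy labels; the function names `atomization_energy` and `atomization_energy_per_atom` are illustrative and not part of any published API.

```python
from collections import Counter

# Isolated-atom reference energies quoted above.
# Assumption: same units as the total-energy labels in the dataset.
E0 = {"C": -148.6811580026, "H": -12.53432584235, "O": -431.6357255604}


def atomization_energy(symbols, total_energy, e0=E0):
    """Total energy minus the summed isolated-atom energies.

    `symbols` is a sequence of chemical symbols, one per atom.
    """
    counts = Counter(symbols)
    reference = sum(n * e0[element] for element, n in counts.items())
    return total_energy - reference


def atomization_energy_per_atom(symbols, total_energy, e0=E0):
    """Atomization energy normalised by the number of atoms."""
    return atomization_energy(symbols, total_energy, e0) / len(symbols)
```

For a structure with symbols `["C", "O", "O"]`, the reference term is `E0["C"] + 2 * E0["O"]`, so a structure whose total energy equals that sum has an atomization energy of zero.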

GST-GAP-22

The complete dataset used for training the GST-GAP-22 interatomic potential, as labelled using the PBE functional. This dataset covers a range of compositions along the \(\text{GeTe} \rightarrow \text{Sb}_2\text{Te}_3\) pseudo-binary line, and was created using a two-step iterative process. More details are available in the paper’s supplementary information. The original data were obtained from Zenodo.
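Compositions on the pseudo-binary line can be written as \((\text{GeTe})_m(\text{Sb}_2\text{Te}_3)_n\). The sketch below converts a pair \((m, n)\) into atomic fractions; the function name `pseudo_binary_fractions` is hypothetical, and the example says nothing about which compositions the dataset actually samples.

```python
from fractions import Fraction


def pseudo_binary_fractions(m, n):
    """Atomic fractions of Ge, Sb and Te for (GeTe)_m (Sb2Te3)_n."""
    if m < 0 or n < 0 or (m == 0 and n == 0):
        raise ValueError("m and n must be non-negative and not both zero")
    ge = m              # one Ge per GeTe unit
    sb = 2 * n          # two Sb per Sb2Te3 unit
    te = m + 3 * n      # one Te per GeTe unit, three per Sb2Te3 unit
    total = ge + sb + te
    return {
        "Ge": Fraction(ge, total),
        "Sb": Fraction(sb, total),
        "Te": Fraction(te, total),
    }
```

For example, \(m = 2\), \(n = 1\) corresponds to \(\text{Ge}_2\text{Sb}_2\text{Te}_5\), with atomic fractions 2/9, 2/9 and 5/9.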

P-GAP-20

The complete phosphorus dataset used to train the P-GAP-20 model from A General-Purpose Machine-Learning Force Field for Bulk and Nanostructured Phosphorus. This dataset contains structures generated by GAP-RSS, together with liquids, crystals and isolated fragments. For more information about the dataset’s construction, see the paper’s Supplementary Information.

Si-GAP-18

The complete dataset used to train the Si-GAP-18 model from Machine Learning a General-Purpose Interatomic Potential for Silicon. The CUR algorithm was used to select representative structures from a larger dataset. Energy and force labels were calculated using the PW91 exchange-correlation functional as implemented in CASTEP (see II.B: Database of the paper).
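To illustrate the idea behind CUR-based selection, the sketch below ranks structures by the leverage scores of a descriptor matrix and keeps the highest-scoring rows. This is a simplified, deterministic variant written for illustration only; it is not necessarily the procedure used by the authors, and the `descriptors` matrix (one row per structure) and the function name `cur_select_rows` are assumptions.

```python
import numpy as np


def cur_select_rows(descriptors, n_select, rank=None):
    """Return sorted indices of the `n_select` rows with the largest leverage scores.

    Leverage scores are computed from the leading `rank` left singular
    vectors of the descriptor matrix (rows = structures).
    """
    X = np.asarray(descriptors, dtype=float)
    if not 0 < n_select <= X.shape[0]:
        raise ValueError("n_select must be between 1 and the number of rows")
    k = rank if rank is not None else min(n_select, min(X.shape))
    # Thin SVD: the columns of U span the dominant row-space directions.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    # Leverage of row i: squared norm of its projection onto the top-k subspace.
    leverage = np.sum(U[:, :k] ** 2, axis=1) / k
    top = np.argsort(leverage)[::-1][:n_select]
    return np.sort(top)
```

Rows that contribute nothing to the dominant singular vectors receive a leverage score near zero and are dropped first, which is what makes the retained structures representative of the larger set.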

SiO2-GAP-22

The training database used to fit the GAP-22 potential for silica in: A Machine-Learned Interatomic Potential for Silica and Its Relation to Empirical Models. The dataset was generated using an iterative approach, in some cases driven by an empirical potential. More details are available in the supplementary information.

SiOx-ACE-24

The training database used to fit the SiOx-ACE-24 potential in: Modelling atomic and nanoscale structure in the silicon-oxygen system through active machine-learning. The dataset comprises structures taken from the Si-GAP-18 and SiO2-GAP-22 datasets, together with new structures generated using an active-learning approach.