.. This file is autogenerated by dev/scripts/generate_page.py

ANI-1x
======

.. grid:: 1 1 2 2

    .. grid-item::

        .. raw:: html
            :file: ../_static/visualisations/ANI-1x.html

    .. grid-item::
        :class: info-card

        The ANI-1x dataset is a comprehensive collection of labelled molecular
        structures designed for training machine learned potentials. ANI-1x was
        generated using an active learning approach to produce a diverse and
        useful dataset covering the chemical space of organic molecules composed
        of C, H, N, and O atoms. Accurate energy and force labels are provided
        for each structure using the :math:`\omega`\ B97x/6-31G(d) level of theory.
        Internally, files are downloaded from `FigShare `__.

        .. code-block:: pycon

            >>> from load_atoms import load_dataset
            >>> load_dataset("ANI-1x")
            ANI-1x:
                structures: 4,956,005
                atoms: 75,700,481
                species:
                    H: 47.63%
                    C: 30.30%
                    N: 13.32%
                    O: 8.75%
                properties:
                    per atom: (forces)
                    per structure: (dipole, energy, is_in_ccx)

License
-------

This dataset is licensed under the `CC0 `_ license.

Citation
--------

If you use this dataset in your work, please cite the following:

.. code-block:: latex

    @article{Smith-18-05,
        title = {
            Less Is More: Sampling Chemical Space with Active Learning
        },
        author = {
            Smith, Justin S. and Nebgen, Ben and Lubbers, Nicholas and
            Isayev, Olexandr and Roitberg, Adrian E.
        },
        year = {2018},
        journal = {The Journal of Chemical Physics},
        volume = {148},
        number = {24},
        doi = {10.1063/1.5023802},
    }

Properties
----------

**Per-atom**:

.. list-table::
    :header-rows: 1

    * - Property
      - Units
      - Type
      - Description
    * - :code:`forces`
      - eV/Å
      - :class:`ndarray(N, 3) `
      - force vectors (as labelled with :math:`\omega`\ B97x/6-31G(d))

**Per-structure**:

.. list-table::
    :header-rows: 1

    * - Property
      - Units
      - Type
      - Description
    * - :code:`energy`
      - eV
      - :class:`~float64`
      - energy of the structure (as labelled with :math:`\omega`\ B97x/6-31G(d))
    * - :code:`dipole`
      - e Å
      - :class:`ndarray(3,) `
      - dipole moment of the structure (as labelled with :math:`\omega`\ B97x/6-31G(d))
    * - :code:`is_in_ccx`
      -
      - :class:`~bool`
      - whether the structure is in the :doc:`/datasets/ANI-1ccx` subset

Miscellaneous information
-------------------------

``ANI-1x`` is imported as an
:class:`~load_atoms.atoms_dataset.LmdbAtomsDataset`:

.. dropdown:: Importer script for :code:`ANI-1x`

    .. literalinclude:: ../../../src/load_atoms/database/importers/ani_1x.py
        :language: python

.. dropdown:: :class:`~load_atoms.database.DatabaseEntry` for :code:`ANI-1x`

    .. code-block:: yaml

        name: ANI-1x
        year: 2018
        category: Benchmarks
        license: CC0
        minimum_load_atoms_version: 0.3
        format: lmdb
        description: |
            The ANI-1x dataset is a comprehensive collection of labelled molecular
            structures designed for training machine learned potentials. ANI-1x was
            generated using an active learning approach to produce a diverse and
            useful dataset covering the chemical space of organic molecules composed
            of C, H, N, and O atoms. Accurate energy and force labels are provided
            for each structure using the :math:`\omega`\ B97x/6-31G(d) level of theory.
            Internally, files are downloaded from `FigShare `__.
        citation: |
            @article{Smith-18-05,
                title = {
                    Less Is More: Sampling Chemical Space with Active Learning
                },
                author = {
                    Smith, Justin S. and Nebgen, Ben and Lubbers, Nicholas and
                    Isayev, Olexandr and Roitberg, Adrian E.
                },
                year = {2018},
                journal = {The Journal of Chemical Physics},
                volume = {148},
                number = {24},
                doi = {10.1063/1.5023802},
            }
        per_atom_properties:
            forces:
                desc: force vectors (as labelled with :math:`\omega`\ B97x/6-31G(d))
                units: eV/Å
        per_structure_properties:
            energy:
                desc: energy of the structure (as labelled with :math:`\omega`\ B97x/6-31G(d))
                units: eV
            dipole:
                desc: dipole moment of the structure (as labelled with :math:`\omega`\ B97x/6-31G(d))
                units: e Å
            is_in_ccx:
                desc: whether the structure is in the :doc:`/datasets/ANI-1ccx` subset
        representative_structure: 205_000
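
The property layout documented on this page (per-atom ``forces`` in eV/Å; per-structure ``energy`` in eV, ``dipole`` in e Å, and the boolean ``is_in_ccx``) can be illustrated with a small, self-contained sketch. It assumes that each structure exposes per-structure labels in an ``info`` mapping and per-atom labels in an ``arrays`` mapping, as ``ase.Atoms`` objects do. The ``Structure`` class, the helper functions, and every numerical value below are hypothetical stand-ins for illustration, not part of ``load_atoms`` and not data taken from ANI-1x. The unit conversion factors are rounded.

```python
import math
from dataclasses import dataclass, field

# Rounded conversion factors for the units listed in the property tables.
EV_TO_KCAL_PER_MOL = 23.0605  # 1 eV expressed in kcal/mol
E_ANGSTROM_TO_DEBYE = 4.8032  # 1 e Å expressed in debye


@dataclass
class Structure:
    """Hypothetical stand-in mimicking the info/arrays layout of ase.Atoms."""

    info: dict = field(default_factory=dict)    # per-structure properties
    arrays: dict = field(default_factory=dict)  # per-atom properties


def ccx_subset(structures):
    """Keep only the structures flagged as belonging to the ANI-1ccx subset."""
    return [s for s in structures if s.info["is_in_ccx"]]


def max_force_norm(structure):
    """Largest per-atom force magnitude, in eV/Å."""
    return max(
        math.sqrt(sum(c * c for c in force))
        for force in structure.arrays["forces"]
    )


def energy_kcal_per_mol(structure):
    """Convert the structure's energy label from eV to kcal/mol."""
    return structure.info["energy"] * EV_TO_KCAL_PER_MOL


def dipole_norm_debye(structure):
    """Magnitude of the dipole moment, converted from e Å to debye."""
    dipole = structure.info["dipole"]
    return math.sqrt(sum(c * c for c in dipole)) * E_ANGSTROM_TO_DEBYE


# Made-up values, chosen only to exercise the helpers above.
structures = [
    Structure(
        info={"energy": -1.0, "dipole": (1.0, 0.0, 0.0), "is_in_ccx": True},
        arrays={"forces": [(3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]},
    ),
    Structure(
        info={"energy": -2.0, "dipole": (0.0, 0.0, 0.0), "is_in_ccx": False},
        arrays={"forces": [(0.0, 0.0, 0.0)]},
    ),
]

subset = ccx_subset(structures)
print(len(subset), max_force_norm(subset[0]), energy_kcal_per_mol(subset[0]))
```

If the structures returned by ``load_dataset("ANI-1x")`` follow the same ``info``/``arrays`` layout, these helpers should apply to them unchanged, although iterating over all 4,956,005 structures in pure Python would be slow.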