Theory

An (uncharged, ground-state) atomic structure containing \(N\) atoms is completely defined by:

  • the positions of its atoms, \(\vec{R} \in \mathbb{R}^{N \times 3}\)

  • their atomic numbers, \({Z} \in \mathbb{Z}^N\)

  • the unit cell, \(C \in \mathbb{R}^{3 \times 3}\) (if the structure is periodic)

Energy

Since these three properties fully define the structure, it must be that the total energy, \(E \in \mathbb{R}\), can be expressed solely as a function of these three properties:

\[E = f\left(\vec{R}, Z, C\right)\]

Forces

The force on atom \(i\), \(\vec{F}_i \in \mathbb{R}^3\), is given by the negative gradient of the energy with respect to that atom’s position:

\[\vec{F}_i = -\frac{\partial E}{\partial \vec{R}_i}\]

By using torch.Tensor representations of \(\vec{R}\), \(Z\), and \(C\), and ensuring that the energy function, \(f\), makes use of torch operations, we can leverage automatic differentiation, supplied by torch.autograd.grad(), to calculate the forces on the atoms in a structure “for free” from any energy prediction.

Stress

Consider scaling a structure, that is “stretching” both the atomic positions and unit cell, by some amount, \(1 + \lambda\), along the \(x\) direction. This operation, \(\hat{O}_{\lambda}\), acts on the atomic positions, \(\vec{R}\), according to:

\[\begin{split}\hat{O}_{\lambda} \left(\vec{R}\right) = \vec{R} \times \begin{pmatrix} 1 + \lambda & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \vec{R} + \lambda R_x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\end{split}\]

and analogously for the structure’s unit cell, \(C\). The response of the energy to this transformation gives an indication as to the stress acting on the structure. If

\[\frac{\partial E\left[\hat{O}_{\lambda}(\vec{R}), Z, \hat{O}_{\lambda}(C)\right]}{\partial \lambda} \bigg|_{\lambda=0} = \frac{\partial E}{\partial \lambda} \bigg|_{\lambda=0} < 0\]

then the energy decreases as the unit cell expands from its current state. This would indicate that the system is under compressive stress (along the \(x\) direction) and “wants” to expand.

Now consider a more general scaling operation, \(\hat{\mathbf{O}}_{\mathbf{\lambda}}\), that symmetrically scales both the atomic positions and unit cell as:

\[\begin{split}\begin{aligned} \hat{\mathbf{O}}_{\mathbf{\lambda}} \left(\vec{R}\right) &= \vec{R} \times \begin{pmatrix} 1 + \lambda_{xx} & \lambda_{xy} & \lambda_{xz} \\ \lambda_{yx} & 1 + \lambda_{yy} & \lambda_{yz} \\ \lambda_{zx} & \lambda_{zy} & 1 + \lambda_{zz} \end{pmatrix} \\ &= \vec{R} + \vec{R} \times \begin{pmatrix} \lambda_{xx} & \lambda_{xy} & \lambda_{xz} \\ \lambda_{yx} & \lambda_{yy} & \lambda_{yz} \\ \lambda_{zx} & \lambda_{zy} & \lambda_{zz} \end{pmatrix} \end{aligned}\end{split}\]

where, due to the symmetry of the expansion, \(\lambda_{ij} = \lambda_{ji} \quad \forall \; i \neq j \in \{x,y,z\}\).

The diagonal terms of this matrix again correspond to the compressive/dilative stress along each of the Cartesian axes.

The off-diagonal terms describe the shear stress, i.e. the tendency of the structure to slide in one plane relative to another.

In graph-pes, we follow the common definition of the stress tensor, \(\mathbf{\sigma} \in \mathbb{R}^{3 \times 3}\), as the derivative of the total energy with respect to these stretching coefficients, as normalised by the cell’s volume, \(V = \det(\mathbf{C})\): [1]

\[\mathbf{\sigma} = \frac{1}{V} \frac{\partial E}{\partial \mathbf{\lambda}} \bigg|_{\mathbf{\lambda} = 0} \quad \quad \sigma_{ij} = \frac{1}{V} \frac{\partial E}{\partial \lambda_{ij}} \bigg|_{\lambda_{ij} = 0}\]

We can again make use of automatic differentiation to calculate these stress tensors. To do this, we:

  1. define a symmetrized \(\mathbf{\lambda} = 0^{3 \times 3}\). This is all zeros since we
    • don’t want to actually change the atomic positions or unit cell for the energy calculation

    • want to evaluate the derivative at \(\mathbf{\lambda} = 0\)

  2. apply the scaling operation, \(\hat{\mathbf{O}}_{\mathbf{\lambda}}\), to the atomic positions and unit cell
    • again this is a no-op due to evaluating the scaling operation at \(\mathbf{\lambda} = 0\), but introduces the scaling coefficients into the computational graph

  3. evaluate the energy

  4. calculate the derivative of the energy with respect to \(\mathbf{\lambda}\) and normalise by the cell’s volume

Virial

In graph-pes, we follow the common definition of the virial stress tensor, \(\mathbf{W} \in \mathbb{R}^{3 \times 3}\), as:

\[\begin{split}\begin{aligned} \mathbf{W} &= - \frac{\partial E}{\partial \mathbf{\lambda}} \bigg|_{\mathbf{\lambda} = 0} \\ & = - \text{stress} \times V \end{aligned}\end{split}\]

i.e. as the negative of the stress tensor scaled by the cell’s volume. Hence, the virial stress tensor is an extensive property, while the stress tensor is an intensive property.

Correctness

Below we (empirically) show the correctness of the predict() implementation on GraphPESModel instances by comparing:

  1. predictions from the inbuilt graph_pes.models.LennardJones model against those from ASE’s LennardJones calculator.

  2. "force" and "stress" predictions from arbitrary GraphPESModel instances against finite-difference approximations of those quantities.

Lennard-Jones

[2]:
from __future__ import annotations

import ase
from ase.calculators.lj import LennardJones as ASE_LennardJones
from graph_pes.models import LennardJones as GraphPES_LennardJones
from graph_pes.utils.calculator import GraphPESCalculator

lj_properties = {
    "rc": 2.0,
    "sigma": 1.0,
    "epsilon": 1.0,
    "smooth": False,
}

ase_lj_calc = ASE_LennardJones(**lj_properties)
graph_pes_lj_calc = GraphPESCalculator(
    model=GraphPES_LennardJones.from_ase(**lj_properties)
)

dimer = ase.Atoms(
    positions=[
        [0, 0, 0],
        [0, 0, 0.98],
    ]
)

ase_lj_calc.get_potential_energy(dimer)
[2]:
0.643428299130453
[3]:
graph_pes_lj_calc.get_potential_energy(dimer)
[3]:
0.643427848815918
[4]:
ase_lj_calc.get_forces(dimer)
[4]:
array([[  0.        ,   0.        , -34.77113701],
       [  0.        ,   0.        ,  34.77113701]])
[6]:
graph_pes_lj_calc.get_forces(dimer)

[6]:
array([[ -0.      ,  -0.      , -34.771133],
       [ -0.      ,  -0.      ,  34.771133]], dtype=float32)
[20]:
import numpy as np
from ase.stress import voigt_6_to_full_3x3_stress


def get_stress_matrix(calculator, structure):
    stress = calculator.get_stress(structure)
    if stress.shape != (3, 3):
        stress = voigt_6_to_full_3x3_stress(stress)
    return np.round(stress, 3)


cell_size = 1
periodic_structure = ase.Atoms(
    positions=[
        [cell_size / 2, cell_size / 2, cell_size / 2],
    ],
    cell=[cell_size, cell_size, cell_size],
    pbc=True,
)

get_stress_matrix(ase_lj_calc, periodic_structure)
[20]:
array([[-18.039,  -0.   ,   0.   ],
       [ -0.   , -18.039,   0.   ],
       [  0.   ,   0.   , -18.039]])
[21]:
get_stress_matrix(graph_pes_lj_calc, periodic_structure)
[21]:
array([[-18.039,   0.   ,   0.   ],
       [  0.   , -18.039,  -0.   ],
       [  0.   ,  -0.   , -18.039]])

Finite-differences

The force on atom \(i\) along direction \(d\) is given by:

\[F^{(d)}_i = -\frac{\partial E}{\partial R^{(d)}_i}\]

which we can estimate by perturbing \(R^{(d)}_i\) and calculating the finite-difference:

\[F^{(d)}_i \approx \frac{E\left(R^{(d)}_i + \epsilon\right) - E\left(R^{(d)}_i - \epsilon\right)}{2\epsilon}\]

We can also estimate the stress tensor in a simlar fashion.

[9]:
from typing import Callable, Literal


def finite_difference_force_estimate(
    energy_function: Callable[[ase.Atoms], float],
    atoms: ase.Atoms,
    atom_index: int,
    direction: Literal["x", "y", "z"],
    epsilon: float = 1e-4,
) -> float:
    direction_index = {"x": 0, "y": 1, "z": 2}[direction]

    copy1 = atoms.copy()
    copy1.positions[atom_index][direction_index] += epsilon
    copy2 = atoms.copy()
    copy2.positions[atom_index][direction_index] -= epsilon

    f1 = energy_function(copy1)
    f2 = energy_function(copy2)
    return (f2 - f1) / (2 * epsilon)


def finite_difference_stress_component(
    energy_function: Callable[[ase.Atoms], float],
    atoms: ase.Atoms,
    component: tuple[int, int],
    epsilon: float = 1e-6,
) -> float:
    i, j = component

    scaling = np.eye(3)
    if i == j:
        scaling[i, j] += epsilon
    else:
        scaling[i, j] = scaling[j, i] = epsilon / 2

    copy = atoms.copy()
    copy.positions = atoms.positions @ scaling
    copy.cell = atoms.cell @ scaling

    f1 = energy_function(copy)
    f2 = energy_function(atoms)
    return (f1 - f2) / epsilon / atoms.get_volume()


def finite_difference_stress_estimate(
    energy_function: Callable[[ase.Atoms], float],
    atoms: ase.Atoms,
    epsilon: float = 1e-6,
) -> np.ndarray:
    stress = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            stress[i, j] = finite_difference_stress_component(
                energy_function, atoms, (i, j), epsilon
            )
    return stress

Lets check that our autograd-based implementation matches the finite-difference estimates:

[22]:
from graph_pes.models import SchNet

model = SchNet(cutoff=3.0)
schnet_calc = GraphPESCalculator(model)

cell_size = 4
n_atoms = 10
random_structure = ase.Atoms(
    positions=np.random.RandomState(42).rand(n_atoms, 3) * cell_size,
    cell=[cell_size, cell_size, cell_size],
    pbc=True,
)
[23]:
atom_index = 0
direction = "x"

fd_estimate = finite_difference_force_estimate(
    schnet_calc.get_potential_energy,
    random_structure,
    atom_index,
    direction,
    epsilon=1e-3,
)
autograd_value = schnet_calc.get_forces(random_structure)[atom_index][
    {"x": 0, "y": 1, "z": 2}[direction]
]

np.round(fd_estimate, 6), np.round(autograd_value, 6)
[23]:
(-0.034377, -0.034358)
[24]:
# finite-difference stress
fd_stress = finite_difference_stress_estimate(
    schnet_calc.get_potential_energy, random_structure, epsilon=1e-3
)
np.round(fd_stress, 6)

[24]:
array([[ 0.001911,  0.002092, -0.003447],
       [ 0.002092,  0.003045, -0.00231 ],
       [-0.003447, -0.00231 ,  0.005248]])
[25]:
# autograd stress
autograd_stress = voigt_6_to_full_3x3_stress(
    schnet_calc.get_stress(random_structure)
)
np.round(autograd_stress, 6)
[25]:
array([[ 0.001928,  0.002072, -0.003426],
       [ 0.002072,  0.003042, -0.002297],
       [-0.003426, -0.002297,  0.005268]])