Core

The primary functionality of the digital-experiments package is exposed via the experiment() decorator. Wrapping a function with this decorator returns an Experiment object with the same signature as the original function.

Use the observations() method of the resulting Experiment object to access the results of the experiment, which are stored as Observation objects.

Within an experiment, use the current_id() and current_dir() functions to access the automatically assigned experiment ID and related storage directory.

Use the time_block() function to time certain blocks of code within the experiment.

Kitchen Sink example

The following example demonstrates the full range of available functionality:

from pathlib import Path
from time import sleep

from digital_experiments import current_dir, current_id, experiment, time_block
from digital_experiments.callbacks import SaveLogs


@experiment(
    backend="json",
    cache=True,
    callbacks=[SaveLogs("my-logs.txt")],
    root=Path("results"),
    verbose=True,
)
def my_experiment(a: int, b: int) -> int:
    with time_block("add"):
        sleep(0.5)
        c = a + b

    with time_block("multiply"):
        sleep(0.5)
        c = c * 2

    print("this will appear in the logs")
    (current_dir() / "output.txt").write_text(f"hello from {current_id()}")
    return c


# new experiment:
my_experiment(1, 2)
# "this will appear in the logs"
# (returns 6)

# don't record the experiment again due to cache=True
my_experiment(1, 2)
# (returns 6)

# get the observation
observation = my_experiment.observations()[-1]

print(my_experiment.artefacts(observation.id))
# prints a list of paths, something like [Path('results/storage/<id>/my-logs'), ...]

for a in range(10):
    my_experiment(a, a)

# access the results as a pandas dataframe
df = my_experiment.to_dataframe()

Available functions

digital_experiments.experiment(function: Callable) → Experiment

digital_experiments.experiment(*, root: Path | None = None, verbose: bool = False, backend: str = 'json', cache: bool = False, callbacks: list[Callback] | None = None) → Callable[[Callable], Experiment]

Decorator to automate the recording of experiments.

Examples

As a simple decorator, using all defaults:

@experiment
def add(a, b):
    return a + b

add(1, 2)
# 3

add.observations() # returns a list of observations
# [Observation(<id1>, {'a': 1, 'b': 2} → 3)]

As a decorator with some custom options specified:

@experiment(root="my-experiments", verbose=True, backend="json")
def add(a, b):
    return a + b

Parameters:

function – The function to wrap
root – The root directory for storing results. If not specified, the value of the DE_ROOT environment variable is used if it is set; otherwise the default ./experiments/<function_name> is used.
verbose – Whether to print progress to stdout
backend – The type of backend to use for storing results. See the backends page for more details.
cache – Whether to use cached results if available
callbacks – A list of optional callbacks to use. See the callbacks page for more details.
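
The fallback order described for root can be sketched in a few lines. This is only an illustration of the documented order, not the library's code: resolve_root is a hypothetical helper, and treating an explicit root as taking priority over DE_ROOT is an assumption based on the wording "if not specified".

```python
import os
from pathlib import Path
from typing import Callable, Optional, Union


def resolve_root(function: Callable, root: Optional[Union[str, Path]] = None) -> Path:
    """Hypothetical helper: choose the storage root in the documented order."""
    if root is not None:
        # 1. an explicitly passed root is used as-is
        return Path(root)
    from_environment = os.environ.get("DE_ROOT")
    if from_environment:
        # 2. otherwise fall back to the DE_ROOT environment variable
        return Path(from_environment)
    # 3. otherwise use the documented default, ./experiments/<function_name>
    return Path("experiments") / function.__name__


def add(a, b):
    return a + b


os.environ.pop("DE_ROOT", None)
default_root = resolve_root(add)          # experiments/add
explicit_root = resolve_root(add, "out")  # out

os.environ["DE_ROOT"] = "shared-results"
env_root = resolve_root(add)              # shared-results
os.environ.pop("DE_ROOT", None)
```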

digital_experiments.current_id() → str

Get the id of the currently running experiment.

Example

from digital_experiments import experiment, current_id

@experiment
def example():
    print(current_id())

example() # prints something like "2021-01-01_12:00:00.000000"

digital_experiments.current_dir() → Path

Get the directory of the currently running experiment.

Use this function within an experiment to get a unique directory per experiment run to store results in. Anything stored in this directory can be accessed later using the artefacts method.

Example

from digital_experiments import experiment, current_dir

@experiment
def example():
    (current_dir() / "results.txt").write_text("hello world")

example()
id = example.observations()[-1].id
example.artefacts(id) # returns [Path("<some>/<path>/<id>/results.txt")]
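
For readers wondering how argument-free functions such as current_id() and current_dir() can know which run is active, one common pattern is a context variable that the wrapper sets before calling the function and resets afterwards. The sketch below uses only the standard library. Every name in it (sketch_current_id, sketch_current_dir, run_with_id) is hypothetical, and both the mechanism and the error raised outside a run are assumptions rather than a description of how digital-experiments is implemented.

```python
import tempfile
from contextvars import ContextVar
from pathlib import Path

# Holds the id of the run that is currently executing (hypothetical mechanism).
_active_run = ContextVar("active_run")


def sketch_current_id() -> str:
    """Return the id of the active run, or fail if called outside of one."""
    try:
        return _active_run.get()
    except LookupError:
        raise RuntimeError("no experiment is currently running") from None


def sketch_current_dir(root: Path) -> Path:
    """Return (and create) a directory that is unique to the active run."""
    directory = root / sketch_current_id()
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def run_with_id(function, run_id: str):
    """Call function() with run_id marked as the active run."""
    token = _active_run.set(run_id)
    try:
        return function()
    finally:
        # always restore the previous state, even if the function raises
        _active_run.reset(token)


storage_root = Path(tempfile.mkdtemp())


def example():
    target = sketch_current_dir(storage_root) / "results.txt"
    target.write_text(f"hello from {sketch_current_id()}")
    return target


written = run_with_id(example, "run-1")
```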

digital_experiments.time_block(name: str)

Time the code that runs inside this context manager.

The start, end and duration are added into metadata["timing"][name] of the currently active experiment.

Parameters:

name (str) – The name of the timing block

Example

import time
from digital_experiments import experiment, time_block

@experiment
def example():
    with time_block("custom-block"):
        time.sleep(1)

example()
example.observations()[-1].metadata["timing"]["custom-block"]
# returns something like:
# {
#     "start": "2021-01-01 12:00:00",
#     "end": "2021-01-01 12:00:01",
#     "duration": 1.0,
# }
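
To make the recorded structure concrete, here is a minimal standard-library sketch of a timing context manager that stores start, end and duration under a name. The function timed and the module-level metadata dict are hypothetical stand-ins for time_block() and the active experiment's metadata. The timestamp format simply mirrors the example output above, and the real implementation may differ.

```python
import time
from contextlib import contextmanager
from datetime import datetime

# Stand-in for the metadata of the currently active experiment (an assumption).
metadata = {"timing": {}}


@contextmanager
def timed(name: str):
    """Hypothetical stand-in for time_block: record start, end and duration."""
    start = datetime.now()
    tick = time.perf_counter()
    try:
        yield
    finally:
        # runs even if the block raises, so the timing is still recorded
        duration = time.perf_counter() - tick
        metadata["timing"][name] = {
            "start": start.strftime("%Y-%m-%d %H:%M:%S"),
            "end": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            "duration": duration,
        }


with timed("custom-block"):
    time.sleep(0.1)

timing = metadata["timing"]["custom-block"]
```

Measuring the duration with a monotonic clock (time.perf_counter) rather than subtracting wall-clock timestamps keeps it unaffected by system clock adjustments.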

Internal classes

class digital_experiments.core.Experiment(function: Callable, backend: Backend, callbacks: list[Callback], cache: bool)

An Experiment object wraps a function and records its results.

The resulting object can be called identically to the original function, but has the additional observations() method, which returns a list of Observation objects corresponding to previous runs of the function, from both the current and previous Python sessions.

See @experiment for the intended entry point to this class.

observations(current_code_only: bool = True) → list[Observation]

Get a list of all previous observations of this experiment. By default, this will include observations from previous Python sessions.

Parameters:

current_code_only (bool) – Whether to only return observations from the current version of the code. Defaults to True.

Example

@experiment
def example(a, b=2):
    return a + b

example(1)  # returns 3

example.observations()
# returns [Observation(<id>, {'a': 1, 'b': 2} → 3)]
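
The example above shows the default b=2 appearing in the recorded config even though it was not passed. A minimal in-memory sketch of how a wrapper could resolve arguments and defaults into one config, and reuse a stored result when caching is enabled, is given below. Recorder is a hypothetical class, not the library's Experiment. It keeps observations in a list rather than in a backend, and matching cached runs by equal config is an assumption.

```python
import inspect
from typing import Callable


class Recorder:
    """Hypothetical, in-memory stand-in for an Experiment object."""

    def __init__(self, function: Callable, cache: bool = False):
        self.function = function
        self.cache = cache
        self._observations = []

    def __call__(self, *args, **kwargs):
        # resolve positional args, keyword args and defaults into one dict
        bound = inspect.signature(self.function).bind(*args, **kwargs)
        bound.apply_defaults()
        config = dict(bound.arguments)

        if self.cache:
            for previous in self._observations:
                if previous["config"] == config:
                    # same configuration seen before: reuse, record nothing new
                    return previous["result"]

        result = self.function(*args, **kwargs)
        self._observations.append({"config": config, "result": result})
        return result

    def observations(self):
        # return a copy so callers cannot mutate the stored history
        return list(self._observations)


def example(a, b=2):
    return a + b


recorded = Recorder(example, cache=True)
first = recorded(1)        # runs the function and records it
second = recorded(1, b=2)  # same resolved config, served from the cache
third = recorded(5)        # new config, recorded as a second observation
history = recorded.observations()
```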

artefacts(id: str) → list[Path]

Get a list of artefacts associated with a particular observation.

Add artefacts to an experiment run by writing files to the path returned by current_dir().

Parameters:

id (str) – The id of the observation to get artefacts for

Example

from digital_experiments import experiment, current_dir

@experiment
def example():
    (current_dir() / "results.txt").write_text("hello world")

example()
id = example.observations()[-1].id
example.artefacts(id)
# returns [Path("<some>/<path>/<id>/results.txt")]

to_dataframe(current_code_only: bool = True, include_metadata: bool = False, normalising_sep: str = '.')

Get a pandas DataFrame containing all observations of this experiment.

The resulting DataFrame is in “long” format, with one row per observation, and “normalised” (see pandas.json_normalize()) so that nested dict-like objects (including config, results and metadata) are flattened and cast into multiple columns.

Parameters:

current_code_only (bool) – Whether to only return observations from the current version of the code. Defaults to True.
include_metadata (bool) – Whether to include metadata in the DataFrame. Defaults to False.
normalising_sep (str) – The separator to use when normalising nested dictionaries. Defaults to “.”.

Returns:

A DataFrame containing all observations of this experiment. If pandas is not installed, this will raise an ImportError.

Return type:

pandas.DataFrame

Example

>>> @experiment
... def example(a, b=2):
...     return a + b

>>> example(1)
3
>>> example.to_dataframe()
   id  config.a  config.b  result
0   1         1         2       3
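
The column names config.a and config.b come from flattening nested dictionaries and joining the keys with the separator. The sketch below reproduces that naming with a small standard-library function so the effect of normalising_sep is easy to see. flatten is a hypothetical helper written for illustration. It is not pandas.json_normalize() and does not cover all of its behaviour (for example, lists of records).

```python
def flatten(nested: dict, sep: str = ".", prefix: str = "") -> dict:
    """Hypothetical helper: flatten nested dicts into sep-joined column names."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # recurse into nested dicts, extending the column name
            flat.update(flatten(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


row = flatten({"id": "<id>", "config": {"a": 1, "b": 2}, "result": 3})
# {'id': '<id>', 'config.a': 1, 'config.b': 2, 'result': 3}

timing_row = flatten({"metadata": {"timing": {"add": {"duration": 0.5}}}}, sep="_")
# {'metadata_timing_add_duration': 0.5}
```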

class digital_experiments.core.Observation(id: str, config: dict[str, Any], result: Any, metadata: dict[str, Any])

Container for a single observation of an Experiment.

Each observation is composed of a unique id, the complete configuration (args, kwargs and defaults) used to run the experiment, the returned result, and a dictionary of metadata.

Parameters:

id (str) – A unique identifier for this observation
config (dict[str, Any]) – The configuration passed to the experiment to produce this observation
result (Any) – The result of the experiment
metadata (dict[str, Any]) – A dictionary of metadata about the observation
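
As an illustration of this four-field container, the sketch below defines a dataclass with the same fields and round-trips it through JSON. ObservationSketch is a hypothetical stand-in, and the serialised layout shown is an assumption. The actual on-disk format used by the "json" backend is not described here and may differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class ObservationSketch:
    """Hypothetical stand-in with the same four fields as Observation."""

    id: str
    config: dict
    result: Any
    metadata: dict


observation = ObservationSketch(
    id="run-1",
    config={"a": 1, "b": 2},
    result=3,
    metadata={"timing": {"add": {"duration": 0.5}}},
)

# serialise to a JSON string and rebuild an equal object from it
encoded = json.dumps(asdict(observation))
restored = ObservationSketch(**json.loads(encoded))
```

A round trip like this only works when config, result and metadata contain JSON-serialisable values.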