The Basics
digital-experiments works straight out of the box:
[2]:
from digital_experiments import experiment

@experiment
def square(x):
    return x * x

[square(i) for i in range(5)]
[2]:
[0, 1, 4, 9, 16]
Get all the Observations from the experiment (these are persisted over multiple python sessions):
[3]:
square.observations()
[3]:
[Observation(2024-02-24_11-42-11_694149, {'x': 0} → 0),
Observation(2024-02-24_11-42-12_043764, {'x': 1} → 1),
Observation(2024-02-24_11-42-12_052862, {'x': 2} → 4),
Observation(2024-02-24_11-42-12_065819, {'x': 3} → 9),
Observation(2024-02-24_11-42-12_075406, {'x': 4} → 16)]
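The identifiers printed above look like timestamps with microsecond precision. As a minimal sketch, assuming the layout `%Y-%m-%d_%H-%M-%S_%f` (inferred from these examples, not a documented guarantee), an id can be turned back into a `datetime`. The helper name `parse_observation_id` is hypothetical and not part of digital-experiments.

```python
from datetime import datetime

# Assumed layout, inferred from the ids printed above; not a documented guarantee.
ID_FORMAT = "%Y-%m-%d_%H-%M-%S_%f"


def parse_observation_id(obs_id: str) -> datetime:
    """Convert an id such as '2024-02-24_11-42-11_694149' into a datetime."""
    return datetime.strptime(obs_id, ID_FORMAT)


ids = [
    "2024-02-24_11-42-12_043764",
    "2024-02-24_11-42-11_694149",
]

# With this layout, sorting the strings also sorts them chronologically.
first, second = sorted(ids)
elapsed = parse_observation_id(second) - parse_observation_id(first)
print(elapsed.total_seconds())
```

Because the fields run from most to least significant and are zero-padded, plain string sorting is enough to order runs by time under this assumption.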
If you have pandas installed, you can also use the to_dataframe method:
[4]:
square.to_dataframe()
[4]:
| | id | result | config.x |
|---|---|---|---|
| 0 | 2024-02-24_11-42-11_694149 | 0 | 0 |
| 1 | 2024-02-24_11-42-12_043764 | 1 | 1 |
| 2 | 2024-02-24_11-42-12_052862 | 4 | 2 |
| 3 | 2024-02-24_11-42-12_065819 | 9 | 3 |
| 4 | 2024-02-24_11-42-12_075406 | 16 | 4 |
Observations
Each Observation object is a light-weight wrapper around:
- a unique identifier (implemented as a timestamped string)
- the exact configuration (args, kwargs and defaults) used to run the experiment
- the result of the experiment (the return value of the function)
- a dictionary of metadata that internal and user-defined callback hooks can use to store other relevant information
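To illustrate what "args, kwargs and defaults" means for the stored configuration, here is a minimal sketch that uses only the standard library. The `record` decorator and the `Record` tuple are hypothetical stand-ins written for this example. They are not how digital-experiments is implemented, and they omit the identifier and metadata fields.

```python
import inspect
from functools import wraps
from typing import Any, Callable, Dict, List, NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for an Observation (config and result only)."""

    config: Dict[str, Any]
    result: Any


def record(func: Callable) -> Callable:
    """Hypothetical decorator: keep the full configuration and result of each call."""
    signature = inspect.signature(func)
    records: List[Record] = []

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # fill in parameters the caller left out
        result = func(*args, **kwargs)
        records.append(Record(dict(bound.arguments), result))
        return result

    wrapper.records = records
    return wrapper


@record
def power(x, exponent=2):
    return x ** exponent


power(3)               # exponent falls back to its default
power(2, exponent=5)   # exponent passed explicitly
print(power.records[0]._asdict())
```

Binding the call against the function signature means positional arguments, keyword arguments and untouched defaults all end up in one flat dictionary keyed by parameter name.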
[5]:
import json
_dict = square.observations()[0]._asdict()
print(json.dumps(_dict, indent=4))
{
"id": "2024-02-24_11-42-11_694149",
"config": {
"x": 0
},
"result": 0,
"metadata": {
"timing": {
"start": "2024-02-24 11:42:11",
"end": "2024-02-24 11:42:11",
"duration": 4e-06
},
"environment": {
"system": {
"platform": "macOS-14.2.1-arm64-arm-64bit",
"machine": "arm64",
"processor": "arm",
"system": "Darwin",
"python_version": "3.8.18",
"pwd": "/Users/john/projects/digital_experiments/docs/source"
},
"pip_freeze": "alabaster==0.7.13\nanyio==4.2.0\nappnope==0.1.3\nargon2-cffi==23.1.0\nargon2-cffi-bindings==21.2.0\narrow==1.3.0\nasttokens==2.4.1\nasync-lru==2.0.4\nattrs==23.2.0\nBabel==2.14.0\nbackcall==0.2.0\nbeautifulsoup4==4.12.3\nbleach==6.1.0\nbumpver==2023.1129\ncertifi==2023.11.17\ncffi==1.16.0\ncharset-normalizer==3.3.2\nclick==8.1.7\ncolorama==0.4.6\ncomm==0.2.1\ncoverage==7.4.0\ndebugpy==1.8.0\ndecorator==5.1.1\ndefusedxml==0.7.1\n-e git+https://github.com/jla-gardner/digital-experiments.git@581282ea698f7c031ec04fb17a199349fd6e04be#egg=digital_experiments\ndocutils==0.20.1\nexceptiongroup==1.2.0\nexecuting==2.0.1\nfastjsonschema==2.19.1\nfqdn==1.5.1\nfuro==2023.9.10\nidna==3.6\nimagesize==1.4.1\nimportlib-metadata==7.0.1\nimportlib-resources==6.1.1\niniconfig==2.0.0\nipykernel==6.29.0\nipython==8.12.3\nisoduration==20.11.0\njedi==0.19.1\nJinja2==3.1.3\njson5==0.9.14\njsonpointer==2.4\njsonschema==4.21.1\njsonschema-specifications==2023.12.1\njupyter-events==0.9.0\njupyter-lsp==2.2.2\njupyter_client==8.6.0\njupyter_core==5.7.1\njupyter_server==2.12.5\njupyter_server_terminals==0.5.1\njupyterlab==4.0.11\njupyterlab_pygments==0.3.0\njupyterlab_server==2.25.2\nlexid==2021.1006\nlivereload==2.6.3\nlooseversion==1.3.0\nMarkupSafe==2.1.4\nmatplotlib-inline==0.1.6\nmistune==3.0.2\nnbclient==0.9.0\nnbconvert==7.14.2\nnbformat==5.9.2\nnbsphinx==0.9.3\nnest-asyncio==1.5.9\nnotebook==7.0.7\nnotebook_shim==0.2.3\nnumpy==1.24.4\noverrides==7.6.0\npackaging==23.2\npandas==2.0.3\npandocfilters==1.5.1\nparso==0.8.3\npexpect==4.9.0\npickleshare==0.7.5\npkgutil_resolve_name==1.3.10\nplatformdirs==4.1.0\npluggy==1.3.0\nprometheus-client==0.19.0\nprompt-toolkit==3.0.43\npsutil==5.9.8\nptyprocess==0.7.0\npure-eval==0.2.2\npycparser==2.21\nPygments==2.17.2\npytest==7.4.4\npytest-cov==4.1.0\npython-dateutil==2.8.2\npython-json-logger==2.0.7\npytz==2023.3.post1\nPyYAML==6.0.1\npyzmq==25.1.2\nreferencing==0.32.1\nrequests==2.31.0\nrfc3339-validator==0.1.4\nrfc3986-validator==0.1.
1\nrpds-py==0.17.1\nruff==0.1.14\nSend2Trash==1.8.2\nsix==1.16.0\nsniffio==1.3.0\nsnowballstemmer==2.2.0\nsoupsieve==2.5\nSphinx==7.1.2\nsphinx-autobuild==2021.3.14\nsphinx-basic-ng==1.0.0b2\nsphinx-copybutton==0.5.2\nsphinx_design==0.5.0\nsphinxcontrib-applehelp==1.0.4\nsphinxcontrib-devhelp==1.0.2\nsphinxcontrib-htmlhelp==2.0.1\nsphinxcontrib-jsmath==1.0.1\nsphinxcontrib-qthelp==1.0.3\nsphinxcontrib-serializinghtml==1.1.5\nsphinxext-opengraph==0.9.1\nstack-data==0.6.3\nterminado==0.18.0\ntinycss2==1.2.1\ntoml==0.10.2\ntomli==2.0.1\ntornado==6.4\ntraitlets==5.14.1\ntypes-python-dateutil==2.8.19.20240106\ntyping_extensions==4.9.0\ntzdata==2023.4\nuri-template==1.3.0\nurllib3==2.1.0\nwcwidth==0.2.13\nwebcolors==1.13\nwebencodings==0.5.1\nwebsocket-client==1.7.0\nzipp==3.17.0",
"git": {
"branch": "master",
"commit": "581282ea698f7c031ec04fb17a199349fd6e04be",
"remote": "https://github.com/jla-gardner/digital-experiments.git"
}
},
"code": "@experiment\ndef square(x):\n return x * x\n"
}
}
By default, digital-experiments records how long the experiment took and the exact code that was run. The latter is particularly useful when we're rapidly iterating on an experiment's code and want to be able to reproduce the results of a previous run. Other useful information is also stored, such as the current git commit, details of the Python environment and information about the machine the experiment was run on. Together, this makes each result traceable to the code and environment that produced it.
Extra timing information can be added to this metadata by using the time_block context manager.
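The idea of timing a named block can be sketched with the standard library alone. The `timed_block` helper and the module-level `timing` dictionary below are hypothetical stand-ins: they imitate the shape of the `"timing"` metadata shown above and do not reproduce the library's own time_block, whose exact signature is not shown in this section.

```python
import time
from contextlib import contextmanager
from datetime import datetime
from typing import Dict, Iterator

# Hypothetical stand-in for the "timing" metadata of a single run.
timing: Dict[str, dict] = {}


@contextmanager
def timed_block(name: str) -> Iterator[None]:
    """Record start, end and duration of the enclosed code under `name`."""
    started_at = datetime.now()
    start = time.perf_counter()
    try:
        yield
    finally:
        # Store the entry even if the enclosed code raises.
        timing[name] = {
            "start": started_at.strftime("%Y-%m-%d %H:%M:%S"),
            "end": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            "duration": time.perf_counter() - start,
        }


with timed_block("setup"):
    time.sleep(0.01)

print(timing["setup"]["duration"])
```

Here `time.perf_counter` measures the duration because it is monotonic, while the wall-clock `datetime` values are kept only for display.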
Backends
By default, digital-experiments stores each observation in its own .pkl file located at ./experiments/<experiment_name>/<id>.pkl:
[6]:
!ls ./experiments/square
2024-02-24_11-42-11_694149.pkl 2024-02-24_11-42-12_065819.pkl
2024-02-24_11-42-12_043764.pkl 2024-02-24_11-42-12_075406.pkl
2024-02-24_11-42-12_052862.pkl
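The one-file-per-observation layout can be sketched with `pickle` and `pathlib`. This is a simplified illustration under the directory convention described above. The functions `save_observation` and `load_observations` are hypothetical and are not the library's backend interface, and plain dictionaries stand in for Observation objects.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, List


def save_observation(root: Path, experiment_name: str, obs_id: str, obs: Any) -> Path:
    """Write one observation to <root>/<experiment_name>/<obs_id>.pkl."""
    path = root / experiment_name / f"{obs_id}.pkl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obs, f)
    return path


def load_observations(root: Path, experiment_name: str) -> List[Any]:
    """Load every stored observation, ordered by file name (that is, by id)."""
    loaded = []
    for file in sorted((root / experiment_name).glob("*.pkl")):
        with file.open("rb") as f:
            loaded.append(pickle.load(f))
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "experiments"
    # Saved out of order on purpose; loading sorts by id.
    save_observation(root, "square", "2024-02-24_11-42-12_043764", {"config": {"x": 1}, "result": 1})
    save_observation(root, "square", "2024-02-24_11-42-11_694149", {"config": {"x": 0}, "result": 0})
    restored = load_observations(root, "square")

print([obs["result"] for obs in restored])
```

Keeping each observation in a separate file means a new run never rewrites earlier results, which is one plausible reason for this kind of layout.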
Other backends are available (see the complete list here), or you can implement your own.
You can also specify the root directory for a given experiment by passing the root argument to the [@experiment](api/core.rst#digital_experiments.experiment) decorator, or by setting the DE_ROOT environment variable:
[7]:
from pathlib import Path

@experiment(backend="json", root=Path("some/other/path"))
def cube(x):
    return x ** 3

cube(4)
!ls ./some/other/path
2024-02-24_11-42-12_539708.json
Artefacts
digital-experiments assigns and provides a unique directory on disk per run of an experiment. This can be accessed within an experiment using the current_dir function. Any files saved to this directory during the experiment are available post hoc via the artefacts function.
[8]:
from digital_experiments import current_dir

@experiment
def saving_experiment():
    (current_dir() / 'test.txt').write_text('hello world')

saving_experiment()
id = saving_experiment.observations()[0].id
saving_experiment.artefacts(id)
[8]:
[PosixPath('experiments/saving_experiment/storage/2024-02-24_11-42-12_687206/test.txt')]
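The path in the output above suggests a layout of `<root>/<experiment_name>/storage/<id>/`. As a rough sketch under that assumption, the following standard-library code creates such a per-run directory and lists the files inside it. Both `run_directory` and `list_artefacts` are hypothetical helpers written for illustration, not the library's current_dir or artefacts functions.

```python
import tempfile
from pathlib import Path
from typing import List


def run_directory(root: Path, experiment_name: str, run_id: str) -> Path:
    """Return (and create) the per-run directory under the assumed layout."""
    directory = root / experiment_name / "storage" / run_id
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def list_artefacts(root: Path, experiment_name: str, run_id: str) -> List[Path]:
    """List every file saved inside one run's directory."""
    directory = root / experiment_name / "storage" / run_id
    return sorted(p for p in directory.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "experiments"
    run_id = "2024-02-24_11-42-12_687206"
    # Save a file during the "run", then look it up afterwards by id.
    (run_directory(root, "saving_experiment", run_id) / "test.txt").write_text("hello world")
    found = list_artefacts(root, "saving_experiment", run_id)
    names = [p.name for p in found]
    contents = found[0].read_text()

print(names, contents)
```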