{
"cells": [
{
"cell_type": "markdown",
"id": "aad8e99a",
"metadata": {},
"source": [
"# Low-rank approximations\n",
"\n",
"Assume we have a training set composed of $N$ data points.\n",
"Fitting a (full-rank) GPR model requires:\n",
"- $\\mathcal{O}(N^2)$ memory (for storing the $N\\times N$ kernel matrix)\n",
"- $\\mathcal{O}(N^3)$ time (dominated by inverting above matrix)\n",
"\n",
"Further, assume we want to make predictions at $T$ test points. Making predictions requires:\n",
"- $\\mathcal{O}(NT)$ time and memory to compute the predictive mean (dominated by calculating the $N\\times T$ matrix of kernel evaluations)\n",
"- $\\mathcal{O}(N^2T)$ time to compute the predictive (diagonal) variance\n",
"\n",
"This means that full-rank GPR models become prohibitively expensive for large datasets.\n",
"\n",
"A popular approach to reducing this cost is to approximate the $N\\times N$ kernel matrix taken over the training data with a lower-rank matrix. To do this, a set of $M$ \"sparse\" or \"inducing\" points are chosen at representative locations in the training data. This slightly reduces the accuracy of the model, but can dramatically reduce the cost of fitting and inference when $M \\ll N$:\n",
"\n",
"- $\\mathcal{O}(NM^2 + M^3)$ memory to store the $M\\times M$ kernel matrix over the inducing points\n",
"- $\\mathcal{O}(N^2M + M^3)$ time to fit the model\n",
"- $\\mathcal{O}(TM)$ time to make mean predictions\n",
"- $\\mathcal{O}(TM^2)$ space to make predictive variance predictions\n",
"\n",
"`mini-gpr` implements the [Subset of Regressors](https://uk.mathworks.com/help/stats/subset-of-regressors-approximation-for-gpr-models.html) (`SoR`) approach: this model conforms to the same interface as the full-rank `GPR` model, and so is very easy to use as a drop-in replacement.\n",
"\n",
"Below, we show examples of optimising both GPR and SoR models for a synthetic 1D system:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "5ae1498d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'sigma': 2.2437003072015034, 'scale': 1.4595497350774491}\n"
]
},
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from mini_gpr.kernels import RBF\n",
"from mini_gpr.models import GPR\n",
"from mini_gpr.opt import maximise_log_likelihood, optimise_model\n",
"from mini_gpr.tutorials import sample_toy_1d_system\n",
"from mini_gpr.viz import show_model_predictions\n",
"\n",
"x_train, y_train = sample_toy_1d_system()\n",
"full_rank_model = optimise_model(\n",
" GPR(RBF(), noise=0.1),\n",
" maximise_log_likelihood,\n",
" x_train,\n",
" y_train,\n",
" optimise_noise=True,\n",
")\n",
"show_model_predictions(full_rank_model, x_train, y_train)\n",
"print(full_rank_model.kernel.params)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "fbf06b3a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'sigma': 2.5377083633879733, 'scale': 2.2140375820386287}\n"
]
},
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import numpy as np\n",
"\n",
"from mini_gpr.models import SoR\n",
"\n",
"model = SoR(\n",
" kernel=RBF(),\n",
" # use 5 inducing points, equally along the range of the training data\n",
" sparse_points=np.linspace(1, 9, num=5),\n",
" noise=0.1,\n",
")\n",
"low_rank_model = optimise_model(\n",
" model,\n",
" maximise_log_likelihood,\n",
" x_train,\n",
" y_train,\n",
" optimise_noise=True,\n",
")\n",
"show_model_predictions(low_rank_model, x_train, y_train)\n",
"print(low_rank_model.kernel.params)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}