SPSA#
What is SPSA?#
Simultaneous Perturbation Stochastic Approximation (SPSA) is an algorithm developed by Spall. It can be used if noisy and unbiased measurements of the gradient \(g(\boldsymbol{\theta}\)) are available. It can also be used if only (noisy) measurements of the loss function \(f(\boldsymbol{\theta})\) are available.
The advantage of SPSA compared to other algorithms is that only two loss measurements are required to generate an update. Therefore, SPSA is scalable.
How does SPSA work?#
Let \(\eta_i \in (0, \infty)\) be a perturbation vector and \(\Delta_i\) be a random vector such that \(\{\Delta_i\}\) is an iid sequence with \(\Delta_i(k)\) and \(1/\Delta_i(k)\) bounded and symmetric around zero. The components \(\Delta_i(k)\) are mutually independent. In practice, often the following binary random variable is used for \(\Delta_i\):
for all \(i\) and \(j\).
The SPSA algorithm is then as follows:
Binder#
If you want to experiment with the SPSA algorithm, you can use the provided Jupyter Notebook. If you want to experiment with the N-dimensional version of the algorithm, then you can use this Jupyter Notebook.
You can also run the notebooks directly in your browser by using Binder; simply click on the following button to open the SPSA notebook: