RMSprop

phasic.svgd.RMSprop(learning_rate=0.001, decay=0.99, epsilon=1e-08)

RMSprop optimizer for SVGD.

RMSprop adapts the learning rate for each parameter by dividing each gradient component by the square root of an exponentially decaying average of squared gradients. This helps with non-stationary objectives and noisy gradients.

Parameters

learning_rate : float or StepSizeSchedule = 0.001

Base learning rate. Can be a schedule for learning rate decay.

decay : float or StepSizeSchedule = 0.99

Decay rate for the moving average of squared gradients. Higher values give longer memory. Can be a schedule.

epsilon : float = 1e-8

Small constant for numerical stability.
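To build intuition for the decay parameter: the moving average weights the squared gradient from k steps ago by roughly decay**k * (1 - decay), so the effective averaging window is about 1 / (1 - decay) steps. A quick check (plain Python, independent of phasic):

```python
# Effective memory window of the exponential moving average:
# decay = 0.9 remembers roughly the last 10 steps,
# decay = 0.99 roughly the last 100 steps.
windows = {decay: 1.0 / (1.0 - decay) for decay in (0.9, 0.99)}
print(windows)
```

This is why higher decay values give longer memory, at the cost of adapting more slowly to changes in gradient magnitude.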

Attributes

v : array or None

Moving average of squared gradients, shape (n_particles, theta_dim).

Examples

>>> from phasic import SVGD, RMSprop
>>>
>>> optimizer = RMSprop(learning_rate=0.001, decay=0.99)
>>> svgd = SVGD(
...     model=model,
...     observed_data=observations,
...     theta_dim=2,
...     optimizer=optimizer
... )

References

Hinton, G. (2012). Lecture 6.5 - RMSprop. Coursera: Neural Networks for Machine Learning.

Notes

Update rule, where phi is the SVGD gradient direction:

v = decay * v + (1 - decay) * phi²
params += learning_rate * phi / (√v + ε)

Unlike standard gradient-descent RMSprop, the scaled update is added to the particles, since phi is an ascent direction.
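The update rule above can be sketched in NumPy as a single stateless step (a minimal stand-in for illustration; `rmsprop_step` is a hypothetical helper, not part of phasic):

```python
import numpy as np

def rmsprop_step(v, phi, lr=0.001, decay=0.99, eps=1e-8):
    # Exponentially decaying average of squared SVGD gradients.
    v = decay * v + (1 - decay) * phi**2
    # Per-parameter scaled update, to be ADDED to the particles.
    update = lr * phi / (np.sqrt(v) + eps)
    return update, v

phi = np.array([[0.5, -2.0]])   # SVGD direction, shape (n_particles, theta_dim)
v = np.zeros_like(phi)          # state right after reset()
update, v = rmsprop_step(v, phi)
```

Note that immediately after a reset (v = 0), the update magnitude is roughly lr / sqrt(1 - decay) regardless of the gradient scale, since v is dominated by the current squared gradient.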

Methods

Name Description
reset Reset optimizer state for given particle shape.
step Compute RMSprop update.

reset

phasic.svgd.RMSprop.reset(shape)

Reset optimizer state for given particle shape.

step

phasic.svgd.RMSprop.step(phi, particles=None)

Compute RMSprop update.

Parameters

phi : array(n_particles, theta_dim)

SVGD gradient direction.

particles : array(n_particles, theta_dim) = None

Current particle positions. Accepted for interface compatibility; not used by RMSprop.

Returns

update : array(n_particles, theta_dim)

Scaled update to add to particles.
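The reset/step call pattern documented above can be illustrated with a self-contained NumPy stand-in (a sketch mirroring the documented interface, not the phasic implementation):

```python
import numpy as np

class RMSpropSketch:
    """Stand-in with the documented reset()/step() interface."""

    def __init__(self, learning_rate=0.001, decay=0.99, epsilon=1e-8):
        self.lr, self.decay, self.eps = learning_rate, decay, epsilon
        self.v = None  # moving average of squared gradients

    def reset(self, shape):
        # Zero the state for a new particle configuration.
        self.v = np.zeros(shape)

    def step(self, phi, particles=None):
        # particles is accepted for interface parity but unused.
        self.v = self.decay * self.v + (1 - self.decay) * phi**2
        return self.lr * phi / (np.sqrt(self.v) + self.eps)

opt = RMSpropSketch()
opt.reset((4, 2))                 # 4 particles, theta_dim = 2
particles = np.zeros((4, 2))
phi = np.ones((4, 2))             # stand-in SVGD direction
particles += opt.step(phi)        # the returned update is added to particles
```

In an SVGD loop the driver would recompute phi after each step and repeat; the optimizer only scales the direction, it does not move the particles itself.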