RMSprop
`phasic.svgd.RMSprop(learning_rate=0.001, decay=0.99, epsilon=1e-08)`

RMSprop optimizer for SVGD.
RMSprop adapts the learning rate for each parameter by dividing by an exponentially decaying average of squared gradients. This helps with non-stationary objectives and noisy gradients.
Parameters
learning_rate : float or StepSizeSchedule = 0.001
    Base learning rate. Can be a schedule for learning rate decay.
decay : float or StepSizeSchedule = 0.99
    Decay rate for the moving average of squared gradients. Higher values give longer memory. Can be a schedule.
epsilon : float = 1e-8
    Small constant for numerical stability.
Attributes
v : array or None
    Moving average of squared gradients, shape (n_particles, theta_dim).
Examples
>>> from phasic import SVGD, RMSprop
>>>
>>> optimizer = RMSprop(learning_rate=0.001, decay=0.99)
>>> svgd = SVGD(
... model=model,
... observed_data=observations,
... theta_dim=2,
... optimizer=optimizer
... )

References
Hinton, G. (2012). Lecture 6.5 - RMSprop. Coursera: Neural Networks for Machine Learning.
Notes
Update rule:

    v = decay * v + (1 - decay) * gradient²
    params += lr * gradient / (√v + ε)
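The update rule can be sketched in plain NumPy (a minimal illustration of the math above, not phasic's implementation; the function name is hypothetical):

```python
import numpy as np

def rmsprop_update(v, grad, lr=0.001, decay=0.99, eps=1e-8):
    """One RMSprop step: returns the new moving average and the scaled update."""
    # Exponentially decaying average of squared gradients.
    v = decay * v + (1 - decay) * grad**2
    # Per-parameter scaled step; in SVGD this is added to the particles.
    update = lr * grad / (np.sqrt(v) + eps)
    return v, update

# Toy check: 5 particles in 2 dimensions, constant unit gradient.
v = np.zeros((5, 2))
grad = np.ones((5, 2))
v, update = rmsprop_update(v, grad)
# After one step v = 0.01, so each update entry is 0.001 / (0.1 + 1e-8).
```

Because the update is divided elementwise by √v, parameters with consistently large gradients take proportionally smaller steps, which is what makes the effective learning rate adaptive per parameter.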
Methods
| Name | Description |
|---|---|
| reset | Reset optimizer state for given particle shape. |
| step | Compute RMSprop update. |
reset
`phasic.svgd.RMSprop.reset(shape)`

Reset optimizer state for given particle shape.
step
`phasic.svgd.RMSprop.step(phi, particles=None)`

Compute RMSprop update.
Parameters
phi : array (n_particles, theta_dim)
    SVGD gradient direction.
particles : array (n_particles, theta_dim) = None
    Current particle positions. Not used by RMSprop.
Returns
update : array (n_particles, theta_dim)
    Scaled update to add to particles.
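The reset/step interface can be mirrored in a self-contained sketch (pure NumPy; the lazy state initialization is an assumption for illustration, and the class name is hypothetical, not phasic's code):

```python
import numpy as np

class RMSpropSketch:
    """Minimal RMSprop mirroring the documented reset/step interface."""

    def __init__(self, learning_rate=0.001, decay=0.99, epsilon=1e-8):
        self.learning_rate = learning_rate
        self.decay = decay
        self.epsilon = epsilon
        self.v = None  # moving average of squared gradients

    def reset(self, shape):
        # Fresh state for a given (n_particles, theta_dim) shape.
        self.v = np.zeros(shape)

    def step(self, phi, particles=None):
        # Assumption: initialize state on first use if reset was not called.
        if self.v is None or self.v.shape != phi.shape:
            self.reset(phi.shape)
        self.v = self.decay * self.v + (1 - self.decay) * phi**2
        # Scaled update to add to the particles; `particles` is unused.
        return self.learning_rate * phi / (np.sqrt(self.v) + self.epsilon)

opt = RMSpropSketch()
phi = np.ones((4, 2))
update = opt.step(phi)  # each entry ≈ 0.001 / (0.1 + 1e-8)
```

Returning the update rather than mutating the particles lets the SVGD loop decide how to apply it, which is why `step` takes the (unused) `particles` argument for interface compatibility with optimizers that do need positions.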