SGDMomentum
phasic.svgd.SGDMomentum(learning_rate=0.01, momentum=0.9, max_velocity=1.0)
SGD with momentum optimizer for SVGD.
Momentum accelerates updates along directions where successive gradients agree and dampens oscillations where they alternate. It does this by accumulating a velocity vector, a decaying sum of past gradients.
Parameters
learning_rate : float or StepSizeSchedule = 0.01
Step size for parameter updates. Can be a schedule for learning rate decay.
momentum : float or StepSizeSchedule = 0.9
Momentum coefficient. Higher values give more weight to past gradients. Typical values: 0.9 (standard), 0.99 (high momentum). Can be a schedule.
max_velocity : float or None = 1.0
Maximum absolute velocity to prevent unbounded accumulation. Set to None to disable velocity clipping (not recommended with positive_params=True as it can cause numerical issues).
Attributes
v : array or None
Velocity (accumulated gradient), shape (n_particles, theta_dim)
Examples
>>> from phasic import SVGD, SGDMomentum
>>>
>>> optimizer = SGDMomentum(learning_rate=0.01, momentum=0.9)
>>> svgd = SVGD(
... model=model,
... observed_data=observations,
... theta_dim=2,
... optimizer=optimizer
... )
Notes
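Update rule: v = momentum * v + lr * gradient; params += v, where lr is learning_rate, gradient is the SVGD direction phi passed to step, and params are the particles.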
Velocity is clipped to [-max_velocity, max_velocity] to prevent unbounded growth, which can cause numerical issues when using positive_params=True.
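The update rule and clipping described above can be sketched in plain NumPy. This is an illustrative sketch, not phasic's implementation: the function name momentum_update is hypothetical, and clipping the velocity right after the momentum update (before it is applied) is an assumption about the ordering.

```python
import numpy as np

def momentum_update(v, phi, learning_rate=0.01, momentum=0.9, max_velocity=1.0):
    """Hypothetical helper: one velocity update following the documented rule."""
    # v = momentum * v + lr * gradient
    v_new = momentum * v + learning_rate * phi
    # Assumed ordering: clip each velocity component to [-max_velocity, max_velocity]
    if max_velocity is not None:
        v_new = np.clip(v_new, -max_velocity, max_velocity)
    return v_new

v = np.zeros((3, 2))             # (n_particles, theta_dim), no history yet
phi = np.full((3, 2), 500.0)     # deliberately large gradient direction

v_clipped = momentum_update(v, phi)                       # 0.01 * 500 = 5.0, clipped to 1.0
v_unclipped = momentum_update(v, phi, max_velocity=None)  # stays at 5.0
```

With the default max_velocity=1.0, a single large gradient cannot move a particle by more than 1.0 per coordinate in one step, which is the safeguard the note refers to.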
Methods
| Name | Description |
|---|---|
| reset | Reset optimizer state for given particle shape. |
| step | Compute SGD with momentum update. |
reset
phasic.svgd.SGDMomentum.reset(shape)
Reset optimizer state for given particle shape.
step
phasic.svgd.SGDMomentum.step(phi, particles=None)
Compute SGD with momentum update.
Parameters
phi : array (n_particles, theta_dim)
SVGD gradient direction.
particles : array (n_particles, theta_dim) = None
Current particle positions. Not used by SGDMomentum.
Returns
update : array (n_particles, theta_dim)
Scaled update to add to particles.
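To show how reset and step might fit together in a loop, here is a minimal stand-in class with the same method signatures. MomentumSketch is hypothetical and is not phasic's code; the zero-initialised velocity in reset and the toy direction phi are assumptions made only so the example runs on its own.

```python
import numpy as np

class MomentumSketch:
    """Hypothetical stand-in mirroring the reset/step interface documented above."""

    def __init__(self, learning_rate=0.01, momentum=0.9, max_velocity=1.0):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.max_velocity = max_velocity
        self.v = None  # no state until reset() is called

    def reset(self, shape):
        # Assumption: resetting means starting from zero velocity
        self.v = np.zeros(shape)

    def step(self, phi, particles=None):
        # particles is accepted for interface compatibility but unused
        self.v = self.momentum * self.v + self.learning_rate * phi
        if self.max_velocity is not None:
            self.v = np.clip(self.v, -self.max_velocity, self.max_velocity)
        return self.v.copy()  # update to add to the particles

opt = MomentumSketch()
particles = np.zeros((4, 2))       # (n_particles, theta_dim)
opt.reset(particles.shape)

for _ in range(3):
    phi = 1.0 - particles          # toy direction toward 1.0, standing in for the SVGD direction
    particles = particles + opt.step(phi)
```

The caller adds the returned update to the particles; the optimizer itself only tracks the velocity.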