SGDMomentum

phasic.svgd.SGDMomentum(learning_rate=0.01, momentum=0.9, max_velocity=1.0)

SGD with momentum optimizer for SVGD.

Momentum accelerates updates along directions where successive gradients agree and dampens oscillations where they alternate in sign. It does so by accumulating a velocity vector, an exponentially weighted sum of past gradients.

Parameters

learning_rate : float or StepSizeSchedule = 0.01

Step size for parameter updates. Can be a schedule for learning rate decay.

momentum : float or StepSizeSchedule = 0.9

Momentum coefficient. Higher values give more weight to past gradients. Typical values: 0.9 (standard), 0.99 (high momentum). Can be a schedule.

max_velocity : float or None = 1.0

Maximum absolute velocity; clipping at this value prevents unbounded accumulation. Set to None to disable velocity clipping (not recommended with positive_params=True, since unclipped velocities can cause numerical issues).

Attributes

v : array or None

Velocity (accumulated gradient), shape (n_particles, theta_dim)

Examples

>>> from phasic import SVGD, SGDMomentum
>>>
>>> optimizer = SGDMomentum(learning_rate=0.01, momentum=0.9)
>>> svgd = SVGD(
...     model=model,
...     observed_data=observations,
...     theta_dim=2,
...     optimizer=optimizer
... )

Notes

Update rule: v = momentum * v + lr * gradient; params += v, where lr is learning_rate and gradient is the SVGD direction phi passed to step.

Velocity is clipped to [-max_velocity, max_velocity] to prevent unbounded growth, which can cause numerical issues when using positive_params=True.
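The update rule and clipping described above can be sketched in NumPy. This is an illustrative re-implementation, not phasic's source: the function name `momentum_update` is hypothetical, and clipping the velocity before it is added to the parameters is an assumption, since the order of operations is not stated here.

```python
import numpy as np

def momentum_update(v, gradient, lr=0.01, momentum=0.9, max_velocity=1.0):
    """One momentum step (hypothetical helper): returns the new velocity."""
    v = momentum * v + lr * gradient
    if max_velocity is not None:
        # Assumed placement: clip before the velocity is applied to the parameters.
        v = np.clip(v, -max_velocity, max_velocity)
    return v

params = np.zeros((3, 2))                # (n_particles, theta_dim)
v = np.zeros_like(params)
gradient = np.full_like(params, 500.0)   # deliberately large, constant gradient

for _ in range(200):
    v = momentum_update(v, gradient)
    params += v                          # params += v, as in the update rule
```

With a constant gradient g, the unclipped velocity approaches lr * g / (1 - momentum), so with the defaults a gradient of 500 would settle near 50 per step. In this sketch, clipping holds every velocity component at 1.0 instead.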

Methods

Name    Description
reset   Reset optimizer state for given particle shape.
step    Compute SGD with momentum update.

reset

phasic.svgd.SGDMomentum.reset(shape)

Reset optimizer state for the given particle shape, (n_particles, theta_dim).

step

phasic.svgd.SGDMomentum.step(phi, particles=None)

Compute SGD with momentum update.

Parameters

phi : array(n_particles, theta_dim)

SVGD gradient direction.

particles : array(n_particles, theta_dim) = None

Current particle positions. Not used by SGDMomentum.

Returns

update : array(n_particles, theta_dim)

Scaled update to add to particles.
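The calling pattern implied by reset and step can be sketched with a stand-in class. `MomentumSketch` below is hypothetical and is not phasic's class: it only mirrors the documented signatures, uses fixed (non-scheduled) hyperparameters, and assumes the returned update is the clipped velocity. The direction `phi` is a toy pull toward a fixed target, standing in for the SVGD gradient.

```python
import numpy as np

class MomentumSketch:
    """Hypothetical stand-in mirroring the documented reset/step interface."""

    def __init__(self, learning_rate=0.01, momentum=0.9, max_velocity=1.0):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.max_velocity = max_velocity
        self.v = None  # velocity, allocated by reset()

    def reset(self, shape):
        # Start from zero velocity for particles of the given shape.
        self.v = np.zeros(shape)

    def step(self, phi, particles=None):
        # `particles` is accepted to match the interface but is not used.
        self.v = self.momentum * self.v + self.learning_rate * phi
        if self.max_velocity is not None:
            self.v = np.clip(self.v, -self.max_velocity, self.max_velocity)
        return self.v

rng = np.random.default_rng(0)
particles = rng.normal(size=(4, 2))      # (n_particles, theta_dim)
target = np.array([1.0, -2.0])           # toy optimum, for illustration only

opt = MomentumSketch()
opt.reset(particles.shape)
start_dist = np.linalg.norm(particles - target)

for _ in range(500):
    phi = target - particles             # toy direction in place of the SVGD gradient
    particles = particles + opt.step(phi, particles)

end_dist = np.linalg.norm(particles - target)
```

The caller adds the returned update to the particles; the optimizer itself only tracks the velocity between calls.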