WarmupExpStepSize

phasic.svgd.WarmupExpStepSize(
    peak_lr=0.001,
    warmup_steps=100,
    last_lr=1e-06,
    tau=1000.0,
)

Linear warmup followed by exponential decay.

Useful with the Adam optimizer, where the learning rate should ramp up before decaying. Warmup prevents large updates early in training, when the moment estimates are still poorly calibrated.
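A minimal sketch of what this schedule plausibly computes: a linear ramp from peak_lr / warmup_steps up to peak_lr, then exponential relaxation toward last_lr with time constant tau. The exact formulas are an assumption (only the warmup/decay behavior and endpoints are documented above), so values may differ slightly from the library's.

import math

def warmup_exp_step_size(i, peak_lr=0.001, warmup_steps=100,
                         last_lr=1e-06, tau=1000.0):
    # Assumed reconstruction, not the phasic source.
    if i < warmup_steps:
        # Linear warmup; (i + 1) / warmup_steps keeps iteration 0 nonzero.
        return peak_lr * (i + 1) / warmup_steps
    # Exponential decay from peak_lr toward last_lr with time constant tau.
    return last_lr + (peak_lr - last_lr) * math.exp(-(i - warmup_steps) / tau)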

Parameters

peak_lr : float = 0.001

Maximum learning rate reached at end of warmup

warmup_steps : int = 100

Number of iterations for linear warmup

last_lr : float = 1e-6

Final learning rate as iteration → ∞

tau : float = 1000.0

Decay time constant after warmup (larger = slower decay)

Examples

>>> schedule = WarmupExpStepSize(peak_lr=0.01, warmup_steps=100, last_lr=0.001, tau=500)
>>> schedule(0)      # iteration 0: start of warmup
0.0001
>>> schedule(50)     # iteration 50: halfway through warmup
0.0051
>>> schedule(100)    # iteration 100: end of warmup, peak lr
0.01
>>> schedule(600)    # iteration 600: decaying after warmup
0.0046
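
A hedged usage sketch: feeding the scheduled step size into a plain gradient loop on a toy quadratic. The loop and loss are illustrative placeholders, not the phasic SVGD API.

from phasic.svgd import WarmupExpStepSize
import numpy as np

schedule = WarmupExpStepSize(peak_lr=0.01, warmup_steps=100, last_lr=0.001, tau=500)

theta = np.zeros(2)                # toy parameters
for i in range(500):
    lr = schedule(i)               # scheduled step size for this iteration
    grad = 2.0 * (theta - 0.5)     # gradient of the toy loss sum((theta - 0.5)**2)
    theta -= lr * grad             # small steps early, peak after warmup, decaying late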