StepLR decays the learning rate by a fixed multiplicative factor (gamma) at regular intervals. The decay happens every N epochs, or every N eval periods when iteration-based training is used. The user sets N, the step size.
The learning rate is decayed once every N epochs; this N is the step size.
Gamma is the multiplicative factor by which the learning rate is decayed at each step.
Suppose the step size is 30, gamma is 0.1, and the base learning rate is 0.05. Then:
for €€0 \le epoch < 30€€, €€lr = 0.05€€
for €€30 \le epoch < 60€€, €€lr = 0.05 \cdot 0.1 = 0.005€€
for €€60 \le epoch < 90€€, €€lr = 0.05 \cdot 0.1^2 = 0.0005€€
... and so on.
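The schedule above follows a closed form: the learning rate at a given epoch is the base learning rate multiplied by gamma once for every completed step, that is, gamma raised to the power `epoch // step_size`. A minimal sketch in plain Python, where `step_lr` is a hypothetical helper written for illustration and is not part of any library:

```python
def step_lr(base_lr: float, gamma: float, step_size: int, epoch: int) -> float:
    """Learning rate at `epoch` under a step decay schedule (illustrative helper)."""
    if step_size <= 0:
        raise ValueError("step_size must be a positive integer")
    # Number of decays applied so far: one per completed block of step_size epochs.
    num_decays = epoch // step_size
    return base_lr * gamma ** num_decays


# Reproduce the worked example: step size 30, gamma 0.1, base learning rate 0.05.
for epoch in (0, 29, 30, 59, 60, 89):
    print(epoch, step_lr(0.05, 0.1, 30, epoch))
```

Because the result is computed in floating point, printed values may differ from 0.005 or 0.0005 in the last few digits.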
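A usage example in PyTorch:

    import torch

    # Placeholder model, loss, and data so that the loop runs end to end;
    # substitute your own.
    model = torch.nn.Linear(2, 2)
    loss_fn = torch.nn.MSELoss()
    dataset = [(torch.randn(4, 2), torch.randn(4, 2)) for _ in range(8)]
    learning_rate = 0.05  # base learning rate

    optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.01, amsgrad=False)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1, last_epoch=-1)

    # With step_size=30, a 20-epoch run ends before the first decay;
    # train for more than 30 epochs to see the learning rate drop.
    for epoch in range(20):
        for input, target in dataset:
            optimizer.zero_grad()
            output = model(input)
            loss = loss_fn(output, target)
            loss.backward()
            optimizer.step()
        # Advance the schedule once per epoch, after the optimizer updates.
        scheduler.step()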
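To check that the scheduler produces the values from the worked example, the current learning rate can be read back with `get_last_lr()`. A small sketch, assuming a single dummy parameter and plain SGD stand in for a real model and optimizer (both are placeholders chosen for brevity, not part of the example above):

```python
import torch

# Placeholder parameter and optimizer; only the learning rate schedule matters here.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.05)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

observed = {}
for epoch in range(90):
    # Learning rate in effect during this epoch.
    observed[epoch] = scheduler.get_last_lr()[0]
    optimizer.step()   # no gradients are set, so the parameter is left unchanged
    scheduler.step()   # advance the schedule by one epoch

for epoch in (0, 29, 30, 59, 60, 89):
    print(epoch, observed[epoch])
```

The printed values should match the three ranges in the example, up to floating-point rounding.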