This scheduling technique multiplies the learning rate by a factor "gamma" every epoch (or every eval period when the iteration trainer is used).
When last_epoch is -1 (the default), the schedule starts from the initial Base Learning Rate.
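Gamma is the multiplicative factor by which the learning rate is decayed every epoch.
The value of gamma should be greater than 0 and less than 1 in order to reduce the learning rate; a value of 1 leaves it unchanged, and a value above 1 increases it.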
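As a rough illustration of the decay rule, assuming the factor is applied once per epoch starting from the base learning rate, the learning rate after t epochs is base_lr * gamma ** t. The helper name exponential_lr and the values 0.1 and 0.5 below are hypothetical, chosen only for the sketch.

```python
def exponential_lr(base_lr: float, gamma: float, epoch: int) -> float:
    """Learning rate after `epoch` decay steps (hypothetical helper, not a library API)."""
    return base_lr * gamma ** epoch


# Illustrative values only: base_lr=0.1 and gamma=0.5 are assumptions.
schedule = [exponential_lr(0.1, 0.5, e) for e in range(4)]
print(schedule)
```

With gamma = 0.5 the learning rate halves each epoch (0.1, 0.05, 0.025, 0.0125), so small values of gamma shrink it very quickly.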
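    import torch

    # Placeholder model, loss, data, and base learning rate (assumed values)
    # so that the example runs end to end.
    model = torch.nn.Linear(2, 2)
    loss_fn = torch.nn.MSELoss()
    dataset = [(torch.randn(2), torch.randn(2)) for _ in range(4)]
    learning_rate = 0.01

    optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.01, amsgrad=False)
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.1, last_epoch=-1)

    for epoch in range(20):
        for input, target in dataset:
            optimizer.zero_grad()
            output = model(input)
            loss = loss_fn(output, target)
            loss.backward()
            optimizer.step()
        # Decay the learning rate once per epoch, after the optimizer updates.
        scheduler.step()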