If you have ever worked on a Computer Vision project, you might know that using a learning rate scheduler might significantly increase your model training performance. On this page, we will:
Let’s jump in.
CosineAnnealingLR is a scheduling technique that starts with a very large learning rate and then aggressively decreases it to a value near 0 before increasing the learning rate again.
Each time the “restart” occurs, we take the good weights from the previous “cycle” as the starting point. Thus, with each restart, the algorithm approaches the minimal loss closer.
Below is the formula for the learning rate at each step:
In this formula:
Hello, thank you for using the code provided by Hasty. Please note that some code blocks might not be 100% complete and ready to be run as is. This is done intentionally as we focus on implementing only the most challenging parts that might be tough to pick up from scratch. View our code block as a LEGO block - you can’t use it as a standalone solution, but you can take it and add to your system to complement it. If you have questions about using the tool, please get in touch with us to get direct help from the Hasty team.
import torch
model = [Parameter(torch.randn(2, 2, requires_grad=True))]
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.01, amsgrad=False)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000, eta_min=0, last_epoch=-1, verbose=False)
for epoch in range(20):
for input, target in dataset:
optimizer.zero_grad()
output = model(input)
loss = loss_fn(output, target)
loss.backward()
optimizer.step()
scheduler.step()
Automate 90% of the work, reduce your time to deployment by 40%, and replace your whole ML software stack with our platform.
Start for free Request a demo