torchreid.optim

Optimizer

torchreid.optim.optimizer.build_optimizer(model, optim='adam', lr=0.0003, weight_decay=0.0005, momentum=0.9, sgd_dampening=0, sgd_nesterov=False, rmsprop_alpha=0.99, adam_beta1=0.9, adam_beta2=0.99, staged_lr=False, new_layers='', base_lr_mult=0.1)[source]

A function wrapper for building an optimizer.

Parameters
  • model (nn.Module) – model.

  • optim (str, optional) – optimizer. Default is “adam”.

  • lr (float, optional) – learning rate. Default is 0.0003.

  • weight_decay (float, optional) – weight decay (L2 penalty). Default is 5e-04.

  • momentum (float, optional) – momentum factor in sgd. Default is 0.9.

  • sgd_dampening (float, optional) – dampening for momentum. Default is 0.

  • sgd_nesterov (bool, optional) – enables Nesterov momentum. Default is False.

  • rmsprop_alpha (float, optional) – smoothing constant for rmsprop. Default is 0.99.

  • adam_beta1 (float, optional) – beta-1 value in adam. Default is 0.9.

  • adam_beta2 (float, optional) – beta-2 value in adam. Default is 0.99.

  • staged_lr (bool, optional) – uses different learning rates for base and new layers. Base layers are pretrained layers, while new layers are randomly initialized, e.g. the identity classification layer. Enabling staged_lr allows the base layers to be trained with a smaller learning rate determined by base_lr_mult, while the new layers use lr. Default is False.

  • new_layers (str or list) – attribute names in model that refer to the new (randomly initialized) layers. Default is empty.

  • base_lr_mult (float, optional) – learning rate multiplier for base layers. Default is 0.1.

Examples::
>>> # A normal optimizer can be built by
>>> optimizer = torchreid.optim.build_optimizer(model, optim='sgd', lr=0.01)
>>> # If you want to use a smaller learning rate for pretrained layers
>>> # and the attribute name for the randomly initialized layer is 'classifier',
>>> # you can do
>>> optimizer = torchreid.optim.build_optimizer(
>>>     model, optim='sgd', lr=0.01, staged_lr=True,
>>>     new_layers='classifier', base_lr_mult=0.1
>>> )
>>> # Now the `classifier` has learning rate 0.01 but the base layers
>>> # have learning rate 0.01 * 0.1.
>>> # new_layers can also take multiple attribute names. If the new layers
>>> # are 'fc' and 'classifier', you can do
>>> optimizer = torchreid.optim.build_optimizer(
>>>     model, optim='sgd', lr=0.01, staged_lr=True,
>>>     new_layers=['fc', 'classifier'], base_lr_mult=0.1
>>> )
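
The optimizer returned above is a standard torch.optim optimizer, so the effect of staged_lr can be checked through its param_groups attribute; a minimal sketch (the exact number and ordering of the groups is an assumption and may differ):
>>> # With staged_lr enabled there are typically two parameter groups:
>>> # one for the base layers (lr * base_lr_mult) and one for the new layers (lr).
>>> for group in optimizer.param_groups:
>>>     print(len(group['params']), group['lr'])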

LR Scheduler

torchreid.optim.lr_scheduler.build_lr_scheduler(optimizer, lr_scheduler='single_step', stepsize=1, gamma=0.1, max_epoch=1)[source]

A function wrapper for building a learning rate scheduler.

Parameters
  • optimizer (Optimizer) – an Optimizer.

  • lr_scheduler (str, optional) – learning rate scheduler method. Default is single_step.

  • stepsize (int or list, optional) – step size to decay learning rate. When lr_scheduler is “single_step”, stepsize should be an integer. When lr_scheduler is “multi_step”, stepsize is a list. Default is 1.

  • gamma (float, optional) – decay rate. Default is 0.1.

  • max_epoch (int, optional) – maximum epoch (for cosine annealing). Default is 1.

Examples::
>>> # Decay learning rate every 20 epochs.
>>> scheduler = torchreid.optim.build_lr_scheduler(
>>>     optimizer, lr_scheduler='single_step', stepsize=20
>>> )
>>> # Decay learning rate at 30, 50 and 55 epochs.
>>> scheduler = torchreid.optim.build_lr_scheduler(
>>>     optimizer, lr_scheduler='multi_step', stepsize=[30, 50, 55]
>>> )
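
The returned object follows the standard torch.optim.lr_scheduler interface, so it is stepped once per epoch alongside the optimizer. A minimal sketch of the usual loop, assuming the cosine annealing schedule is selected with lr_scheduler='cosine' (with max_epoch as its period) and that train_one_epoch is a hypothetical user-defined training routine:
>>> scheduler = torchreid.optim.build_lr_scheduler(
>>>     optimizer, lr_scheduler='cosine', max_epoch=60
>>> )
>>> for epoch in range(60):
>>>     train_one_epoch(model, optimizer)  # hypothetical training function
>>>     scheduler.step()  # update the learning rate at the end of each epoch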