Compute gradients for custom training loops using automatic differentiation

Use `dlgradient` to compute derivatives using automatic differentiation for custom training loops.

**Tip**

For most deep learning tasks, you can use a pretrained network and adapt it to your own data. For an example showing how to use transfer learning to retrain a convolutional neural network to classify a new set of images, see Train Deep Learning Network to Classify New Images. Alternatively, you can create and train networks from scratch using `layerGraph` objects with the `trainNetwork` and `trainingOptions` functions.

If the `trainingOptions` function does not provide the training options that you need for your task, then you can create a custom training loop using automatic differentiation. To learn more, see Define Deep Learning Network for Custom Training Loops.

`[dydx1,...,dydxk] = dlgradient(y,x1,...,xk)` returns the gradients of `y` with respect to the variables `x1` through `xk`.

Call `dlgradient` from inside a function passed to `dlfeval`. See Compute Gradient Using Automatic Differentiation and Use Automatic Differentiation In Deep Learning Toolbox.
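As a minimal sketch of this calling pattern, the following script evaluates Rosenbrock's function and its gradient at one point. The function name `rosenbrock` and the evaluation point are illustrative choices for this sketch, not requirements of the API. When the code is saved as a script, the local function goes at the end of the file.

```matlab
% Convert the evaluation point to a dlarray so that dlfeval can trace it.
x0 = dlarray([-1, 2]);

% dlfeval evaluates the function with tracing enabled, which allows the
% dlgradient call inside it to return numeric gradient values.
[y, dydx] = dlfeval(@rosenbrock, x0);

% Illustrative objective (Rosenbrock's function). The name is an assumption
% made for this sketch.
function [y, dydx] = rosenbrock(x)
    y = 100*(x(2) - x(1).^2).^2 + (1 - x(1)).^2;
    % Gradient of the scalar y with respect to the traced input x.
    dydx = dlgradient(y, x);
end
```

At this point, `y` is 104 and `dydx` is `[396 200]`, which agrees with the analytic gradient of the function at `[-1, 2]`.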

`[dydx1,...,dydxk] = dlgradient(y,x1,...,xk,Name,Value)` returns the gradients and specifies additional options using one or more name-value pairs. For example, `dydx = dlgradient(y,x,'RetainData',true)` causes the gradient to retain intermediate values for reuse in subsequent `dlgradient` calls. This syntax can save time, but uses more memory. For more information, see Tips.
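The following sketch illustrates one way `'RetainData'` might be used: two objectives share an intermediate value, and the first `dlgradient` call retains the trace so that the second call can reuse it. The function name `twoGradients` and both objectives are hypothetical, chosen only to keep the example small.

```matlab
x0 = dlarray([1 2 3]);

% Both gradients are computed within a single dlfeval call.
[g1, g2] = dlfeval(@twoGradients, x0);

% Hypothetical helper for this sketch: two scalar objectives that share
% the intermediate value z.
function [g1, g2] = twoGradients(x)
    z  = x.^2;               % shared intermediate value
    y1 = sum(z, 'all');      % first objective, analytic gradient 2*x
    y2 = sum(3*z, 'all');    % second objective, analytic gradient 6*x

    % Retain the trace and intermediate values after the first gradient
    % so that the second dlgradient call can reuse them.
    g1 = dlgradient(y1, x, 'RetainData', true);

    % Last gradient call in this function, so the default setting is used.
    g2 = dlgradient(y2, x);
end
```

Setting `'RetainData'` only on the calls that precede another `dlgradient` call keeps the additional memory use limited to where it is needed.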

The `dlgradient` function does not support calculating higher-order derivatives when using `dlnetwork` objects containing custom layers with a custom backward function.

The `dlgradient` function does not support calculating higher-order derivatives when using `dlnetwork` objects containing the following layers:

- `gruLayer`
- `lstmLayer`
- `bilstmLayer`

The `dlgradient` function does not support calculating higher-order derivatives that depend on the following functions:

- `gru`
- `lstm`
- `embed`
- `prod`
- `interp1`

- A `dlgradient` call must be inside a function. To obtain a numeric value of a gradient, you must evaluate the function using `dlfeval`, and the argument to the function must be a `dlarray`. See Use Automatic Differentiation In Deep Learning Toolbox.

- To enable the correct evaluation of gradients, the `y` argument must use only supported functions for `dlarray`. See List of Functions with dlarray Support.

- If you set the `'RetainData'` name-value pair argument to `true`, the software preserves tracing for the duration of the `dlfeval` function call instead of erasing the trace immediately after the derivative computation. This preservation can cause a subsequent `dlgradient` call within the same `dlfeval` call to be executed faster, but uses more memory. For example, in training an adversarial network, the `'RetainData'` setting is useful because the two networks share data and functions during training. See Train Generative Adversarial Network (GAN).

- When you need to calculate first-order derivatives only, ensure that the `'EnableHigherDerivatives'` option is `false`, as this is usually quicker and requires less memory.

- Complex gradients are calculated using the Wirtinger derivative. The gradient is defined in the direction of increase of the real part of the function to differentiate. This is because the variable to differentiate (for example, the loss) must be real, even if the function is complex.
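To illustrate the `'EnableHigherDerivatives'` option mentioned above, the following sketch computes the first and second derivatives of a scalar function. The function name `cubic` and the choice of `y = x.^3` are assumptions made for illustration. The first `dlgradient` call enables higher derivatives so that its result can itself be differentiated.

```matlab
x0 = dlarray(2);

[y, dydx, d2ydx2] = dlfeval(@cubic, x0);

% Hypothetical scalar function for this sketch: y = x^3.
function [y, dydx, d2ydx2] = cubic(x)
    y = x.^3;

    % Keep the first derivative traced so that it can be differentiated
    % again. Analytically, dydx = 3*x^2.
    dydx = dlgradient(y, x, 'EnableHigherDerivatives', true);

    % Differentiate the first derivative. Analytically, d2ydx2 = 6*x.
    d2ydx2 = dlgradient(dydx, x);
end
```

When only `dydx` is needed, setting `'EnableHigherDerivatives'` to `false` in the first call avoids tracing the backward pass.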