taylorPrunableNetwork
Description
A TaylorPrunableNetwork object supports pruning of filters in convolution layers by using first-order Taylor approximation. To prune filters in a dlnetwork object, first convert it to a TaylorPrunableNetwork object and then use the associated object functions.
To prune a deep neural network, you require the Deep Learning Toolbox™ Model Quantization Library support package. This support package is a free add-on that you can download using the Add-On Explorer. Alternatively, see Deep Learning Toolbox Model Quantization Library.
Creation
Description
prunableNet = taylorPrunableNetwork(net) converts the specified neural network to a TaylorPrunableNetwork object. A TaylorPrunableNetwork object is a different representation of the same network that is suitable for pruning by using the Taylor pruning algorithm. If the input network does not support pruning, then the function throws an error.
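For example, the following sketch builds a small convolutional dlnetwork and converts it to a prunable representation (the layer array is illustrative, not part of this reference page):

    % Define a small image classification network (illustrative layers).
    layers = [
        imageInputLayer([28 28 1])
        convolution2dLayer(3,16,Padding="same")
        reluLayer
        convolution2dLayer(3,32,Padding="same")
        reluLayer
        fullyConnectedLayer(10)
        softmaxLayer];
    net = dlnetwork(layers);

    % Convert to a representation that supports Taylor pruning.
    prunableNet = taylorPrunableNetwork(net);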
Input Arguments
net
Neural network to convert, specified as a dlnetwork object.
Properties
Object Functions
forward | Compute deep learning network output for training
predict | Compute deep learning network output for inference
updatePrunables | Remove filters from prunable layers based on importance scores
updateScore | Compute and accumulate Taylor-based importance scores for pruning
dlnetwork | Deep learning neural network
Examples
More About
Algorithms
For an individual input data point in the pruning dataset, you use the forward function to calculate the output of the deep learning network and the activations of the prunable filters. Then you calculate the gradients of the loss with respect to these activations using automatic differentiation. You then pass the network, the activations, and the gradients to the updateScore function. For each prunable filter in the network, the updateScore function estimates the change in loss that occurs if that filter is pruned from the network (to a first-order Taylor approximation). Based on this change, the function assigns an importance score to that filter and updates the TaylorPrunableNetwork object [1].
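As a minimal sketch of this step (the helper name modelLossPruning and the cross-entropy loss are illustrative assumptions), a model loss function can compute the activations and the gradients that updateScore needs:

    function [loss,pruningActivations,pruningGradients,state] = ...
            modelLossPruning(prunableNet,X,T)
        % Forward pass: network output, updated state, and the
        % activations of the prunable filters.
        [Y,state,pruningActivations] = forward(prunableNet,X);

        % Mini-batch loss (cross-entropy is an illustrative choice).
        loss = crossentropy(Y,T);

        % Gradients of the loss with respect to the prunable
        % activations, computed by automatic differentiation.
        pruningGradients = dlgradient(loss,pruningActivations);
    end

You evaluate this helper with dlfeval so that dlgradient can trace the computation, and then pass the resulting activations and gradients to updateScore.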
Inside the custom pruning loop, you accumulate importance scores for the prunable filters over all mini-batches of the pruning dataset. Then you pass the network object to the updatePrunables function. This function prunes the filters that have the lowest importance scores and therefore the smallest effect on the accuracy of the network output. The number of filters that a single call to the updatePrunables function prunes is controlled by the optional name-value argument MaxToPrune, which has a default value of 8.
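For example, this call prunes up to 16 filters in one iteration (the value 16 is an arbitrary choice for illustration):

    % Remove up to 16 of the lowest-scoring filters in this iteration.
    prunableNet = updatePrunables(prunableNet,MaxToPrune=16);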
Together, these steps complete a single pruning iteration. To further compress your model, repeat these steps in a loop.
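The following is a minimal sketch of such a loop, assuming mbq is a minibatchqueue over the pruning dataset and modelLossPruning is the helper sketched above (the iteration count is illustrative):

    numPruningIterations = 10;   % illustrative value
    for iteration = 1:numPruningIterations
        shuffle(mbq);

        % Accumulate Taylor importance scores over all mini-batches.
        while hasdata(mbq)
            [X,T] = next(mbq);
            [loss,pruningActivations,pruningGradients,state] = ...
                dlfeval(@modelLossPruning,prunableNet,X,T);
            prunableNet.State = state;
            prunableNet = updateScore(prunableNet, ...
                pruningActivations,pruningGradients);
        end

        % Remove the filters with the lowest accumulated scores.
        prunableNet = updatePrunables(prunableNet);
    end

    % Convert the pruned network back to a dlnetwork object.
    netPruned = dlnetwork(prunableNet);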
References
[1] Molchanov, Pavlo, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. "Pruning Convolutional Neural Networks for Resource Efficient Inference." Preprint, submitted June 8, 2017. https://arxiv.org/abs/1611.06440.