Sparsifier

Make your neural network sparse with fastai

A sparse vector, as opposed to a dense one, is a vector that contains a lot of zeros. When we speak about making a neural network sparse, we thus mean that most of the network's weights are zero.
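To make the definition concrete, here is a minimal sketch in plain Python, independent of fasterai. The helper name `sparsity` is hypothetical and only serves the illustration: it measures the percentage of entries that are exactly zero.

```python
def sparsity(values):
    """Return the percentage of entries in `values` that are exactly zero."""
    if not values:
        raise ValueError("cannot measure the sparsity of an empty vector")
    zeros = sum(1 for v in values if v == 0)
    return 100.0 * zeros / len(values)

dense = [0.3, -1.2, 0.7, 2.1, -0.4]
sparse = [0.0, -1.2, 0.0, 0.0, 0.0]

print(sparsity(dense))   # 0.0
print(sparsity(sparse))  # 80.0
```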

With fasterai, you can do that thanks to the Sparsifier class.

Let’s start by creating a model:

from torchvision.models import resnet18  # assumed source of resnet18
model = resnet18()

As you probably know, the weights of a convolutional layer have 4 dimensions (\(c_{out} \times c_{in} \times k_h \times k_w\)):

model.conv1.weight.ndim
4

In the case of ResNet18, the dimension of the first layer weights is \(64 \times 3 \times 7 \times 7\). We can thus plot each of the \(64\) filters as a \(7 \times 7\) color image (because they contain \(3\) channels).

plot_kernels(model.conv1)

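The Sparsifier class allows us to remove (parts of) the filters that are considered less useful than others. This is done by first creating an instance of the class, specifying:

* the granularity, i.e. the structure of the parameters to remove (for example 'weight', 'column' or 'filter');
* the context, i.e. whether parameters are compared within each layer ('local') or across the whole model ('global');
* the criteria, i.e. how parameters are scored to decide which ones to remove (for example large_final).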

You can prune a single layer with the Sparsifier.sparsify_layer method.


Sparsifier.sparsify_layer

 Sparsifier.sparsify_layer (m, sparsity, round_to=None)
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 70.31%
Sparsity in Conv2d 7: 0.00%
Sparsity in Conv2d 10: 0.00%
Sparsity in Conv2d 13: 0.00%
Sparsity in Conv2d 16: 0.00%
Sparsity in Conv2d 20: 0.00%
Sparsity in Conv2d 23: 0.00%
Sparsity in Conv2d 26: 0.00%
Sparsity in Conv2d 29: 0.00%
Sparsity in Conv2d 32: 0.00%
Sparsity in Conv2d 36: 0.00%
Sparsity in Conv2d 39: 0.00%
Sparsity in Conv2d 42: 0.00%
Sparsity in Conv2d 45: 0.00%
Sparsity in Conv2d 48: 0.00%
Sparsity in Conv2d 52: 0.00%
Sparsity in Conv2d 55: 0.00%
Sparsity in Conv2d 58: 0.00%
Sparsity in Conv2d 61: 0.00%
Sparsity in Conv2d 64: 0.00%
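The first layer reports 70.31% rather than exactly 70%, which is what we would expect when whole filters are removed: the layer has 64 filters, so only multiples of 1/64 are achievable. The arithmetic below is a sketch of that reasoning, not fasterai's internal code. Rounding to the nearest whole filter is an assumption that happens to match the printed value.

```python
n_filters = 64   # number of filters in ResNet18's first convolution
target = 70      # requested sparsity, in percent

# Whole filters only: assume the count is rounded to the nearest integer.
removed = round(n_filters * target / 100)
achieved = 100 * removed / n_filters

print(removed)             # 45
print(f"{achieved:.2f}%")  # 70.31%
```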

Most of the time, we want to prune the whole model at once with the Sparsifier.sparsify_model method, indicating the percentage of sparsity to apply.


Sparsifier.sparsify_model

 Sparsifier.sparsify_model (sparsity, round_to=None)

There are several ways in which we can make that first layer sparse. You will find the most important below:

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 69.99%
Sparsity in Conv2d 7: 70.00%
Sparsity in Conv2d 10: 70.00%
Sparsity in Conv2d 13: 70.00%
Sparsity in Conv2d 16: 70.00%
Sparsity in Conv2d 20: 70.00%
Sparsity in Conv2d 23: 70.00%
Sparsity in Conv2d 26: 70.00%
Sparsity in Conv2d 29: 70.00%
Sparsity in Conv2d 32: 70.00%
Sparsity in Conv2d 36: 70.00%
Sparsity in Conv2d 39: 70.00%
Sparsity in Conv2d 42: 70.00%
Sparsity in Conv2d 45: 70.00%
Sparsity in Conv2d 48: 70.00%
Sparsity in Conv2d 52: 70.00%
Sparsity in Conv2d 55: 70.00%
Sparsity in Conv2d 58: 70.00%
Sparsity in Conv2d 61: 70.00%
Sparsity in Conv2d 64: 70.00%

You now have a model that is \(70\%\) sparse!

Granularity

As we said earlier, the granularity defines the structure of the parameters that you will remove.

In the example above, we removed individual weights from each convolutional filter, meaning that we now have sparse filters, as can be seen in the image below:

plot_kernels(model.conv1)

Another granularity is, for example, removing column vectors from the filters. To do so, just change the granularity parameter accordingly.

model = resnet18()
sparsifier = Sparsifier(model, 'column', 'local', large_final)
sparsifier.sparsify_layer(model.conv1, 70)
plot_kernels(model.conv1)
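To give an intuition of what removing columns means, here is a toy sketch on a single \(3 \times 3\) kernel in plain Python. The helper `prune_columns` and the choice of scoring a column by the sum of its absolute values are assumptions made for illustration. This toy compares columns within one kernel only and is not how fasterai implements the 'column' granularity.

```python
def prune_columns(kernel, n_remove):
    """Zero the `n_remove` columns of a 2D kernel with the smallest L1 norm."""
    n_cols = len(kernel[0])
    # Score each column by the sum of the absolute values of its entries.
    scores = [sum(abs(row[c]) for row in kernel) for c in range(n_cols)]
    # Indices of the lowest-scoring columns.
    to_remove = set(sorted(range(n_cols), key=lambda c: scores[c])[:n_remove])
    return [[0.0 if c in to_remove else row[c] for c in range(n_cols)]
            for row in kernel]

kernel = [
    [0.9, 0.1, -0.5],
    [-0.8, 0.0, 0.4],
    [0.7, -0.2, 0.6],
]

pruned = prune_columns(kernel, n_remove=1)
for row in pruned:
    print(row)
# [0.9, 0.0, -0.5]
# [-0.8, 0.0, 0.4]
# [0.7, 0.0, 0.6]
```

The middle column has the smallest total magnitude, so it is zeroed as a whole while the other two columns are left untouched.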

For more information and examples about the pruning granularities, I suggest you take a look at the corresponding section.

Context

The context defines where to look in the model, i.e. which weights are compared with each other. The two basic contexts are:

* local, i.e. we compare weights within each layer individually. This leads to layers with similar levels of sparsity.
* global, i.e. we compare weights across the whole model. This leads to layers with different levels of sparsity.
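The difference between the two contexts can be sketched on two toy "layers" in plain Python. All helper names here (`prune`, `threshold_for`, `sparsity`) are hypothetical, and magnitude thresholding is assumed as the selection rule. This is an illustration of the idea, not fasterai's implementation.

```python
def prune(weights, threshold):
    """Zero every weight whose magnitude is below `threshold`."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def threshold_for(weights, fraction):
    """Magnitude below which `fraction` of `weights` fall (by rank)."""
    ranked = sorted(abs(w) for w in weights)
    k = int(len(ranked) * fraction)
    return ranked[k] if k < len(ranked) else float("inf")

def sparsity(weights):
    """Percentage of weights that are exactly zero."""
    return 100.0 * sum(w == 0 for w in weights) / len(weights)

# Two toy "layers": the first has small weights, the second large ones.
layers = {
    "layer_a": [0.1, -0.2, 0.3, -0.4],
    "layer_b": [1.0, -2.0, 3.0, -4.0],
}

# Local: one threshold per layer.
local = {name: prune(w, threshold_for(w, 0.5)) for name, w in layers.items()}

# Global: a single threshold computed over all weights together.
all_weights = [w for ws in layers.values() for w in ws]
t = threshold_for(all_weights, 0.5)
global_ = {name: prune(w, t) for name, w in layers.items()}

print({n: sparsity(w) for n, w in local.items()})
# {'layer_a': 50.0, 'layer_b': 50.0}
print({n: sparsity(w) for n, w in global_.items()})
# {'layer_a': 100.0, 'layer_b': 0.0}
```

Both runs remove half of the weights overall, but the global threshold concentrates the pruning in the layer whose weights are smallest.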

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'local', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 69.99%
Sparsity in Conv2d 7: 70.00%
Sparsity in Conv2d 10: 70.00%
Sparsity in Conv2d 13: 70.00%
Sparsity in Conv2d 16: 70.00%
Sparsity in Conv2d 20: 70.00%
Sparsity in Conv2d 23: 70.00%
Sparsity in Conv2d 26: 70.00%
Sparsity in Conv2d 29: 70.00%
Sparsity in Conv2d 32: 70.00%
Sparsity in Conv2d 36: 70.00%
Sparsity in Conv2d 39: 70.00%
Sparsity in Conv2d 42: 70.00%
Sparsity in Conv2d 45: 70.00%
Sparsity in Conv2d 48: 70.00%
Sparsity in Conv2d 52: 70.00%
Sparsity in Conv2d 55: 70.00%
Sparsity in Conv2d 58: 70.00%
Sparsity in Conv2d 61: 70.00%
Sparsity in Conv2d 64: 70.00%
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 66.14%
Sparsity in Conv2d 7: 32.17%
Sparsity in Conv2d 10: 32.31%
Sparsity in Conv2d 13: 32.28%
Sparsity in Conv2d 16: 31.73%
Sparsity in Conv2d 20: 44.03%
Sparsity in Conv2d 23: 44.44%
Sparsity in Conv2d 26: 15.32%
Sparsity in Conv2d 29: 44.43%
Sparsity in Conv2d 32: 44.24%
Sparsity in Conv2d 36: 59.17%
Sparsity in Conv2d 39: 59.22%
Sparsity in Conv2d 42: 22.02%
Sparsity in Conv2d 45: 59.24%
Sparsity in Conv2d 48: 59.14%
Sparsity in Conv2d 52: 75.82%
Sparsity in Conv2d 55: 75.86%
Sparsity in Conv2d 58: 30.28%
Sparsity in Conv2d 61: 75.86%
Sparsity in Conv2d 64: 75.88%

Criteria

The criteria define how we select the parameters to remove. They are usually given by a scoring method. The most common one is large_final, i.e. keep the parameters with the highest absolute value, as they are supposed to contribute the most to the final results of the model.
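A criterion can be thought of as a scoring function plugged into the pruning step: the lowest-scoring parameters are removed. The sketch below uses plain Python and two toy criteria, `magnitude` and `inverse_magnitude`. These names and the helper `prune_by_score` are hypothetical, and the toy criteria are not claimed to match fasterai's own definitions.

```python
def prune_by_score(weights, score, fraction):
    """Zero the `fraction` of weights with the lowest `score`."""
    n_remove = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: score(weights[i]))
    to_remove = set(order[:n_remove])
    return [0.0 if i in to_remove else w for i, w in enumerate(weights)]

def magnitude(w):
    """Toy criterion: large absolute values score high and are kept."""
    return abs(w)

def inverse_magnitude(w):
    """Toy criterion: small absolute values score high and are kept."""
    return -abs(w)

weights = [0.05, -1.5, 0.3, -0.01, 2.0, -0.7]

print(prune_by_score(weights, magnitude, 0.5))
# [0.0, -1.5, 0.0, 0.0, 2.0, -0.7]
print(prune_by_score(weights, inverse_magnitude, 0.5))
# [0.05, 0.0, 0.3, -0.01, 0.0, 0.0]
```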

model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', large_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 66.46%
Sparsity in Conv2d 7: 31.56%
Sparsity in Conv2d 10: 31.81%
Sparsity in Conv2d 13: 31.92%
Sparsity in Conv2d 16: 32.47%
Sparsity in Conv2d 20: 44.29%
Sparsity in Conv2d 23: 43.94%
Sparsity in Conv2d 26: 15.31%
Sparsity in Conv2d 29: 44.11%
Sparsity in Conv2d 32: 44.10%
Sparsity in Conv2d 36: 59.13%
Sparsity in Conv2d 39: 59.23%
Sparsity in Conv2d 42: 21.60%
Sparsity in Conv2d 45: 59.36%
Sparsity in Conv2d 48: 59.32%
Sparsity in Conv2d 52: 75.91%
Sparsity in Conv2d 55: 75.88%
Sparsity in Conv2d 58: 30.19%
Sparsity in Conv2d 61: 75.80%
Sparsity in Conv2d 64: 75.87%
model = resnet18()
sparsifier = Sparsifier(model, 'weight', 'global', small_final)
sparsifier.sparsify_model(70)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 38.87%
Sparsity in Conv2d 7: 2.38%
Sparsity in Conv2d 10: 0.65%
Sparsity in Conv2d 13: 1.63%
Sparsity in Conv2d 16: 1.32%
Sparsity in Conv2d 20: 1.72%
Sparsity in Conv2d 23: 4.12%
Sparsity in Conv2d 26: 0.17%
Sparsity in Conv2d 29: 0.97%
Sparsity in Conv2d 32: 7.12%
Sparsity in Conv2d 36: 26.24%
Sparsity in Conv2d 39: 8.62%
Sparsity in Conv2d 42: 0.22%
Sparsity in Conv2d 45: 10.87%
Sparsity in Conv2d 48: 4.43%
Sparsity in Conv2d 52: 90.31%
Sparsity in Conv2d 55: 94.71%
Sparsity in Conv2d 58: 0.41%
Sparsity in Conv2d 61: 96.16%
Sparsity in Conv2d 64: 84.94%

For more information and examples about the pruning criteria, I suggest you take a look at the corresponding section.

Remark

In some cases, you may want the remaining number of parameters to be a multiple of 8. This can be done by passing the round_to parameter.

model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'local', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 62.50%
Sparsity in Conv2d 7: 62.50%
Sparsity in Conv2d 10: 62.50%
Sparsity in Conv2d 13: 62.50%
Sparsity in Conv2d 16: 62.50%
Sparsity in Conv2d 20: 68.75%
Sparsity in Conv2d 23: 68.75%
Sparsity in Conv2d 26: 68.75%
Sparsity in Conv2d 29: 68.75%
Sparsity in Conv2d 32: 68.75%
Sparsity in Conv2d 36: 68.75%
Sparsity in Conv2d 39: 68.75%
Sparsity in Conv2d 42: 68.75%
Sparsity in Conv2d 45: 68.75%
Sparsity in Conv2d 48: 68.75%
Sparsity in Conv2d 52: 68.75%
Sparsity in Conv2d 55: 68.75%
Sparsity in Conv2d 58: 68.75%
Sparsity in Conv2d 61: 68.75%
Sparsity in Conv2d 64: 68.75%
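The values printed for the local run above (62.50% and 68.75% instead of 70%) are consistent with rounding the number of kept filters up to the next multiple of 8. The sketch below reproduces that arithmetic for the layer widths of ResNet18. The helper `kept_filters` is hypothetical, and the round-up rule is an assumption inferred from the output, not taken from fasterai's source.

```python
def kept_filters(n_filters, sparsity, round_to):
    """Filters kept after pruning, rounded up to a multiple of `round_to`.

    `sparsity` is a percentage. Rounding up is an assumption inferred from
    the printed output, not taken from fasterai's source.
    """
    keep_pct = 100 - sparsity
    # Ceiling division in integer arithmetic to avoid float rounding issues.
    groups = -(-n_filters * keep_pct // (100 * round_to))
    return groups * round_to

for n in (64, 128, 256, 512):  # output channels of ResNet18's conv layers
    kept = kept_filters(n, sparsity=70, round_to=8)
    print(n, kept, f"{100 * (n - kept) / n:.2f}%")
# 64 24 62.50%
# 128 40 68.75%
# 256 80 68.75%
# 512 160 68.75%
```

With 64 filters, keeping 30% would mean 19.2 filters, which becomes 24 once rounded up to a multiple of 8, hence the lower 62.50% sparsity in the first layers.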
model = resnet18()
sparsifier = Sparsifier(model, 'filter', 'global', large_final)
sparsifier.sparsify_model(70, round_to=8)
sparsifier.print_sparsity()
Sparsity in Conv2d 1: 87.50%
Sparsity in Conv2d 7: 0.00%
Sparsity in Conv2d 10: 0.00%
Sparsity in Conv2d 13: 0.00%
Sparsity in Conv2d 16: 0.00%
Sparsity in Conv2d 20: 93.75%
Sparsity in Conv2d 23: 93.75%
Sparsity in Conv2d 26: 0.00%
Sparsity in Conv2d 29: 93.75%
Sparsity in Conv2d 32: 93.75%
Sparsity in Conv2d 36: 96.88%
Sparsity in Conv2d 39: 96.88%
Sparsity in Conv2d 42: 0.00%
Sparsity in Conv2d 45: 96.88%
Sparsity in Conv2d 48: 93.75%
Sparsity in Conv2d 52: 98.44%
Sparsity in Conv2d 55: 98.44%
Sparsity in Conv2d 58: 0.00%
Sparsity in Conv2d 61: 98.44%
Sparsity in Conv2d 64: 96.88%

For more information about granularities at which you can operate, please check the related page.