Day 21: Model Optimization - Pruning

2021 iThome Ironman Contest

DAY 21

AI & Data

Series: From AI Deployment to MLOps, Part 21

Tags: 13th Ironman Contest, tensorflow, mlops

威利斯

2021-09-21 00:15:38

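What if you could shrink a model to a tenth of its size and still keep its accuracy? What kind of sorcery is that?
Building on the model optimization from Day 20, this post adds pruning to get an even lighter model.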

What is pruning?

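Pruning sets unimportant weights to zero. The resulting weight matrices are sparse, so the model file compresses much better, down to about 1/3 of its original size.
A model that is pruned and then quantized can shrink to about 1/10 of its original size.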
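To make "setting unimportant weights to zero" concrete, here is a minimal sketch of magnitude-based pruning on a plain Python list. The function name `prune_by_magnitude` and the random weights are made up for illustration; this is not how the TensorFlow toolkit implements it internally, only the same idea in a few lines.

```python
import random

def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to 0.0.

    `sparsity` is the target fraction of entries to zero (0.0 to 1.0).
    """
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]

# Hypothetical weights: 1000 samples from a standard normal distribution.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]
pruned = prune_by_magnitude(weights, sparsity=0.8)

zero_fraction = sum(1 for w in pruned if w == 0.0) / len(pruned)
print(f"fraction of zero weights: {zero_fraction:.2f}")
```

Only the 20% of weights with the largest magnitudes survive; everything else becomes an exact zero, which is what makes the later compression step effective.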

Pruning in practice

Runs on Colab.
This demo uses prune_low_magnitude() from the TensorFlow Model Optimization module, which zeroes out the low-impact weights of a Keras model during training.
```
!pip install tensorflow_model_optimization
```

Building the baseline model

The baseline is the same model used for post-training quantization: a CNN trained on tf.keras.datasets.mnist.
```
ACCURACY:
{'baseline Keras model': 0.9574999809265137}

MODEL_SIZE:
{'baseline h5': 98136} 
```

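Applying pruning to the model

Now apply pruning. Because prune_low_magnitude wraps each layer in a pruning wrapper, the parameter count shown in the summary goes up.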

```
import numpy as np
import tensorflow_model_optimization as tfmot

# Get the pruning method
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude

# Compute the end step so that pruning finishes after 2 epochs.
batch_size = 128
epochs = 2
validation_split = 0.1

# Number of training images left after the validation split
num_images = train_images.shape[0] * (1 - validation_split)
end_step = np.ceil(num_images / batch_size).astype(np.int32) * epochs

# Define the pruning schedule: ramp sparsity from 50% to 80%.
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.50,
        final_sparsity=0.80,
        begin_step=0,
        end_step=end_step)
    }

# Pass in the trained baseline model
model_for_pruning = prune_low_magnitude(
    baseline_model,
    **pruning_params
    )

# `prune_low_magnitude` requires a recompile.
model_for_pruning.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
    )

model_for_pruning.summary()
```
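To see what the schedule above is doing, the sketch below reproduces the end-step arithmetic and a polynomial-decay curve in plain Python. It assumes the usual 60,000 MNIST training images and an exponent of 3, which I believe is the default for `PolynomialDecay`; the helper name `polynomial_decay_sparsity` is made up for this illustration. The real schedule also has a frequency setting for how often masks are updated, which this sketch ignores.

```python
import math

def polynomial_decay_sparsity(step, initial_sparsity, final_sparsity,
                              begin_step, end_step, power=3):
    """Target sparsity at `step` for a polynomial-decay schedule."""
    # Clamp the step to the [begin_step, end_step] window.
    step = min(max(step, begin_step), end_step)
    progress = (step - begin_step) / (end_step - begin_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** power

# Same arithmetic as the article: 60,000 training images (assumed),
# 10% held out for validation, batch size 128, 2 epochs.
num_images = 60000 * (1 - 0.1)
end_step = math.ceil(num_images / 128) * 2

for step in (0, end_step // 2, end_step):
    s = polynomial_decay_sparsity(step, 0.50, 0.80, 0, end_step)
    print(f"step {step:4d}: target sparsity {s:.3f}")
```

Sparsity rises quickly at first and flattens out near the end, so most of the pruning happens while the model still has many training steps left to recover.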

The parameter count has gone up.
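Where do the extra parameters come from? As I understand it, the pruning wrapper stores one mask value per kernel weight, plus two scalars per layer (a threshold and a step counter). The sketch below works through that arithmetic. The helper `wrapped_param_count` and the layer shapes are hypothetical and are not taken from this article's model.

```python
def wrapped_param_count(kernel_shape, bias_size):
    """Parameter count of one layer before and after it is wrapped for pruning.

    Assumes the wrapper adds one mask entry per kernel weight plus two
    scalar bookkeeping variables (a threshold and a step counter).
    """
    kernel_size = 1
    for dim in kernel_shape:
        kernel_size *= dim
    original = kernel_size + bias_size
    added = kernel_size + 2
    return original, original + added

# Hypothetical layer shapes, chosen only for illustration.
for name, kernel_shape, bias_size in [
    ("conv2d 3x3, 1 -> 12 channels", (3, 3, 1, 12), 12),
    ("dense 2028 -> 10", (2028, 10), 10),
]:
    before, after = wrapped_param_count(kernel_shape, bias_size)
    print(f"{name}: {before} -> {after} parameters")
```

For these two example layers the counts go from 120 to 230 and from 20290 to 40572. The added variables are not trainable, and they disappear again once the wrappers are stripped.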

Comparing the weights before and after pruning

Before pruning, some of the weights have small magnitudes.

Retrain the model, adding tfmot.sparsity.keras.UpdatePruningStep() to the callbacks.

```
# Callback to update pruning wrappers at each step
callbacks = [tfmot.sparsity.keras.UpdatePruningStep()]

# Train and prune the model.
# batch_size must match the value used to compute end_step.
model_for_pruning.fit(
    train_images,
    train_labels,
    batch_size=batch_size,
    epochs=epochs,
    validation_split=validation_split,
    callbacks=callbacks
    )
```

After retraining, the model is pruned. Looking at the weights of the same layer, many of the unimportant ones are now zero.
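One way to check this numerically rather than by eye is to count the fraction of exact zeros in a layer's kernel. The sketch below does that on a nested list. The 3x4 kernel is made up; with a real Keras layer, something like `layer.get_weights()[0].tolist()` could supply the values.

```python
def zero_fraction(values):
    """Fraction of entries in a (possibly nested) list that are exactly zero."""
    flat = list(_flatten(values))
    return sum(1 for v in flat if v == 0) / len(flat)

def _flatten(values):
    """Yield the scalar entries of an arbitrarily nested list or tuple."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from _flatten(v)
        else:
            yield v

# Hypothetical 3x4 kernel after pruning; the numbers are made up.
kernel_after = [
    [0.0, 0.42, 0.0, 0.0],
    [-0.31, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.57, 0.0],
]
print(f"sparsity: {zero_fraction(kernel_after):.2f}")
```

If the schedule reached its final sparsity, the value for each pruned layer should be close to the `final_sparsity` setting.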

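Removing the wrappers after pruning

After pruning, you can call tfmot.sparsity.keras.strip_pruning() to remove the wrappers, which leaves a model with the same layers and parameters as the baseline.
This also makes it easier to save the model and export it to the *.tflite format.
Before compression, the pruned model file is the same size as the original. That makes sense: the zeroed weights still occupy their slots.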
```
MODEL_SIZE:
{'baseline h5': 98136,
'pruned non quantized h5': 98136} 
```

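Compressing the model 3x

Next, compress the pruned model.

The compressed file is about 1/3 of the original size, because the weights that pruning set to zero compress far more efficiently.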
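The effect is easy to reproduce without a model. The sketch below serializes two hypothetical weight lists as float32, one dense and one with 80% of its entries zeroed at random, and compresses both with `zlib`, which uses the same DEFLATE algorithm as the ZIP step in this section. All the numbers are synthetic.

```python
import random
import struct
import zlib

random.seed(0)
n = 10000
dense = [random.gauss(0.0, 1.0) for _ in range(n)]
# Zero out a random 80% of the entries to mimic a pruned tensor.
zero_idx = set(random.sample(range(n), int(n * 0.8)))
pruned = [0.0 if i in zero_idx else w for i, w in enumerate(dense)]

def to_bytes(values):
    """Serialize floats as little-endian float32, 4 bytes per value."""
    return struct.pack(f"<{len(values)}f", *values)

dense_raw, pruned_raw = to_bytes(dense), to_bytes(pruned)
dense_zip, pruned_zip = zlib.compress(dense_raw), zlib.compress(pruned_raw)

print("raw bytes:       ", len(dense_raw), len(pruned_raw))
print("compressed bytes:", len(dense_zip), len(pruned_zip))
```

The raw sizes are identical, which matches the unchanged .h5 size above: a zero still takes four bytes. Only after compression do the long runs of zeros pay off.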

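```
import os
import tempfile
import zipfile

def get_gzipped_model_size(file):
    """Return the size in bytes of `file` after ZIP (DEFLATE) compression."""
    _, zipped_file = tempfile.mkstemp('.zip')
    with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
        f.write(file)
    return os.path.getsize(zipped_file)

# Assumes the stripped pruned model was saved as 'pruned_model.h5'.
MODEL_SIZE['pruned non quantized h5'] = get_gzipped_model_size('pruned_model.h5')
```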

```
MODEL_SIZE:
{'baseline h5': 98136,
 'pruned non quantized h5': 25665}
```

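Compressing the model 10x

Now quantize the pruned model as well.

Quantization on its own already shrinks a model by about 3x. Compressing the pruned model and then quantizing it brings the size down to about 1/10 of the baseline, and accuracy still holds up.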
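Why do pruning and quantization stack so well? A rough intuition: quantization stores each weight in one byte instead of four, and a zero weight stays exactly zero, so the sparsity from pruning survives. The sketch below uses a simplified symmetric int8 scheme on made-up weights. It is an illustration only, not TFLite's actual quantizer.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integers back to approximate float values."""
    return [v * scale for v in q]

# Hypothetical pruned weights; the zeros come from pruning.
weights = [0.0, 0.0, 0.82, 0.0, -0.41, 0.0, 0.0, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("int8 values:", q)
print("max abs error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

Each value now fits in a single byte, and the pruned positions are still zeros, so the quantized file stays highly compressible.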

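```
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Strip the pruning wrappers, then quantize the pruned model
model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)
converter = tf.lite.TFLiteConverter.from_keras_model(model_for_export)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()

with open('pruned_quantized.tflite', 'wb') as f:
    f.write(tflite_model)
```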

```
MODEL_SIZE:
{'baseline h5': 98136,
 'pruned non quantized h5': 25665,
 'pruned quantized tflite': 8129}

ACCURACY:
{'baseline Keras model': 0.9574999809265137,
 'pruned and quantized tflite': 0.9683,
 'pruned model h5': 0.9685999751091003}
```
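As a quick sanity check on the "3x" and "10x" claims, the reduction factors can be computed directly from the sizes reported above. The snippet only does arithmetic on those numbers.

```python
# Sizes in bytes, copied from the MODEL_SIZE output in this article.
MODEL_SIZE = {
    "baseline h5": 98136,
    "pruned non quantized h5": 25665,
    "pruned quantized tflite": 8129,
}

baseline = MODEL_SIZE["baseline h5"]
ratios = {name: baseline / size for name, size in MODEL_SIZE.items()}

for name, ratio in ratios.items():
    print(f"{name}: {MODEL_SIZE[name]} bytes, {ratio:.1f}x smaller than baseline")
```

That works out to roughly 3.8x for the compressed pruned model and roughly 12.1x for the pruned and quantized model, so both figures are slightly better than the rounded claims.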