[Original] [Miscellany] Which open-source tools are currently available for model quantization?

有三AI 2020-11-27


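Model quantization is one of the key techniques in model optimization and a very effective way to speed up model inference. So which model quantization tools are available today?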

Author & Editor | 言有三

1 TensorFlow Lite

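TensorFlow Lite is Google's inference framework for embedded devices. It supports the low-precision formats float16 and int8. Details of the 8-bit quantization algorithm can be found in the whitepaper "Quantizing deep convolutional networks for efficient inference: A whitepaper". TensorFlow Lite supports both post-training quantization and quantization-aware training, and these two approaches are also the algorithmic basis of most quantization frameworks.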

https://github.com/tensorflow/model-optimization

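To try out newer techniques, keep an eye on the TensorFlow Model Optimization Toolkit at the address above. It is Google's official open-source package of model optimization techniques and currently provides APIs for two of them, model pruning and quantization. Using the toolkit requires installing tf-nightly or tf-nightly-gpu. This can lead to environment conflicts, so it is best to try it in an isolated environment.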
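
As a rough illustration of the kind of 8-bit affine scheme the whitepaper describes, the sketch below maps floats to int8 with a scale and a zero point and then maps them back. It is plain Python, not TensorFlow Lite code; the helper names (choose_qparams, quantize, dequantize) and the sample weights are made up for this example.

```python
def choose_qparams(x_min, x_max, qmin=-128, qmax=127):
    """Pick a scale and zero point so that [x_min, x_max] maps onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        return 1.0, 0
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = round(qmin - x_min / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Float -> int8 code: round(x / scale) + zero_point, clamped to the int8 range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """int8 code -> approximate float."""
    return [(q - zero_point) * scale for q in codes]


weights = [-0.62, -0.1, 0.0, 0.33, 1.27]  # hypothetical float weights
scale, zero_point = choose_qparams(min(weights), max(weights))
codes = quantize(weights, scale, zero_point)
restored = dequantize(codes, scale, zero_point)

for w, q, r in zip(weights, codes, restored):
    print(f"{w:+.4f} -> {q:4d} -> {r:+.4f}")
```

Post-training quantization picks the range from a trained model's weights and from calibration data, while quantization-aware training simulates this round trip during training so that the network can adapt to the rounding error.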

2 TensorRT

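TensorRT is NVIDIA's neural network inference engine. It supports post-training 8-bit quantization. Its calibration algorithm is entropy-based: it picks the quantization parameters that minimize the difference between the distributions before and after quantization, measured by relative entropy (KL divergence).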

https://github.com/NVIDIA/TensorRT

caffe-int8-convert-tools is a quantization tool for Caffe models, based on TensorRT 2.0.

https://github.com/BUG1989/caffe-int8-convert-tools
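
The idea of entropy-based calibration can be sketched in a few lines of plain Python. This is a simplified illustration, not TensorRT's actual code: it builds a histogram of absolute activation values, tries a series of clipping thresholds, and keeps the one whose 128-level approximation has the smallest KL divergence from the original histogram. The function names, the bin counts, the eps guard and the synthetic activations are all assumptions made for this example.

```python
import math
import random


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two histograms of equal length (normalised here)."""
    p_total, q_total = float(sum(p)), float(sum(q))
    if p_total == 0 or q_total == 0:
        return float("inf")
    kl = 0.0
    for p_count, q_count in zip(p, q):
        if p_count == 0:
            continue  # a term with P = 0 contributes nothing
        p_prob = p_count / p_total
        # The eps guard is a simplification; real implementations smooth instead.
        q_prob = max(q_count / q_total, eps)
        kl += p_prob * math.log(p_prob / q_prob)
    return kl


def quantize_histogram(hist, num_levels):
    """Merge hist into num_levels coarse bins, then spread each coarse bin's
    mass evenly over the fine bins that were non-empty."""
    n = len(hist)
    expanded = [0.0] * n
    for level in range(num_levels):
        start = level * n // num_levels
        stop = (level + 1) * n // num_levels
        chunk = hist[start:stop]
        nonzero = sum(1 for count in chunk if count > 0)
        if nonzero == 0:
            continue
        share = sum(chunk) / nonzero
        for j in range(start, stop):
            if hist[j] > 0:
                expanded[j] = share
    return expanded


def find_threshold(samples, num_bins=1024, num_levels=128):
    """Return the clipping threshold with the lowest KL divergence."""
    max_abs = max(abs(s) for s in samples)
    if max_abs == 0:
        return 0.0
    bin_width = max_abs / num_bins
    hist = [0] * num_bins
    for s in samples:
        hist[min(int(abs(s) / bin_width), num_bins - 1)] += 1

    best_kl, best_bins = float("inf"), num_bins
    for i in range(num_levels, num_bins + 1):
        kept = hist[:i]
        candidate = quantize_histogram(kept, num_levels)
        # Values beyond the threshold saturate into the last kept bin.
        reference = kept[:-1] + [kept[-1] + sum(hist[i:])]
        kl = kl_divergence(reference, candidate)
        if kl < best_kl:
            best_kl, best_bins = kl, i
    return best_bins * bin_width


random.seed(0)
# Synthetic activations: mostly Gaussian, plus one large outlier.
activations = [random.gauss(0.0, 1.0) for _ in range(5000)] + [10.0]
threshold = find_threshold(activations)
scale = threshold / 127  # symmetric int8 scale derived from the threshold
print(f"clip at |x| <= {threshold:.3f}, scale = {scale:.5f}")
```

The point of the search is that clipping a few rare outliers can be worth it if the remaining values get finer resolution; the KL divergence is the score that trades the two off.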

3 PaddleSlim

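PaddleSlim is a model quantization tool from Baidu that ships with the PaddlePaddle framework. It supports quantization-aware training, offline (post-training) quantization, global (per-tensor) weight quantization, and channel-wise quantization.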

https://github.com/PaddlePaddle/models/tree/develop/PaddleSlim/quant_low_level_api
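
The difference between global and channel-wise weight quantization is easiest to see on a toy example. The sketch below is plain Python and does not use the PaddleSlim API; the helper names and the two-channel weight matrix are invented for illustration. It quantizes the same weights once with a single scale for the whole tensor and once with one scale per output channel, then compares the reconstruction error.

```python
def symmetric_scale(values, qmax=127):
    """Scale that maps the largest absolute value onto qmax."""
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def fake_quantize(values, scale, qmax=127):
    """Quantize to int8 codes and immediately dequantize back to floats."""
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def mean_abs_error(original, approx):
    return sum(abs(a - b) for a, b in zip(original, approx)) / len(original)


# Each row is one output channel (hypothetical values).
weights = [
    [0.02, -0.01, 0.015, -0.005],  # channel with small weights
    [1.5, -2.0, 0.75, 1.0],        # channel with large weights
]

# Global quantization: one scale shared by every channel.
global_scale = symmetric_scale([w for row in weights for w in row])
per_tensor = [fake_quantize(row, global_scale) for row in weights]

# Channel-wise quantization: each channel gets its own scale.
per_channel = [fake_quantize(row, symmetric_scale(row)) for row in weights]

err_tensor = [mean_abs_error(r, q) for r, q in zip(weights, per_tensor)]
err_channel = [mean_abs_error(r, q) for r, q in zip(weights, per_channel)]
print("per-tensor error per channel: ", err_tensor)
print("per-channel error per channel:", err_channel)
```

With a shared scale, the small-magnitude channel is squeezed into only a couple of integer levels, which is why channel-wise scales usually preserve accuracy better when channel ranges differ a lot.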

4 PyTorch

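PyTorch has supported quantization since version 1.3, implemented on top of QNNPACK. It covers post-training quantization, dynamic quantization, and quantization-aware training.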

https://github.com/pytorch/glow/blob/master/docs/Quantization.md

https://github.com/pytorch/QNNPACK
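
Of the three modes, dynamic quantization is perhaps the least self-explanatory: weights are quantized ahead of time, while the activation scale is computed on the fly from each input. PyTorch exposes this through torch.quantization.quantize_dynamic; the sketch below does not call PyTorch and only mimics the idea in plain Python for a single linear layer. The class name DynamicQuantLinear and all numbers are hypothetical.

```python
def symmetric_scale(values, qmax=127):
    """Scale that maps the largest absolute value onto qmax."""
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def to_codes(values, scale, qmax=127):
    """Float values -> clamped integer codes."""
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


class DynamicQuantLinear:
    """Toy linear layer: int8 weights fixed offline, activations quantized per call."""

    def __init__(self, weight_rows, bias):
        # Weights are quantized once, before any input is seen.
        self.w_scale = symmetric_scale([w for row in weight_rows for w in row])
        self.q_weight = [to_codes(row, self.w_scale) for row in weight_rows]
        self.bias = list(bias)

    def __call__(self, inputs):
        # The activation scale is derived from this particular input.
        x_scale = symmetric_scale(inputs)
        q_x = to_codes(inputs, x_scale)
        outputs = []
        for q_row, b in zip(self.q_weight, self.bias):
            acc = sum(qw * qx for qw, qx in zip(q_row, q_x))  # integer accumulate
            outputs.append(acc * self.w_scale * x_scale + b)   # rescale to float
        return outputs


weights = [[0.5, -0.25, 0.1], [-1.0, 0.75, 0.3]]
bias = [0.1, -0.2]
inputs = [1.0, 2.0, -3.0]

layer = DynamicQuantLinear(weights, bias)
quant_out = layer(inputs)
float_out = [sum(w * x for w, x in zip(row, inputs)) + b
             for row, b in zip(weights, bias)]
print("float:    ", float_out)
print("quantized:", quant_out)
```

Because no calibration data is needed, this mode is the simplest to apply; post-training static quantization instead fixes the activation scales in advance from calibration data.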

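In addition, Distiller is a PyTorch-based model optimization tool open-sourced by Intel, so it naturally supports the quantization techniques available in PyTorch as well.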

https://github.com/NervanaSystems/distiller

5 Other frameworks

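Microsoft's NNI integrates a variety of quantization-aware training algorithms and supports several open-source frameworks, including PyTorch, TensorFlow, MXNet, and Caffe2.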

https://github.com/microsoft/nni
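
Quantization-aware training algorithms generally share one core trick: the forward pass uses quantized weights, while the backward pass treats the rounding step as if it were the identity (the straight-through estimator). The toy below is a plain-Python sketch of that trick on a single weight; it is not NNI code, and the function names, data, scale and learning rate are all assumptions chosen for illustration.

```python
def fake_quant(w, scale, qmin=-128, qmax=127):
    """Forward pass: snap w to the nearest representable int8 level."""
    return max(qmin, min(qmax, round(w / scale))) * scale


def ste_grad(w, scale, upstream, qmin=-128, qmax=127):
    """Backward pass (straight-through estimator): pass the gradient through
    unchanged inside the clipping range, and block it outside."""
    inside = qmin * scale <= w <= qmax * scale
    return upstream if inside else 0.0


# Fit y = w * x; the forward pass only ever sees the quantized weight.
data = [(1.0, 0.5), (2.0, 1.0), (-1.0, -0.5)]  # generated with w = 0.5
w, scale, lr = 0.0, 0.1, 0.05                   # w is the float "shadow" weight

for _ in range(200):
    for x, y in data:
        w_q = fake_quant(w, scale)
        pred = w_q * x
        grad_wq = 2.0 * (pred - y) * x          # d(squared error) / d(w_q)
        w -= lr * ste_grad(w, scale, grad_wq)   # update the float weight

print(f"float weight {w:.3f}, quantized weight {fake_quant(w, scale):.1f}")
```

The float weight keeps accumulating small updates even when they are too small to change the quantized value, which is what lets training make progress despite the rounding.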

Open-source quantization tools for Keras and Core ML are listed below.

https://github.com/google/qkeras

https://github.com/kingreza/quantization

6 Implementations of some papers

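Below are implementations of the algorithms from some important papers. The publishing organizations include Intel Labs, Xilinx, Facebook, and others.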

[1] Paper "Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights" by Intel

https://github.com/AojunZhou/Incremental-Network-Quantization

[2] Paper "FINN: A Framework for Fast, Scalable Binarized Neural Network Inference" by Xilinx

https://github.com/Xilinx/BNN-PYNQ

[3] Paper "And the bit goes down: Revisiting the quantization of neural networks" by Facebook

https://github.com/facebookresearch/kill-the-bits

[4] Paper "LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks" by Microsoft

https://github.com/microsoft/LQ-Nets

[5] Paper "HAQ: Hardware-Aware Automated Quantization with Mixed Precision" by the Massachusetts Institute of Technology

https://github.com/mit-han-lab/haq

We leave the rest for readers to explore on their own; let's not get carried away with collecting links.

7 Further theory

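If you want to study the theory behind model optimization systematically, head to the 有三AI Knowledge Planet (知识星球) -> "网络结构1000变" (1000 Variations of Network Architectures) -> model compression section -> model pruning, quantization, and distillation section. Some example write-ups are shown below: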