First, thank you so much for this awesome project and for releasing it to the public.
I am trying model quantization following your quantization guidelines:
https://github.com/Tencent/ncnn/tree/master/tools/quantize
but after finishing all the steps without a single error, the model size does not decrease at all. Checking ncnn::Mat::elemsize, the weight data changed from float32 to int8, which suggests quantization works to some degree.
What could be the problem? Did I do something wrong?
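For reference, the overall post-training quantization flow described in the linked tools/quantize README boils down to three commands: ncnnoptimize to fuse layers, ncnn2table to generate a calibration table from representative images, and ncnn2int8 to write the int8 model with that table as the last argument. The sketch below uses placeholder file names and mean/norm/size values, and the exact ncnn2table flags differ between ncnn versions, so check the README of the version you built:

./ncnnoptimize example.param example.bin example-nobn-fp32.param example-nobn-fp32.bin 0
./ncnn2table --param=example-nobn-fp32.param --bin=example-nobn-fp32.bin --images=images/ --output=example.table --mean=104,117,123 --norm=0.017,0.017,0.017 --size=224,224 --thread=2
./ncnn2int8 example-nobn-fp32.param example-nobn-fp32.bin example-int8.param example-int8.bin example.table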
7 answers
Answer 1
I ran into the same problem.
example.param and example.bin are the initial ncnn model I obtained. After ncnnoptimize, the model size drops to 883K, but after ncnn2int8 quantization it grows back to 1.8M.
Answer 2
https://github.com/Tencent/ncnn/wiki/quantized-int8-inference
According to the description there, the current approach may be using "the runtime way, no model binary reduction", which would explain why the model size does not shrink.
Answer 3
> I ran into the same problem.
> example.param and example.bin are the initial ncnn model I obtained. After ncnnoptimize, the model size drops to 883K, but after ncnn2int8 quantization it grows back to 1.8M.
I don't see you passing a calibration table file in
./ncnn2int8 example-nobn-fp32.param example-nobn-fp32.bin example-int8.param mobilenet-int8.bin
so the weights were probably not stored as 8-bit. During the conversion, did you see output like "quantize_convolution layer_name"? Also, the output you specified is mobilenet-int8.bin, which I don't see in the screenshot.
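For comparison, an invocation with the calibration table passed as the last positional argument would look like the line below (example.table is a hypothetical file produced by ncnn2table); according to the reply above, a successful run should then print "quantize_convolution layer_name" style lines for the converted layers:

./ncnn2int8 example-nobn-fp32.param example-nobn-fp32.bin example-int8.param example-int8.bin example.table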
Answer 4
@lexuszhi1990
This user's explanation is correct. @MambaWong
Answer 5
@MambaWong
Could you please check my case?
I used a torchvision model and converted it to .param/.bin files via ONNX
( https://github.com/Tencent/ncnn/wiki/use-ncnn-with-pytorch-or-onnx )
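For reference, the conversion path described in that wiki page is roughly: export the PyTorch/torchvision model to ONNX from Python, simplify the ONNX file, then convert it with onnx2ncnn. A minimal sketch, assuming onnx-simplifier is installed and using placeholder file names:

python3 -m onnxsim resnet18.onnx resnet18-sim.onnx
./onnx2ncnn resnet18-sim.onnx resnet18.param resnet18.bin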
Answer 6
Hello, have you solved this problem? I am running into the same issue. @bocharm
Answer 7
Please update the ncnn quantization and conversion tools:
https://github.com/Tencent/ncnn/wiki/quantized-int8-inference