ncnn uses much more memory than other inference frameworks.

7fyelxc5  posted on 2022-10-22  in Other

I recently ran a benchmark comparing the inference speed and memory usage of various inference frameworks, including ncnn, mxnet, onnxruntime, and openvino. For this benchmark, I used the Insightface arcface resnet100 model.

While ncnn's inference speed on x86 is comparable to that of many of the other large frameworks (I was impressed by this, good work!), I saw that it uses significantly more memory to run inference than the other frameworks.

For reference, ncnn used 1.7GB of RAM to run this model, while the other frameworks used between 0.37GB and 0.57GB.

Is this a bug? Or does ncnn simply use more memory?
You can find the full benchmarks here (benchmark numbers reported in readme).

vmdwslir2#

With opt.use_packing_layout = false, ncnn used 0.8GB of RAM.
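
For context, a minimal sketch of where this option would be set, assuming the model is loaded through the usual ncnn::Net API (the .param/.bin file names below are placeholders, not from the thread):

```cpp
#include "net.h"

int main()
{
    ncnn::Net net;

    // Trade the SIMD-friendly packed memory layout for lower memory usage.
    // Must be set before load_param()/load_model().
    net.opt.use_packing_layout = false;

    // Placeholder model files for the arcface resnet100 example.
    net.load_param("arcface_r100.param");
    net.load_model("arcface_r100.bin");

    return 0;
}
```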

uurity8g3#

Yes, that's true, although disabling that option also significantly slows down inference on machines that support SIMD instruction sets such as AVX2.
Ultimately I would like to have the AVX2 optimizations without the large memory overhead, but that does not appear to be available right now; perhaps it is a good goal to work towards.

anauzrmj4#

I'm also experiencing this issue. With a quantized int8 model of ~65MB, memory usage on both iOS and Android is about 500-600MB, roughly 3x-4x what TFLite or MLCore use.
Is there any workaround to lower the memory usage, even at the cost of inference speed?
I tried opt.use_packing_layout = false as suggested, but it doesn't help at all, perhaps because the model is quantized?

jdzmm42g5#

I don't know if it's too late to answer this issue; I ran into the same situation.
It's because ncnn enables some acceleration algorithms that consume more memory. If you want to reduce the usage,
you can set opt.use_winograd_convolution to false.
In my project, r101 uses only about 450MB.
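
A minimal sketch of this suggestion, assuming the quantized model is again loaded through ncnn::Net (the file names are placeholders; the option must be set before loading the model):

```cpp
#include "net.h"

int main()
{
    ncnn::Net net;

    // Disable the Winograd convolution optimization, which trades
    // extra memory for speed. Must be set before load_param()/load_model().
    net.opt.use_winograd_convolution = false;

    // Placeholder file names for an int8-quantized model.
    net.load_param("model_int8.param");
    net.load_model("model_int8.bin");

    return 0;
}
```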
