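System information
- Custom code: none, using the upstream benchmark
- OS: Android 12
- Device: Google Pixel 4a
- TensorFlow version: nightly benchmark build (the URL does not specify an exact version)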
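Steps to reproduce
- Enable Developer options and USB debugging.
- Run the script below. It downloads the TF benchmark binary and a model.
Note that we have an in-house YOLOv4 model that we cannot share, so I found an existing public model instead. The results are more or less the same, so I suspect the problem lies in the model architecture. Also, the model runs fine on CPU/NPU.
Results:
- Expected: no error messages.
- Actual: a lot of error messages, seemingly one failure per invocation of the delegate node. See the output below.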
#!/bin/bash
set -euo pipefail
MODEL_FILE_NAME="model.tflite"
BENCHMARK_PATH="$(mktemp -d)"
BENCHMARK_FILE_NAME="tensorflow-benchmark"
DEVICE_PATH="/data/local/tmp"
echo ":: Fetch benchmark..."
curl \
--location "https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model" \
--output "${BENCHMARK_PATH}/${BENCHMARK_FILE_NAME}"
echo ":: Fetch model..."
curl \
--location "https://github.com/theAIGuysCode/tensorflow-yolov4-tflite/raw/master/android/app/src/main/assets/yolov4-416-fp32.tflite" \
--output "${MODEL_FILE_NAME}"
echo ":: Move benchmark to the device..."
adb push "${BENCHMARK_PATH}/${BENCHMARK_FILE_NAME}" "${DEVICE_PATH}"
adb shell chmod +x "${DEVICE_PATH}/${BENCHMARK_FILE_NAME}"
echo ":: Move model to the device..."
adb push "${MODEL_FILE_NAME}" "${DEVICE_PATH}"
echo ":: Run benchmark..."
adb shell taskset f0 "${DEVICE_PATH}/${BENCHMARK_FILE_NAME}" \
--graph="${DEVICE_PATH}/${MODEL_FILE_NAME}" \
--use_gpu=true
echo ":: Remove benchmark..."
adb shell rm "${DEVICE_PATH}/${BENCHMARK_FILE_NAME}"
rm -rf "${BENCHMARK_PATH}"
echo ":: Remove model..."
adb shell rm "${DEVICE_PATH}/${MODEL_FILE_NAME}"
rm -rf "${MODEL_FILE_NAME}"
:: Fetch benchmark...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 6029k 100 6029k 0 0 11.8M 0 --:--:-- --:--:-- --:--:-- 11.8M
:: Fetch model...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 196 100 196 0 0 336 0 --:--:-- --:--:-- --:--:-- 335
100 23.1M 100 23.1M 0 0 8255k 0 0:00:02 0:00:02 --:--:-- 23.7M
:: Move benchmark to the device...
/var/folders/d8/zmkczjms4jxbtbw24wt7qzbw0000gp/T/tmp.Yg5JHGh1/tens...ark: 1 file pushed, 0 skipped. 99.8 MB/s (6174376 bytes in 0.059s)
:: Move model to the device...
model.tflite: 1 file pushed, 0 skipped. 36.6 MB/s (24279948 bytes in 0.632s)
:: Run benchmark...
STARTING!
Log parameter values verbosely: [0]
Graph: [/data/local/tmp/model.tflite]
Use gpu: [1]
Loaded model /data/local/tmp/model.tflite
INFO: Initialized TensorFlow Lite runtime.
GPU delegate created.
INFO: Created TensorFlow Lite delegate for GPU.
INFO: Replacing 144 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions.
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
Explicitly applied GPU delegate, and the model graph will be completely executed by the delegate.
The input model file size (MB): 24.2799
Initialized session in 2394.17ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
>>> THIS CONTINUES FOR A WHILE <<<
count=852 first=237 curr=175 min=17 max=381 avg=176.995 std=47
Benchmarking failed.
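Because the same two error lines repeat hundreds of times, it can help to collapse the output before attaching it to a report. The following is a minimal sketch, not part of the original script: `summarize_errors` is a hypothetical helper name, and the sample input only reuses the two error lines shown above.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: read benchmark output on stdin and print each
# distinct ERROR line once, prefixed by how many times it occurred.
summarize_errors() {
  # `|| true` keeps the pipeline alive under `set -e -o pipefail`
  # when the input contains no ERROR lines at all.
  { grep '^ERROR:' || true; } | sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Sample input taken from the log above.
summarize_errors <<'EOF'
INFO: Created 1 GPU delegate kernels.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
EOF
```

In the reproduction script, the benchmark invocation could be piped through the helper, for example by appending `2>&1 | summarize_errors` to the `adb shell taskset f0 ...` command.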
5 answers
a0x5cqrl1#
To speed up troubleshooting here, could you please fill in the issue template?
Thanks!
wswtfjt72#
@sushreebarsa, I have added more relevant information.
ckx4rj1h3#
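To reproduce the issue reported here, could you please provide the complete code and dataset, as well as the TensorFlow version you are using? Thanks!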
uurity8g4#
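@sushreebarsa, PTAL at the original issue. It contains a script that downloads the official TensorFlow benchmark and a model. No custom code or project is needed to reproduce the issue.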
jgovgodb5#
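I hit the same issue on TFLite 2.8.0 with target SDK 32. So far my only workaround is to downgrade TensorFlow to 2.5.0 if I want to keep GPU support for YOLOv4; without GPU support I can stay on TF 2.8.0.
It would be great if someone could come up with a more future-proof solution.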
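Until a proper fix exists, one way to narrow the problem down is to rerun the same benchmark on other backends and compare. The sketch below is a dry run that only prints the commands. `backend_flags` and `benchmark_command` are hypothetical helper names, the paths are the ones from the original script, and it assumes the nightly benchmark binary accepts `--use_nnapi` and `--gpu_backend` (the log above shows the OpenCL backend being selected, so forcing `gl` is one thing worth trying).

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: map a backend label to benchmark flags.
# Assumption: the benchmark build accepts --use_nnapi and --gpu_backend.
backend_flags() {
  case "$1" in
    cpu)    echo "--use_gpu=false" ;;
    gpu)    echo "--use_gpu=true" ;;
    gpu-gl) echo "--use_gpu=true --gpu_backend=gl" ;;
    nnapi)  echo "--use_nnapi=true" ;;
    *)      echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (not run) the adb command for one backend.
# File names and the default device path mirror the original script.
benchmark_command() {
  local backend="$1"
  local device_path="${2:-/data/local/tmp}"
  local flags
  flags="$(backend_flags "${backend}")"
  echo "adb shell taskset f0 ${device_path}/tensorflow-benchmark" \
       "--graph=${device_path}/model.tflite ${flags}"
}

# Dry run: list the commands to compare, one per backend.
for backend in cpu gpu gpu-gl nnapi; do
  benchmark_command "${backend}"
done
```

Running the printed commands one at a time, with the benchmark and model already pushed to the device, shows whether the failure is specific to the default GPU backend.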