TensorFlow Android: GPU delegate fails on a YOLOv4 model

lnlaulya  posted 4 months ago in Android

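System information

  • Custom code: none, uses the upstream benchmark
  • OS: Android 12
  • Device: Google Pixel 4a
  • TensorFlow version: nightly benchmark build (the URL does not pin an exact version)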

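Steps to reproduce

  1. Enable developer options and USB debugging.
  2. Run the script below. It downloads the TF benchmark tool and a model.
    Note that we have an internal YOLOv4 model that we cannot share, so I found an existing public one instead. The results are more or less the same, so my guess is that the problem lies in the model architecture. Also, the model runs fine on CPU/NPU.

Results:
  • Expected: no error messages.
  • Actual: a lot of error messages, seemingly one error per model node. See the output below.

Script:
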
#!/bin/bash
set -euo pipefail

MODEL_FILE_NAME="model.tflite"

BENCHMARK_PATH="$(mktemp -d)"
BENCHMARK_FILE_NAME="tensorflow-benchmark"

DEVICE_PATH="/data/local/tmp"

echo ":: Fetch benchmark..."
curl \
  --location "https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model" \
  --output "${BENCHMARK_PATH}/${BENCHMARK_FILE_NAME}"

echo ":: Fetch model..."
curl \
  --location "https://github.com/theAIGuysCode/tensorflow-yolov4-tflite/raw/master/android/app/src/main/assets/yolov4-416-fp32.tflite" \
  --output "${MODEL_FILE_NAME}"

echo ":: Move benchmark to the device..."
adb push "${BENCHMARK_PATH}/${BENCHMARK_FILE_NAME}" "${DEVICE_PATH}"
adb shell chmod +x "${DEVICE_PATH}/${BENCHMARK_FILE_NAME}"

echo ":: Move model to the device..."
adb push "${MODEL_FILE_NAME}" "${DEVICE_PATH}"

echo ":: Run benchmark..."
adb shell taskset f0 "${DEVICE_PATH}/${BENCHMARK_FILE_NAME}" \
  --graph="${DEVICE_PATH}/${MODEL_FILE_NAME}" \
  --use_gpu=true

echo ":: Remove benchmark..."
adb shell rm "${DEVICE_PATH}/${BENCHMARK_FILE_NAME}"
rm -rf "${BENCHMARK_PATH}"

echo ":: Remove model..."
adb shell rm "${DEVICE_PATH}/${MODEL_FILE_NAME}"
rm -rf "${MODEL_FILE_NAME}"
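
The report says the same model runs fine on CPU/NPU, while the script above only exercises `--use_gpu=true`. A minimal sketch for comparing backends follows. It only prints the command line for each backend instead of running it. The function name `benchmark_cmd` is hypothetical, and using `--use_nnapi=true` to reach the NPU is an assumption, since the report does not say how the NPU run was done.

```shell
#!/bin/bash
# Sketch: print the benchmark command line for one backend (dry run only).
# Paths and file names mirror the reproduction script; benchmark_cmd is a
# hypothetical helper, not part of the TensorFlow tooling.
DEVICE_PATH="/data/local/tmp"
BENCHMARK_FILE_NAME="tensorflow-benchmark"
MODEL_FILE_NAME="model.tflite"

benchmark_cmd() {
  local backend="$1" flag
  case "${backend}" in
    cpu)   flag="--use_gpu=false" ;;
    gpu)   flag="--use_gpu=true" ;;
    # Assumption: the NPU is reached through the NNAPI delegate.
    nnapi) flag="--use_nnapi=true" ;;
    *)     echo "unknown backend: ${backend}" >&2; return 1 ;;
  esac
  echo "adb shell taskset f0 ${DEVICE_PATH}/${BENCHMARK_FILE_NAME} --graph=${DEVICE_PATH}/${MODEL_FILE_NAME} ${flag}"
}

# Print one command per backend so the runs can be compared side by side.
for backend in cpu gpu nnapi; do
  benchmark_cmd "${backend}"
done
```

Each printed line can be pasted into a terminal after the benchmark binary and model have been pushed to the device, which keeps the only difference between runs to the delegate flag.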

Output of the reproduction script:

:: Fetch benchmark...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 6029k  100 6029k    0     0  11.8M      0 --:--:-- --:--:-- --:--:-- 11.8M
:: Fetch model...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   196  100   196    0     0    336      0 --:--:-- --:--:-- --:--:--   335
100 23.1M  100 23.1M    0     0  8255k      0  0:00:02  0:00:02 --:--:-- 23.7M
:: Move benchmark to the device...
/var/folders/d8/zmkczjms4jxbtbw24wt7qzbw0000gp/T/tmp.Yg5JHGh1/tens...ark: 1 file pushed, 0 skipped. 99.8 MB/s (6174376 bytes in 0.059s)
:: Move model to the device...
model.tflite: 1 file pushed, 0 skipped. 36.6 MB/s (24279948 bytes in 0.632s)
:: Run benchmark...
STARTING!
Log parameter values verbosely: [0]
Graph: [/data/local/tmp/model.tflite]
Use gpu: [1]
Loaded model /data/local/tmp/model.tflite
INFO: Initialized TensorFlow Lite runtime.
GPU delegate created.
INFO: Created TensorFlow Lite delegate for GPU.
INFO: Replacing 144 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions.
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
Explicitly applied GPU delegate, and the model graph will be completely executed by the delegate.
The input model file size (MB): 24.2799
Initialized session in 2394.17ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
ERROR: Node number 144 (TfLiteGpuDelegateV2) failed to invoke.
ERROR: TfLiteGpuDelegate Invoke: Given object is not valid
>>> THIS CONTINUES FOR A WHILE <<<
count=852 first=237 curr=175 min=17 max=381 avg=176.995 std=47

Benchmarking failed.
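
Because the log repeats the same two error lines many times, it can help to collapse it into counts of distinct messages before attaching it to a report. The following is a rough sketch, assuming the benchmark output was saved to a file; `summarize_errors` and the file name `benchmark.log` are hypothetical.

```shell
#!/bin/bash
# Sketch: count distinct "ERROR: ..." lines in a saved benchmark log.
# Reads the files given as arguments, or stdin when none are given.
summarize_errors() {
  awk '/^ERROR: / { count[$0]++ }
       END { for (line in count) printf "%d\t%s\n", count[line], line }' "$@" |
    sort -rn  # most frequent message first
}
```

A possible way to use it: capture the run with `adb shell ... --use_gpu=true 2>&1 | tee benchmark.log`, then call `summarize_errors benchmark.log`. The `2>&1` is there on the assumption that the error lines are written to stderr.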

a0x5cqrl  #1

To speed up troubleshooting here, could you please fill in the issue template?
Thanks!

wswtfjt7  #2

@sushreebarsa, I have added more relevant information.

ckx4rj1h  #3

To reproduce the issue reported here, could you please provide the complete code and dataset, as well as the TensorFlow version you are using? Thanks!

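uurity8g  #4

@sushreebarsa, PTAL at the original issue. It contains a script that downloads the official TensorFlow benchmark and a model. No custom code or project is needed to reproduce the issue.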

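jgovgodb  #5

I ran into the same issue on TFLite 2.8.0 with target SDK 32. So far, my only workaround is to downgrade TensorFlow to 2.5.0 if I want to keep GPU support for YOLOv4; if I give up GPU support, I can stay on TF 2.8.0.
It would be great if someone could come up with a more future-proof solution.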
