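I am currently trying to do transfer learning with the EfficientDet D0 model from the TensorFlow Applications module. My goal is to train this model on the Food101 dataset for an object detection task. However, whenever I try to run the code, I get the following error.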
Saving TensorBoard log files to: training_logs/efficientnetb0_101_classes_all_data_feature_extract/20230627-212648
Epoch 1/3
2023-06-27 21:26:49.281247: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:561] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
2023-06-27 21:26:49.281917: W tensorflow/core/framework/op_kernel.cc:1839] OP_REQUIRES failed at xla_ops.cc:629 : INTERNAL: libdevice not found at ./libdevice.10.bc
2023-06-27 21:26:49.283370: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:561] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
2023-06-27 21:26:49.283918: W tensorflow/core/framework/op_kernel.cc:1839] OP_REQUIRES failed at xla_ops.cc:629 : INTERNAL: libdevice not found at ./libdevice.10.bc
---------------------------------------------------------------------------
InternalError Traceback (most recent call last)
Cell In[29], line 5
1 # Turn off all warnings except for errors
2 # tf.get_logger().setLevel('ERROR')
3
4 # Fit the model with callbacks
----> 5 history_101_food_classes_feature_extract = model.fit(train_data,epochs=3,steps_per_epoch=len(train_data),validation_data=test_data,validation_steps=int(0.15 * len(test_data)),
6 callbacks=[create_tensorboard_callback("training_logs",
7 "efficientnetb0_101_classes_all_data_feature_extract"),
8 model_checkpoint])
File ~/.local/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File ~/.local/lib/python3.10/site-packages/tensorflow/python/eager/execute.py:53, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
51 try:
52 ctx.ensure_initialized()
---> 53 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
54 inputs, attrs, num_outputs)
55 except core._NotOkStatusException as e:
56 if name is not None:
InternalError: Graph execution error:
Detected at node cond_1/Adam/StatefulPartitionedCall defined at (most recent call last):
The same code runs fine on Google Colab, though, which is weird. Any help would be appreciated.
1 Answer
Thanks to @Dev Bhuyan for the input; it did work. Here is some more context, though.
The statement tf.config.run_functions_eagerly(True) tells TensorFlow to run functions eagerly, that is, in eager execution mode.
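By default, TensorFlow executes a model as a computation graph: operations are first added to the graph, and the graph is then executed as a whole. Eager execution, on the other hand, runs each operation immediately and returns its result (in my case, this resolved the graph execution error).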
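To make the difference concrete, here is a minimal sketch of switching eager mode on and off. It assumes TensorFlow 2.x is installed; the function name double is hypothetical and only there for illustration, it is not part of the original training code.

```python
import tensorflow as tf

# Ask TensorFlow to run every tf.function eagerly instead of tracing it
# into a graph. Keras training steps are tf.functions, so model.fit is
# affected by this switch as well.
tf.config.run_functions_eagerly(True)


@tf.function
def double(x):
    # Hypothetical example function. With eager mode on, this body runs
    # op by op as ordinary Python rather than as a compiled graph.
    return x * 2


result = double(tf.constant(3.0))

# functions_run_eagerly() reports the current setting.
eager_enabled = tf.config.functions_run_eagerly()
print(eager_enabled, float(result))

# Switch back to graph execution once debugging is done, because eager
# mode is usually noticeably slower for training.
tf.config.run_functions_eagerly(False)
```

In a notebook, the call would go before model.fit. Passing run_eagerly=True to model.compile should have a similar effect for a single Keras model. Note that this is a workaround: it presumably helps because the failing XLA-compiled optimizer step is no longer used, not because the missing libdevice file has been found.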
Reference: https://www.tensorflow.org/api_docs/python/tf/compat/v1/enable_eager_execution