Describe the Bug
Version:
paddlepaddle-gpu 2.5.1.post117
visualdl 2.4.2
Description:
I am using the fleet API for 4-GPU distributed training, with visualdl / tensorboardx recording the training acc/loss.
When execution reaches the logwriter line, it fails with FileExistsError:
[2023-08-11 08:09:01,447] [ WARNING] fleet.py:290 - The dygraph parallel environment has been initialized.
[2023-08-11 08:09:01,448] [ WARNING] fleet.py:313 - The dygraph hybrid parallel environment has been initialized.
Traceback (most recent call last):
  File "main.py", line 23, in <module>
    logwriter = LogWriter(logdir='./runs/')
  File "/home/smk/anaconda3/envs/paddle/lib/python3.8/site-packages/visualdl/writer/writer.py", line 120, in __init__
    self._get_file_writer()
  File "/home/smk/anaconda3/envs/paddle/lib/python3.8/site-packages/visualdl/writer/writer.py", line 135, in _get_file_writer
    self._file_writer = RecordFileWriter(
  File "/home/smk/anaconda3/envs/paddle/lib/python3.8/site-packages/visualdl/writer/record_writer.py", line 90, in __init__
    bfile.makedirs(logdir)
  File "/home/smk/anaconda3/envs/paddle/lib/python3.8/site-packages/visualdl/io/bfile.py", line 695, in makedirs
    return default_file_factory.get_filesystem(path).makedirs(path)
  File "/home/smk/anaconda3/envs/paddle/lib/python3.8/site-packages/visualdl/io/bfile.py", line 97, in makedirs
    os.makedirs(path)
  File "/home/smk/anaconda3/envs/paddle/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: './runs/'
LAUNCH INFO 2023-08-11 08:09:04,712 Exit code -15
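One possible reading of the traceback, not confirmed anywhere in this thread: all four ranks execute the same LogWriter(...) line, and the call ends in a plain os.makedirs(path) with no exist_ok, so whichever process arrives after the directory has been created fails. The standard-library sketch below only illustrates that os.makedirs behaviour; it does not use visualdl, and the helper names are hypothetical.

```python
import os
import tempfile


def create_like_traceback(path):
    """Plain os.makedirs, the same call shown at the bottom of the traceback."""
    os.makedirs(path)


def create_tolerant(path):
    """Same call, but tolerating a directory that already exists."""
    os.makedirs(path, exist_ok=True)


# Use a throwaway directory so the sketch does not touch a real ./runs/.
base = tempfile.mkdtemp()
logdir = os.path.join(base, "runs")

# The first caller (think: the fastest rank) creates the directory.
create_like_traceback(logdir)

# A later caller (think: any other rank) hits the existing directory.
try:
    create_like_traceback(logdir)
    second_call_failed = False
except FileExistsError:
    second_call_failed = True

# With exist_ok=True the same situation raises nothing.
create_tolerant(logdir)

print("second plain makedirs failed:", second_call_failed)
```

If this is indeed the cause, it would also explain why the error does not depend on which logdir is chosen.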
Additional Supplementary Information
No response
5 Answers
htrmnn0y1#
Is nobody going to look into this issue?
cnh2zyt32#
What if you delete that directory? Or point logdir to a different directory?
14ifxucb3#
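That doesn't help. I have already tried it. Deleting the directory makes no difference, and the default arguments raise the same error.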
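If the error comes from several ranks creating the log directory at once (an assumption, see the traceback), deleting or renaming the directory would not help, because every rank still races on whatever path is given. A common workaround in distributed training is to create the writer on rank 0 only. The sketch below assumes the launcher exports PADDLE_TRAINER_ID for each process; get_rank and make_writer are hypothetical helpers, and paddle.distributed.get_rank() could be used in place of the environment lookup.

```python
import os


def get_rank():
    """Rank of this process.

    Assumption: the distributed launcher sets PADDLE_TRAINER_ID.
    Falls back to 0 for single-process runs.
    """
    return int(os.environ.get("PADDLE_TRAINER_ID", "0"))


def make_writer(logdir="./runs/", rank=None):
    """Return a LogWriter on rank 0 and None on every other rank."""
    if rank is None:
        rank = get_rank()
    if rank != 0:
        # Non-zero ranks never touch the log directory.
        return None
    # Imported lazily so that other ranks do not need visualdl at all.
    from visualdl import LogWriter
    return LogWriter(logdir=logdir)
```

In the training script, the call on line 23 of main.py would become logwriter = make_writer('./runs/'), and each logging call would be guarded with "if logwriter is not None:". Metrics are then recorded from rank 0 only, which is usually sufficient for acc/loss curves.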
raogr8fs4#
chhqkbe15#
OK, thanks.