How do I correctly set a UTF-8 locale for Python in a Docker container on Linux?

ecfdbz9o  posted on 2022-11-22 in Linux

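I am trying to run my Python file in a Docker container.
I am using NVIDIA's container image for PyTorch, version 19.05, which provides Ubuntu 16.04 with a Python 3.6 environment.
Following another similar question, I added the environment argument -e PYTHONIOENCODING=utf-8 when running the Docker image: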

nvidia-docker run -dit --name teddy -p 8122:22 -e PYTHONIOENCODING=utf-8 1e0071d37342

I checked the locale inside the container, and the result looks correct:

root@ce83e4a4301a:/workspace# locale
LANG=
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=C.UTF-8

But I still get this error:

root@ce83e4a4301a:/workspace/paddlespeech/examples/other/tts_finetune/tts3# ./run_en.sh 
check oov
Traceback (most recent call last):
  File "local/check_oov.py", line 240, in <module>
    lang=args.lang)
  File "local/check_oov.py", line 161, in get_check_result
    pronunciation_phones = get_pronunciation_phones(lexicon_file)
  File "local/check_oov.py", line 99, in get_pronunciation_phones
    for line in f2.readlines():
  File "/opt/conda/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 6269: ordinal not in range(128)

(The code works correctly when it is run on the same machine outside the container.)
I checked the code:

...
    with open(lexicon_file, "r") as f2:
        for line in f2.readlines():
...

However, manually adding the argument encoding="utf-8" fixes the problem, as shown here:

...
    with open(lexicon_file, "r", encoding="utf-8") as f2:
        for line in f2.readlines():
...
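
A minimal diagnostic sketch of why the explicit argument helps. It assumes standard CPython behavior: in text mode, open() without encoding= falls back to locale.getpreferredencoding(False), whereas PYTHONIOENCODING only affects sys.stdin, sys.stdout and sys.stderr. The helper names describe_encodings and roundtrip and the temporary file are hypothetical and are not part of check_oov.py.

```python
import locale
import os
import sys
import tempfile


def describe_encodings():
    """Collect the encodings Python will use in the current process."""
    return {
        # Default used by open() in text mode when encoding= is omitted.
        "open_default": locale.getpreferredencoding(False),
        # Encoding of standard output; PYTHONIOENCODING overrides this one.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding used for file names, not for file contents.
        "filesystem": sys.getfilesystemencoding(),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


def roundtrip(text):
    """Write and read a file with an explicit encoding, independent of locale."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lexicon.txt")  # hypothetical file name
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        with open(path, "r", encoding="utf-8") as f:
            return f.read()


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(name, "=", value)
    sample = "\uff61\uff65\u03c9"  # a few non-ASCII characters
    print("roundtrip ok:", roundtrip(sample) == sample)
```

Running this from the same shell that launches ./run_en.sh, both inside the container and on the host, should show whether open_default differs between the two. If it reports an ASCII-based encoding, passing encoding="utf-8" to open() sidesteps the locale entirely.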

xwbd5t1u  #1

When creating the string, use the b prefix to make it a bytes literal, then decode it as UTF-8:

>>> b"(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89".decode("utf-8")
'(。・ω・。)ノ'
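
The same idea can be applied to file contents. This is a sketch, assuming the file is UTF-8 encoded: it reads the raw bytes and decodes them explicitly, so the result does not depend on the container locale. read_utf8_lines is a hypothetical helper, not something from the original script.

```python
import os
import tempfile


def read_utf8_lines(path):
    """Read a file as raw bytes and decode it as UTF-8, ignoring the locale."""
    with open(path, "rb") as f:
        raw = f.read()
    # Decoding is explicit here, so no locale-dependent codec is involved.
    return raw.decode("utf-8").splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")  # hypothetical sample file
        with open(path, "wb") as f:
            f.write(b"hello\n\xcf\x89\n")
        lines = read_utf8_lines(path)
        print(len(lines), "lines decoded")
```

For ordinary text files, open(path, "r", encoding="utf-8") is usually simpler and has the same effect; the binary route is mainly useful when the bytes need to be inspected before decoding.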
