Convert bytes to a string

Posted by irlmq6kh on 2022-09-18 in Java

I captured the standard output of an external program into a bytes object:

>>> from subprocess import *
>>> command_stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
>>>
>>> command_stdout
b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2\n'

I want to convert that to a normal Python string, so that I can print it like this:

>>> print(command_stdout)
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2

I tried the binascii.b2a_qp() method, but got the same bytes object again:

>>> binascii.b2a_qp(command_stdout)
b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2\n'

How do I convert the bytes object to a str with Python 3?

python

Source: https://stackoverflow.com/questions/606191/convert-bytes-to-a-string

24 answers

vbopmzt11#

Bytes

m=b'This is bytes'

Convert to a string

Method 1

m.decode("utf-8")

or

m.decode()

Method 2

import codecs
codecs.decode(m,encoding="utf-8")

or

import codecs
codecs.decode(m)

Method 3

str(m,encoding="utf-8")

or

str(m)[2:-1]

Result

'This is bytes'
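
One caveat about the last variant, as a minimal sketch: str(m)[2:-1] slices the printed representation of the bytes object instead of decoding it, so it only agrees with decode() for plain ASCII without escape sequences. The sample value below is an assumption chosen for illustration.

```python
# Hypothetical sample: the UTF-8 bytes for "café" followed by a newline.
m = b"caf\xc3\xa9\n"

decoded = m.decode("utf-8")   # real decoding: non-ASCII and newline are restored
sliced = str(m)[2:-1]         # slices the repr, so escapes stay as literal text

print(repr(decoded))
print(repr(sliced))
print(decoded == sliced)      # the two results differ for this input
```

For anything other than plain ASCII input, the decode-based methods are the safer choice.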

Likes (0) | Replies (0) | Report | 2022-09-18

mfpqipee2#

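Try this function; it ignores any bytes that are not valid in the given character set (such as utf-8) and returns a clean string. It has been tested on Python 3.6 and later.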

def bin2str(text, encoding = 'utf-8'):
    """Convert bytes to a Unicode string, dropping any bytes that are invalid in the encoding.
    text: bytes object to decode
    encoding: encoding of the input bytes (default utf-8)"""

    return text.decode(encoding, 'ignore')

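Here, the function takes the binary data and decodes it (converting the bytes to characters using one of Python's predefined character sets); the 'ignore' argument drops any data that is not valid in that character set, and the function finally returns the desired string value.

If you are unsure of the encoding, sys.getdefaultencoding() returns Python's default string encoding (on Python 3 this is always 'utf-8', so it does not tell you how a particular file or device actually encoded its data).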
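
As a rough illustration of what the errors argument does, the sketch below decodes the same input with three of the standard error handlers. The input bytes are a made-up example containing one byte (0xff) that is never valid in UTF-8.

```python
# Hypothetical input: ASCII text with one stray, invalid byte in the middle.
data = b"abc\xffdef"

ignored = data.decode("utf-8", errors="ignore")            # silently drops the bad byte
replaced = data.decode("utf-8", errors="replace")          # substitutes U+FFFD
escaped = data.decode("utf-8", errors="backslashreplace")  # keeps it as the text \xff

print(repr(ignored))
print(repr(replaced))
print(repr(escaped))
```

'ignore' loses information without any trace, so 'replace' or 'backslashreplace' may be preferable when you need to see that something went wrong.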

p4rjhz4m3#

Try this:

bytes.fromhex('c3a9').decode('utf-8')

nzk0hqpo4#

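If you want to convert arbitrary bytes, not just a string that was encoded to bytes:

import base64
import json

with open("bytesfile", "rb") as infile:
    # b85encode returns bytes, so decode the ASCII result to get a str
    str1 = base64.b85encode(infile.read()).decode("ascii")

with open("bytesfile", "rb") as infile:
    # JSON text listing every byte value as an integer
    str2 = json.dumps(list(infile.read()))

However, this is not very efficient. It will turn a 2 MB picture into 9 MB.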
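
To put rough numbers on that overhead, here is a small sketch that compares the text size produced by Base64, Base85, and a JSON list of integers. The 1024-byte sample buffer is an assumption standing in for real file contents.

```python
import base64
import json

# Hypothetical sample: 1024 bytes covering every byte value four times.
raw = bytes(range(256)) * 4

b64 = base64.b64encode(raw).decode("ascii")   # 4 characters per 3 bytes
b85 = base64.b85encode(raw).decode("ascii")   # 5 characters per 4 bytes
as_json = json.dumps(list(raw))               # several characters per byte

print(len(raw), len(b64), len(b85), len(as_json))

# Every representation can be turned back into the original bytes.
restored_b85 = base64.b85decode(b85)
restored_json = bytes(json.loads(as_json))
```

The Base64 and Base85 variants grow the data by roughly a third and a quarter respectively, while the JSON list is several times larger than the input.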

oogrdqng5#

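def toString(string):
    try:
        return string.decode("utf-8")
    except (AttributeError, ValueError):
        # str has no decode(); undecodable bytes raise UnicodeDecodeError (a ValueError)
        return string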

b = b'97.080.500'
s = '97.080.500'
print(toString(b))
print(toString(s))

vxbzzdmp6#

We can decode a bytes object to produce a string using bytes.decode(encoding='utf-8', errors='strict'); see the documentation for bytes.decode.

Python 3 example:

byte_value = b"abcde"
print("Initial value = {}".format(byte_value))
print("Initial value type = {}".format(type(byte_value)))
string_value = byte_value.decode("utf-8")

# utf-8 is used here because it is a very common encoding, but you need to use the encoding your data is actually in.

print("------------")
print("Converted value = {}".format(string_value))
print("Converted value type = {}".format(type(string_value)))

Output:

Initial value = b'abcde'
Initial value type = <class 'bytes'>
------------
Converted value = abcde
Converted value type = <class 'str'>

Note: In Python 3 the default encoding is utf-8, so <byte_string>.decode("utf-8") can also be written as <byte_string>.decode().

5fjcxozz7#

Use .decode() to decode the bytes into a string. Pass 'utf-8' as the argument.

bybem2ql8#

From *sys — System-specific parameters and functions*:

To write or read binary data from/to the standard streams, use the underlying binary buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').
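
A small sketch of that idea follows. The helper name write_raw is hypothetical, and the demonstration uses an in-memory stream in place of the real standard output so the result can be inspected.

```python
import io
import sys

def write_raw(data, stream=None):
    """Write bytes to the binary buffer underneath a text stream (hypothetical helper)."""
    stream = sys.stdout if stream is None else stream
    written = stream.buffer.write(data)   # bypasses the text layer entirely
    stream.buffer.flush()
    return written

# Demonstration against an in-memory text stream instead of the real stdout.
demo = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
count = write_raw(b"abc\n", demo)
captured = demo.buffer.getvalue()
print(count, captured)
```

Note that sys.stdout can be replaced by an object without a buffer attribute (some IDEs and test runners do this), in which case this approach does not apply.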

vzgqcmou9#

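For the specific case of "run a shell command and get its output as text instead of bytes", on Python 3.7 and later you should use subprocess.run and pass text=True (as well as capture_output=True to capture the output):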

command_result = subprocess.run(["ls", "-l"], capture_output=True, text=True)
command_result.stdout  # is a `str` containing your program's stdout

text used to be called universal_newlines; the name text was added in Python 3.7 as an alias. If you need to support Python versions before 3.7, pass universal_newlines=True instead of text=True.
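
The sketch below compares the text=True route with decoding the captured bytes by hand. To stay portable it runs the current Python interpreter as a stand-in for an external command such as ls; that substitution, and the choice of UTF-8 for the manual decode, are assumptions made for the example.

```python
import subprocess
import sys

# Stand-in for an external program: the running interpreter printing one line.
cmd = [sys.executable, "-c", "print('hello')"]

# Text mode: stdout arrives as str, with line endings normalized to \n.
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
output = result.stdout

# Manual route: capture bytes, then decode them explicitly.
raw = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
manual = raw.decode("utf-8")

print(repr(output))
print(repr(manual))
```

With text=True and no explicit encoding argument, the bytes are decoded using the locale's preferred encoding, so passing encoding="utf-8" may be worth considering when the program's output encoding is known.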

inkz8wg910#

For Python 3, this is a much safer and more Pythonic approach to convert from bytes to a string:

def byte_to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes): # Check if it's in bytes
        print(bytes_or_str.decode('utf-8'))
    else:
        print("Object not of byte type")

byte_to_str(b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2\n')

Output:

total 0
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2

scyqe7ek11#

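When working with data from Windows systems (with \r\n line endings), my answer is

String = Bytes.decode("utf-8").replace("\r\n", "\n")

Why? Try this with a multiline Input.txt:

Bytes = open("Input.txt", "rb").read()
String = Bytes.decode("utf-8")
open("Output.txt", "w").write(String)

All your line endings will be doubled (to \r\r\n), producing extra blank lines. Python's text-reading functions normally normalize line endings so that strings use only \n. If you receive binary data from a Windows system, Python gets no chance to do that. Thus,

Bytes = open("Input.txt", "rb").read()
String = Bytes.decode("utf-8").replace("\r\n", "\n")
open("Output.txt", "w").write(String)

will replicate your original file.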
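
A possible alternative to rewriting line endings by hand is to switch off newline translation when writing, using the newline="" argument of open(). The sketch below uses a temporary directory and made-up file contents, so the names and data are assumptions.

```python
import os
import tempfile

# Hypothetical Windows-style content received as bytes.
data = b"line one\r\nline two\r\n"
text = data.decode("utf-8")

# Option 1: normalize to \n in memory.
normalized = text.replace("\r\n", "\n")

# Option 2: write the decoded text unchanged; newline="" disables translation.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Output.txt")
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    with open(path, "rb") as f:
        round_tripped = f.read()

print(repr(normalized))
print(round_tripped == data)
```

With newline="" the file on disk keeps exactly the line endings that were in the decoded string, regardless of platform.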

cclgggtu12#

I made a function to clean lists

def cleanLists(self, lista):
    lista = [x.strip() for x in lista]
    lista = [x.replace('\n', '') for x in lista]
    lista = [x.replace('\b', '') for x in lista]
    lista = [x.encode('utf8') for x in lista]
    lista = [x.decode('utf8') for x in lista]

    return lista

mf98qq9413#

Decode the bytes object to produce a string:

>>> b"abcde".decode("utf-8") 
'abcde'

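The example above assumes that the bytes object is in UTF-8, because it is a common encoding. However, you should use the encoding your data is actually in!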
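
To illustrate why the encoding matters, the following sketch decodes the same text with a matching and a mismatching codec. The sample word and the choice of Latin-1 as the "wrong" codec are assumptions for the example.

```python
text = "café"
utf8_bytes = text.encode("utf-8")       # the é becomes two bytes
latin1_bytes = text.encode("latin-1")   # the é becomes one byte

right = utf8_bytes.decode("utf-8")      # matching codec: original text
mojibake = utf8_bytes.decode("latin-1") # wrong codec: succeeds but garbles the é

# The opposite mismatch fails outright instead of producing garbage.
try:
    latin1_bytes.decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True

print(right, mojibake, failed)
```

A mismatch can therefore either raise an error or silently produce wrong characters, depending on the bytes involved.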

klsxnrf114#

If you get this error:

utf-8 codec can't decode byte 0x8a,

then it is better to use the following code to convert bytes to a string:

bytes = b"abcdefg"
string = bytes.decode("utf-8", "ignore")
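
Since 'ignore' discards the offending bytes, another option is to try a short list of candidate encodings before giving up. The helper below is a hypothetical sketch; the name decode_with_fallback and the candidate list (UTF-8, then Windows-1252) are assumptions, and the right candidates depend on where your data comes from.

```python
def decode_with_fallback(data, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in turn (hypothetical helper).

    If none decodes cleanly, decode with the first candidate and
    mark undecodable bytes with U+FFFD instead of raising.
    """
    for name in encodings:
        try:
            return data.decode(name)
        except UnicodeDecodeError:
            continue
    return data.decode(encodings[0], errors="replace")

# 0x8a is invalid as UTF-8 here, but is a letter in Windows-1252.
print(repr(decode_with_fallback(b"abc\x8a")))
print(repr(decode_with_fallback(b"abc")))
```

A successful decode with a fallback codec does not prove the guess was right; it only means the bytes happen to be valid in that codec.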

xiozqbni15#

If trying decode() gives you the following:
AttributeError: 'str' object has no attribute 'decode'

you can also specify the encoding directly in the cast:

>>> my_byte_str
b'Hello World'

>>> str(my_byte_str, 'utf-8')
'Hello World'
