Converting CSV to binary with Python

Posted by ao218c7q on 2023-06-19 in Python

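I have a lot of CSV files that I want to convert to binary files, so I want to write a Python script that does this automatically. My CSV files contain only the values 0 and 255. (Each file has 80 rows and 320 columns.)
I wrote this code: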

import numpy as np
import csv

csv_filename = '320x80_ImageTest_1_255.csv'
filename = "output.bin"

with open(csv_filename) as f:
    reader = csv.reader(f, delimiter =';')
    lst = list(reader)

array = np.array(lst)

with open ('new_binary.bin','wb') as FileToWrite:
    for i in range(len(array)):
        for j in range(len(array[0])):
            FileToWrite.write(''.join(chr(int(array[i][j]))).encode())

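The problem is that the output file looks like this: [screenshot of the output file]
Instead of that character, I want FF, which is 255 in hexadecimal. What am I doing wrong? Can anyone help?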

mjqavswn #1

Is this what you need?

rows = [
    ["0", "0", "0", "0", "255", "255", "0", "0", "255"],
    ["255", "255", "0", "0", "0", "255", "0", "0", "255"],
    ["0", "255", "0", "0", "255", "0", "255", "0", "0"],
    ["0", "0", "255", "0", "0", "255", "0", "0", "0"],
]

with open("output.bin", "wb") as f_out:
    for row in rows:
        for field in row:
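            # Calling to_bytes() without arguments requires Python 3.11+,
            # where the defaults are length=1 and byteorder="big".
            f_out.write(int(field).to_bytes())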

Then check output.bin:

with open("output.bin", "rb") as f_in:
    while True:
        x = f_in.read(9)
        if len(x) == 0:
            break
        print(x)

Output:

b'\x00\x00\x00\x00\xff\xff\x00\x00\xff'
b'\xff\xff\x00\x00\x00\xff\x00\x00\xff'
b'\x00\xff\x00\x00\xff\x00\xff\x00\x00'
b'\x00\x00\xff\x00\x00\xff\x00\x00\x00'

Thanks to "Writing integers in binary to file in python" for showing me the to_bytes(...) method, and to MyICQ for pointing out the default values.
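
As for why the code in the question did not produce FF: chr(255) is the character 'ÿ', and str.encode() defaults to UTF-8, which stores that character as the two bytes C3 BF. A short sketch of the difference (the variable names are only illustrative):

```python
value = 255

# chr(255) is the character 'ÿ'; str.encode() defaults to UTF-8,
# which needs two bytes for any code point above 127.
utf8_bytes = chr(value).encode()
print(utf8_bytes)   # b'\xc3\xbf'

# bytes([n]) builds a bytes object containing the single byte n.
raw_byte = bytes([value])
print(raw_byte)     # b'\xff'

# Same result as to_bytes with explicit arguments,
# which also works on Python versions before 3.11.
print(value.to_bytes(1, "big") == raw_byte)  # True
```

The value 0 is unaffected, since UTF-8 encodes code points up to 127 as a single byte, so only the 255 entries came out wrong.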

pgpifvop #2

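This does pretty much what was described.
I left out reading the input into a variable, which should be trivial. Because the input contains ' characters, it cannot be parsed as JSON. Instead, I treat it as a series of numbers separated by something else and use a regular expression to extract the numbers.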

# Regular expression support
import re

# the input, should be read from file
dirtyinput = "[['0;0;0;0;255;255;0;0;255], ['255;255;0;0;0;255;0;0;255], ['0;255;0;0;255;0;255;0;0], ['0;0;255;0;0;255;0;0;0]]"

# extract numbers
numbers = re.findall(r'\d+', dirtyinput)

# Using function from answer by Zach Young
with open("output.bin", "wb") as f_out:
    for n in numbers:
        f_out.write(int(n).to_bytes(1, 'big'))

# --------- another method, iterating the data (efficient if the data is large)
#
with open("output2.bin", "wb") as f:
    for x in re.finditer(r'\d+', dirtyinput):
        f.write(int(x.group()).to_bytes(1,'big'))

# -------- testing result
# 
with open("output.bin", "rb") as f_in:
    while True:
        x = f_in.read(9)
        if len(x) == 0:
            break
        print(x)

Output:

b'\x00\x00\x00\x00\xff\xff\x00\x00\xff'
b'\xff\xff\x00\x00\x00\xff\x00\x00\xff'
b'\x00\xff\x00\x00\xff\x00\xff\x00\x00'
b'\x00\x00\xff\x00\x00\xff\x00\x00\x00'

I get the same result as the answer above.
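
To tie either approach back to the files in the question, here is a minimal sketch that reads a semicolon-delimited CSV with the csv module and writes one byte per field. The function name csv_to_bin and the sample file names are assumptions for illustration and do not come from the answers above.

```python
import csv
from pathlib import Path


def csv_to_bin(csv_path, bin_path, delimiter=";"):
    """Write every field of a delimited CSV file as one raw byte (hypothetical helper)."""
    total = 0
    with open(csv_path, newline="") as f_in, open(bin_path, "wb") as f_out:
        for row in csv.reader(f_in, delimiter=delimiter):
            # Empty fields (e.g. from a trailing delimiter) are skipped.
            # bytes() raises ValueError if a value is outside 0..255.
            data = bytes(int(field) for field in row if field.strip())
            f_out.write(data)
            total += len(data)
    return total


# Small self-check with a made-up 2x3 sample file.
Path("sample.csv").write_text("0;255;0\n255;0;255\n")
written = csv_to_bin("sample.csv", "sample.bin")
print(written)                          # 6
print(Path("sample.bin").read_bytes())  # b'\x00\xff\x00\xff\x00\xff'
```

For the 80x320 files in the question the function should return 25600. Looping over Path(".").glob("*.csv") and passing path.with_suffix(".bin") as the output name would convert a whole directory.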
