pandas 将bytea转换为nd.float32的ndarray

5jvtdoz2  于 2023-02-17  发布在  其他
关注(0)|答案(1)|浏览(138)

我有一个ndarray np.float32,它以 bytea 格式保存在Postgres数据库中:

import pandas as pd
import numpy as np
import sqlite3

myndarray=np.array([-3.55219245e-02, 1.33227497e-01, -4.96977456e-02, 2.16857344e-01], dtype=np.float32)
myarray=[myndarray.tobytes()]
mydataframe=pd.DataFrame(myarray, columns=['Column1'])
mydataframe.to_sql('mytable', sqlite3.connect("/tmp/floats.sqlite"))

在SQLITE3中,这将产生:

CREATE TABLE IF NOT EXISTS "mytable" ("index" INTEGER, "Column1" TEXT);
INSERT INTO mytable VALUES(0,X'707f11bdca6c083edd8f4bbdda0f5e3e');

在Postgresql中,这将生成:

mydatabase=# select * from mytable;
 index |              Column1
-------+------------------------------------
     0 | \x707f11bdca6c083edd8f4bbdda0f5e3e
  • bytea* 是什么格式?如何将\x707f...转换回myndarray?这里没有Maven,我找到了很多关于frombuffer(),python2 buffer()memoryview()的晦涩文档,但我离一个正确的结果还很远。

目前为止我最好的是:

np.frombuffer(bytearray('707f11bdca6c083edd8f4bbdda0f5e3e', 'utf-8'), dtype=np.float32)

这是完全错误的(myndarray有4个值):

[2.1627062e+23 1.6690035e+22 3.3643249e+21 5.2896255e+22 2.1769183e+23
 1.6704162e+22 2.0823326e+23 5.2948159e+22]
mklgxw1f

mklgxw1f1#

经过大量的试验和错误(重复一遍,我不知道python),我找到了一个解决方案。

ndarray=np.frombuffer(bytes.fromhex("707f11bdca6c083edd8f4bbdda0f5e3e"), np.float32)

print(ndarray)
# [-0.03552192  0.1332275  -0.04969775  0.21685734]

print(type(ndarray))
# <class 'numpy.ndarray'>

print(type(ndarray[0]))
# <class 'numpy.float32'>

现在,完整的例子:

ngine=create_engine('postgresql://postgres:mypassword@localhost/mydatabase')
myndarray=np.array([-3.55219245e-02, 1.33227497e-01, -4.96977456e-02, 2.16857344e-01], dtype=np.float32)

print(myndarray)
# [-0.03552192  0.1332275  -0.04969775  0.21685734]

myarray=[myndarray.tobytes()]
mydataframe=pd.DataFrame(myarray, columns=['Column1'])
mydataframe.to_sql('mytable', ngine)
# SELECT * FROM mytable
#
#index  Column1
#    0  \x707f11bdca6c083edd8f4bbdda0f5e3e

bytea=pd.read_sql(sql=select(mytable), con=ngine).iloc[0]['Column1']
print(type(bytea))
# <class 'str'>

print(bytea)
# \x707f11bdca6c083edd8f4bbdda0f5e3e

print(bytea[2:])
# 707f11bdca6c083edd8f4bbdda0f5e3e
# Surprise! The \x is not interpreted! 

ndarray=np.frombuffer(bytes.fromhex(bytea[2:]), np.float32)
print(ndarray)
# [-0.03552192  0.1332275  -0.04969775  0.21685734]

谢谢,@hpaulj,@PranavHosangadi,@juanpa,阿里维拉加,@ akka 特。

相关问题