How to store DataFrame data in Firebase Storage?

u1ehiz5o · asked 2022-11-17 · in: Other
Follow (0) | Answers (4) | Views (149)

Given a pandas DataFrame holding some data, what is the best way to store that data in Firebase?
Should I convert the DataFrame to a local file (e.g. .csv or .txt) and then upload it to Firebase Storage, or is it possible to store the DataFrame directly without converting it first? Or is there a better best practice?

Update 01/03 — So far I have come up with the following solution, which requires writing a local csv file, then reading and uploading it, and finally deleting the local file again. I doubt this is the most efficient approach, so I would like to know whether it can be done better and faster.

import os

import firebase_admin
from firebase_admin import db, storage

cred   = firebase_admin.credentials.Certificate(cert_json)
app    = firebase_admin.initialize_app(cred, config)
bucket = storage.bucket(app=app)

def upload_df(df, data_id):
    """
    Upload a DataFrame as a csv to Firebase Storage
    :return: storage_ref
    """
    # Storage location + extension
    storage_ref = data_id + ".csv"

    # Store locally, using the same name as the blob so the
    # local file carries the .csv extension too
    df.to_csv(storage_ref)

    # Upload to Firebase Storage
    blob = bucket.blob(storage_ref)
    with open(storage_ref, 'rb') as local_file:
        blob.upload_from_file(local_file)

    # Delete the local file again
    os.remove(storage_ref)

    return storage_ref

bmp9r5qi #1

With python-firebase you can post the DataFrame as a dict via `to_dict`:

from firebase import firebase

# Connect to your Realtime Database (replace the URL with your own)
fb = firebase.FirebaseApplication('https://my-app.firebaseio.com', None)

postdata = my_df.to_dict()

# Assumes any auth/headers you need are already taken care of.
result = fb.post('/my_endpoint', postdata, {'print': 'pretty'})
print(result)
# Snapshot info
# Snapshot info

You can get the data back using the snapshot info and the endpoint, and rebuild the DataFrame with `from_dict()`. You can adapt this approach to a SQL or JSON solution, both of which pandas also supports.
Alternatively, depending on where your script runs, you could treat Firebase as a database and use the db API from firebase_admin (see this).
As for whether it qualifies as best practice, that is hard to say without knowing your use case.
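The `to_dict()`/`from_dict()` round trip mentioned above can be sketched locally with plain pandas (the dict here stands in for what you would post to, and later fetch back from, the endpoint):

```python
import pandas as pd

# A small example DataFrame
df = pd.DataFrame({"ticker": ["AAA", "BBB"], "price": [1.0, 2.5]})

# What you would post to the endpoint
as_dict = df.to_dict()

# What you would rebuild from the snapshot you fetch back
restored = pd.DataFrame.from_dict(as_dict)
```

`restored.equals(df)` is True here, so nothing is lost in the round trip for simple column types; note that dtypes such as datetimes may need explicit reconstruction after a JSON round trip.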


v09wglhw #2

If you just want to shorten the code and skip the create-and-delete-file steps, you can use `upload_from_string`:

import firebase_admin
from firebase_admin import db, storage

cred   = firebase_admin.credentials.Certificate(cert_json)
app    = firebase_admin.initialize_app(cred, config)
bucket = storage.bucket(app=app)

def upload_df(df, data_id):
    """
    Upload a DataFrame as a csv to Firebase Storage
    :return: storage_ref
    """
    storage_ref = data_id + '.csv'
    blob = bucket.blob(storage_ref)
    # Serialize straight from memory; no local file needed
    blob.upload_from_string(df.to_csv(), content_type='text/csv')

    return storage_ref

https://googleapis.github.io/google-cloud-python/latest/storage/blobs.html#google.cloud.storage.blob.Blob.upload_from_string
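For completeness, the download direction works the same way without touching disk. A minimal sketch of the round trip, where `csv_text` stands in for what `blob.download_as_text()` (or `blob.download_as_bytes().decode()`) would return:

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Upload side: serialize in memory
# (in practice: blob.upload_from_string(csv_text, content_type="text/csv"))
csv_text = df.to_csv(index=False)

# Download side: parse the string back into a DataFrame
# (in practice csv_text comes from blob.download_as_text())
df_back = pd.read_csv(io.StringIO(csv_text))
```

Writing with `index=False` avoids an extra unnamed index column when reading the csv back.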


34gzjxbg #3

After a few hours of fiddling around, the solution below worked for me. You need to convert your csv data to bytes and then upload it.

import pyrebase
import pandas as pd

firebaseConfig = {
   "apiKey": "xxxxx",
   "authDomain": "xxxxx",
   "projectId": "xxxxx",
   "storageBucket": "xxxxx",
   "messagingSenderId": "xxxxx",
   "appId": "xxxxx",
   "databaseURL": "xxxxx"
}

firebase = pyrebase.initialize_app(firebaseConfig)

storage = firebase.storage()

df = pd.read_csv("/content/Future Prices.csv")

# Here is the magic: convert your csv data to bytes and then upload it
df_string = df.to_csv(index=False)
db_bytes = bytes(df_string, 'utf8')

fileName = "Future Prices.csv"

storage.child("predictions/" + fileName).put(db_bytes)

That's it. Happy coding!


68de4m5k #4

I found that even starting from very small DataFrames (under 100 KB!), it pays off to compress them before storing. I used the google-cloud and pickle libraries as shown below. The file also ends up in the usual Firebase storage bucket this way, and you gain both memory and speed, on writes as well as reads.

import pickle

import firebase_admin
from firebase_admin import credentials, storage

cred = credentials.Certificate(json_cert_file)
firebase_admin.initialize_app(cred, {'storageBucket': 'YOUR_storageBucket (without gs://)'})
# firebase_admin.storage wraps google-cloud-storage, so no separate
# `from google.cloud import storage` import is needed (it would shadow
# the firebase_admin one)
bucket = storage.bucket()

file_name = data_id + ".pkl"
blob = bucket.blob(file_name)

# write df to storage
blob.upload_from_string(pickle.dumps(df))

# read df from storage
df = pickle.loads(blob.download_as_string())
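The pickle round trip, plus the extra compression the answer alludes to, can be tried locally before wiring it to the bucket. A sketch, where gzip is my own addition on top of the pickled bytes (`packed` is what you would pass to `blob.upload_from_string`):

```python
import gzip
import pickle

import pandas as pd

df = pd.DataFrame({"x": range(1000), "y": [0.5] * 1000})

# Serialize, then compress (gzip here is an illustrative choice;
# any byte-level compressor works the same way)
raw = pickle.dumps(df)
packed = gzip.compress(raw)

# Reading back: decompress, then unpickle
restored = pickle.loads(gzip.decompress(packed))
```

Pickle preserves dtypes and index exactly, unlike a csv round trip, but only unpickle blobs you wrote yourself: `pickle.loads` executes arbitrary code from untrusted input.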
