Uploading multiple files to S3 using the chunksize of a Python DataFrame

6qfn3psc · posted 2021-07-13 · Java

I am trying to upload multiple files to S3 based on the chunksize defined in read_sql_query. The problem is that the same file keeps getting appended to (after the chunksize value is reached), column headers included, although the correct number of files is created. Any suggestions?

import boto3
import pandas as pd
from io import StringIO
from sqlalchemy import create_engine

count = 0
s3 = boto3.client('s3')

# Uploading to S3
bucketname = "dummy-bucket"
itemname = "test/dummy/sample_"
out_buf = StringIO()
engine = create_engine('example')

with engine.connect() as conn, conn.begin():
    sql = """
        select
        dummy from dummy_table
    """

    for chunk in pd.read_sql_query(sql, conn, chunksize=10000):
        file_path = itemname + 'part.%s.csv' % (count)
        chunk.to_csv(out_buf, index=False)  # every chunk is written into the same buffer
        s3.put_object(Bucket=bucketname, Body=out_buf.getvalue(), Key=file_path)
        count += 1
        print(f'File {file_path} uploaded.')

conn.close()
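
The behaviour described points to the shared StringIO buffer: it is never cleared, so each put_object call uploads everything written so far, headers included. A minimal sketch of one possible fix, creating a fresh buffer for every chunk (the bucket name, key prefix, and the 'example' connection string are just the placeholders from the question):

import boto3
import pandas as pd
from io import StringIO
from sqlalchemy import create_engine

s3 = boto3.client('s3')
bucketname = "dummy-bucket"
itemname = "test/dummy/sample_"
engine = create_engine('example')  # placeholder connection string from the question

sql = """
    select
    dummy from dummy_table
"""

with engine.connect() as conn, conn.begin():
    for count, chunk in enumerate(pd.read_sql_query(sql, conn, chunksize=10000)):
        out_buf = StringIO()  # new buffer per chunk, so each file holds only this chunk
        chunk.to_csv(out_buf, index=False)
        file_path = itemname + 'part.%s.csv' % count
        s3.put_object(Bucket=bucketname, Body=out_buf.getvalue(), Key=file_path)
        print(f'File {file_path} uploaded.')

Reusing a single buffer by calling out_buf.seek(0) and out_buf.truncate(0) before each to_csv call would work equally well.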
