Azure Blob - Read using Python

ztmd8pv5 posted on 2022-12-24 in Python
Follows (0) | Answers (9) | Views (161)

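Can someone tell me whether it is possible to read a csv file directly from Azure blob storage as a stream and process it using Python? I know it can be done using C#.Net (shown below), but I would like to know the equivalent library in Python to do this.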

CloudBlobClient client = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference("outfiles");
CloudBlob blob = container.GetBlobReference("Test.csv");

s5a0g9ez1#

Here is how to do it with the new version of the SDK (12.0.0):

from azure.storage.blob import BlobClient

# Replace the <...> placeholders with your own values.
blob = BlobClient(account_url="https://<account_name>.blob.core.windows.net",
                  container_name="<container_name>",
                  blob_name="<blob_name>",
                  credential="<account_key>")

# Download the blob and write its content to a local file.
with open("example.csv", "wb") as f:
    data = blob.download_blob()
    data.readinto(f)

For more details, see here
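
If the goal is to process the CSV without writing a local file, the downloaded bytes can be parsed in memory. This is a minimal sketch, assuming the blob is small enough to fit in memory and is UTF-8 encoded; `rows_from_blob` is a hypothetical helper name, and `blob_client` is any object with the `download_blob().readall()` interface of the `BlobClient` above.

```python
import csv
import io


def rows_from_blob(blob_client, encoding="utf-8"):
    """Download a blob fully into memory and parse it as CSV rows (dicts)."""
    # download_blob() returns a downloader; readall() gives the content as bytes.
    raw = blob_client.download_blob().readall()
    text = raw.decode(encoding)
    # newline="" leaves line endings untouched so the csv module can handle them.
    return list(csv.DictReader(io.StringIO(text, newline="")))
```

With the `blob` object from the answer above, `rows_from_blob(blob)` would return one dict per CSV row, keyed by the header line.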


hrirmatl2#

You can stream from the blob with Python like this:

from tempfile import NamedTemporaryFile
# Legacy SDK (azure-storage-blob 2.x); BlockBlobService was removed in version 12.
from azure.storage.blob.blockblobservice import BlockBlobService

# conf is assumed to be a dict of settings defined elsewhere.
entry_path = conf['entry_path']
container_name = conf['container_name']
blob_service = BlockBlobService(
    account_name=conf['account_name'],
    account_key=conf['account_key'])

def get_file(filename):
    # Stream the blob into a temporary file, then rewind it for reading.
    local_file = NamedTemporaryFile()
    blob_service.get_blob_to_stream(container_name, filename, stream=local_file,
                                    max_connections=2)

    local_file.seek(0)
    return local_file
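
With the version 12 SDK, a similar streaming effect can be had without a temporary file by iterating over the download in chunks. The sketch below is not part of the original answer: `iter_lines` is a hypothetical helper that turns any iterable of byte chunks into text lines, and it assumes UTF-8 content. It splits on newlines only, so CSV fields that contain quoted line breaks would need extra handling.

```python
import codecs


def iter_lines(chunks, encoding="utf-8"):
    """Yield decoded text lines from an iterable of byte chunks."""
    # An incremental decoder copes with multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder(encoding)()
    pending = ""
    for chunk in chunks:
        pending += decoder.decode(chunk)
        # Everything before the last "\n" is complete; keep the remainder.
        *complete, pending = pending.split("\n")
        for line in complete:
            yield line.rstrip("\r")
    # Flush the decoder and emit a final line that had no trailing newline.
    pending += decoder.decode(b"", final=True)
    if pending:
        yield pending.rstrip("\r")
```

Assuming a v12 `BlobClient` named `blob`, the chunks could come from `blob.download_blob().chunks()`, for example `for line in iter_lines(blob.download_blob().chunks()): ...`.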

lhcgjxsq3#

Provide your Azure storage account name here, and the access key as the account key

block_blob_service = BlockBlobService(account_name='$$$$$$', account_key='$$$$$$')

This fetches the blob and saves it as "output.jpg" in the current location

block_blob_service.get_blob_to_path('your-container-name', 'your-blob', 'output.jpg')

This gets the text/item from the blob

blob_item = block_blob_service.get_blob_to_bytes('your-container-name', 'blob-name')

blob_item.content

iszxjhcz4#

I recommend using smart_open.

import os

from azure.storage.blob import BlobServiceClient
from smart_open import open

connect_str = os.environ['AZURE_STORAGE_CONNECTION_STRING']
transport_params = {
    'client': BlobServiceClient.from_connection_string(connect_str),
}

# stream from Azure Blob Storage
with open('azure://my_container/my_file.txt', transport_params=transport_params) as fin:
    for line in fin:
        print(line)

# stream content *into* Azure Blob Storage (write mode):
with open('azure://my_container/my_file.txt', 'wb', transport_params=transport_params) as fout:
    fout.write(b'hello world')
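
Because the file object opened in text mode yields one line at a time, it can be handed straight to the standard `csv` module so rows are processed as they arrive. A minimal sketch, assuming the CSV has a header row and a numeric column; `sum_column` and the column name in the usage note are hypothetical.

```python
import csv


def sum_column(lines, column):
    """Sum a numeric column from an iterable of CSV text lines, one row at a time."""
    total = 0.0
    # DictReader pulls lines lazily, so the whole file is never held in memory.
    for row in csv.DictReader(lines):
        total += float(row[column])
    return total
```

Inside the read block above, this could be called as `sum_column(fin, "price")`, provided the blob is a CSV with a `price` column.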

h43kikqp5#

Here is a simple way to read a CSV from a Blob using Pandas:

import os

import pandas as pd
from azure.storage.blob import BlobServiceClient

service_client = BlobServiceClient.from_connection_string(os.environ['AZURE_STORAGE_CONNECTION_STRING'])
client = service_client.get_container_client("your_container")
bc = client.get_blob_client(blob="your_folder/yourfile.csv")
data = bc.download_blob()
# Save the blob locally, then load the local copy with pandas.
with open("file.csv", "wb") as f:
    data.readinto(f)
df = pd.read_csv("file.csv")

332nm8kg6#

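Since I couldn't find what I needed in this thread, I want to follow up on @SebastianDziadzio's answer to retrieve the data without downloading it as a local file, which is what I was trying to find for myself.
Replace the with statement with the following: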

from io import BytesIO
import pandas as pd

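# blob_client_instance is the BlobClient created as in the earlier answer.
with BytesIO() as input_blob:
    # readinto() replaces the deprecated download_to_stream().
    blob_client_instance.download_blob().readinto(input_blob)
    input_blob.seek(0)
    df = pd.read_csv(input_blob, compression='infer', index_col=0)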

9rbhqvlz7#

Read from Azure Blob

I wanted to go from Azure Blob Storage to openpyxl, with xlsx rather than csv

import os
from io import BytesIO

import openpyxl
from azure.storage.blob import BlobClient

conn_str = os.environ.get('BLOB_CONN_STR')
container_name = os.environ.get('CONTAINER_NAME')
blob = BlobClient.from_connection_string(conn_str, container_name=container_name,
                                         blob_name="YOUR BLOB PATH HERE FROM AZURE BLOB")
data = blob.download_blob()
# Load the workbook from the downloaded bytes, with no local file involved.
workbook_obj = openpyxl.load_workbook(filename=BytesIO(data.readall()))

Write to Azure Blob

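I struggled a lot with this and I don't want anyone else to. If you are using openpyxl and want to write directly from an Azure Function to blob storage, follow the steps below and you will get what you are after.
Thanks, and ask if you need any help.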

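from openpyxl.writer.excel import save_virtual_workbook

# wb is the openpyxl Workbook to upload; conn_str and container_name are defined as in the read example.
blob = BlobClient.from_connection_string(conn_str=conn_str, container_name=container_name, blob_name=r'YOUR_PATH/test1.xlsx')
blob.upload_blob(save_virtual_workbook(wb))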
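
`save_virtual_workbook` is deprecated in openpyxl and may be unavailable in recent releases. A possible alternative, sketched here under that assumption, is to save the workbook into a `BytesIO` buffer and upload the resulting bytes. `workbook_to_bytes` and `upload_workbook` are hypothetical helper names; `wb` is any object with openpyxl's `Workbook.save(stream)` method and `blob_client` any object with `BlobClient.upload_blob`.

```python
from io import BytesIO


def workbook_to_bytes(wb):
    """Serialize a workbook to bytes by saving it into an in-memory buffer."""
    buffer = BytesIO()
    wb.save(buffer)  # openpyxl's Workbook.save accepts a file-like object
    return buffer.getvalue()


def upload_workbook(blob_client, wb, overwrite=True):
    """Upload the serialized workbook, replacing an existing blob by default."""
    blob_client.upload_blob(workbook_to_bytes(wb), overwrite=overwrite)
```

With the `blob` and `wb` objects from the answer above, the call would be `upload_workbook(blob, wb)`.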

bnl4lu3b8#

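I know this is an old post, but in case someone wants to do the same: I was able to access the blobs with the code below.
Note: you need to set AZURE_STORAGE_CONNECTION_STRING. In the Azure Portal, go to your storage account -> Settings -> Access keys, and you will find the connection string there.
For Windows: setx AZURE_STORAGE_CONNECTION_STRING "<your connection string>"
For Linux: export AZURE_STORAGE_CONNECTION_STRING="<your connection string>"
For macOS: export AZURE_STORAGE_CONNECTION_STRING="<your connection string>"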

import os
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__

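connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')
# Debug only: this prints the secret connection string.
print(connect_str)
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
# get_container_client expects the container name, not the storage account name.
container_client = blob_service_client.get_container_client("Your Container Name Here")
try: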
    print("\nListing blobs...")

    # List the blobs in the container
    blob_list = container_client.list_blobs()
    for blob in blob_list:
        print("\t" + blob.name)

except Exception as ex:
    print('Exception:')
    print(ex)
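
If the environment variable from the note above is not set, `os.getenv` returns `None` and the failure only surfaces later inside the SDK. A small guard can make the problem obvious earlier. This is a sketch only; `get_connection_string` is a hypothetical helper, and the `env` parameter exists purely so the function can be exercised without touching the real environment.

```python
import os


def get_connection_string(env=None, name="AZURE_STORAGE_CONNECTION_STRING"):
    """Return the connection string from the environment or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; set it before running the script")
    return value
```

The result could then be passed to `BlobServiceClient.from_connection_string(...)` in place of the raw `os.getenv` call.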

hiz5n14c9#

Yes, it is certainly possible. Check out the Azure Storage SDK for Python

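# Legacy SDK (azure-storage-blob 2.x and earlier); BlockBlobService is not available in version 12.
from azure.storage.blob import BlockBlobService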

block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')

block_blob_service.get_blob_to_path('mycontainer', 'myblockblob', 'out-sunset.png')

You can read the complete SDK documentation here: http://azure-storage.readthedocs.io.
