Reading and writing files in Azure Data Lake Storage Gen2 from Python

falq053o  posted on 2023-02-25  in Python

According to the Microsoft documentation:
Connect to Azure Data Lake Storage Gen2 using an account key:

from azure.storage.filedatalake import DataLakeServiceClient

def initialize_storage_account(storage_account_name, storage_account_key):
    try:
        global service_client

        service_client = DataLakeServiceClient(account_url="{}://{}.dfs.core.windows.net".format(
            "https", storage_account_name), credential=storage_account_key)

    except Exception as e:
        print(e)

Upload a file to a directory:

def upload_file_to_directory():
    try:
        file_system_client = service_client.get_file_system_client(file_system="my-file-system")

        directory_client = file_system_client.get_directory_client("my-directory/filter")
        
        file_client = directory_client.create_file("my_csv_write.csv")
        local_file = open("C:\\Users\\my_csv_read.csv",'r')

        file_contents = local_file.read()

        file_client.append_data(data=file_contents, offset=0, length=len(file_contents))

        file_client.flush_data(len(file_contents))
        print("File uploaded")

    except Exception as e:
        print(e)

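I can use this function to upload a file from my local machine to Azure storage, and it works fine.
What I actually want to do, though, is read a file from Azure storage and write it back to Azure storage.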

def read_and_write_to_directory():
    try:
        file_system_client = service_client.get_file_system_client(file_system="my-file-system")

        directory_client_read = file_system_client.get_directory_client("my-directory")
        directory_client_write = file_system_client.get_directory_client("my-directory/filter")

        file_client_read = directory_client_read.get_file_client("my_csv_read.csv")
        file_path = open(file_client_read,'r')
        file_contents = file_path.read()
        

        file_client_write = directory_client_write.create_file("my_csv_write.csv")
        file_client_write.append_data(file_contents, overwrite=True)
        
    
    except Exception as e:
        print(e)

But it doesn't work. It fails with this error:

expected str, bytes or os.PathLike object, not DataLakeFileClient
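
This message comes from Python's built-in open(), which expects a filesystem path rather than an SDK client object. As a minimal sketch of that behaviour, the snippet below reproduces the same kind of error without Azure; FakeFileClient is a hypothetical stand-in for DataLakeFileClient, not part of any library:

```python
class FakeFileClient:
    """Hypothetical stand-in for an SDK client object; it is not a path."""


try:
    # open() only accepts str, bytes, os.PathLike objects (or an integer
    # file descriptor), so any other object raises TypeError.
    open(FakeFileClient(), "r")
except TypeError as exc:
    message = str(exc)
    print(message)
```

The remote file therefore has to be fetched through the client's own methods instead of through open().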

So what is the correct way to read a file from Azure Data Lake storage and write a file to Azure Data Lake storage?

h7appiyu


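This may be a bit late, but I stumbled on this question while looking into a different problem. I think you need to read the file's contents first and then write them out, so your code would look something like this: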

def read_and_write_to_directory():
    try:
        file_system_client = service_client.get_file_system_client(file_system="my-file-system")

        directory_client_read = file_system_client.get_directory_client("my-directory")
        directory_client_write = file_system_client.get_directory_client("my-directory/filter")

        file_client_read = directory_client_read.get_file_client("my_csv_read.csv")
        #file_path = open(file_client_read,'r')
        #file_contents = file_path.read()

        # Download the file from storage and read its full contents as bytes
        download = file_client_read.download_file()
        downloaded_bytes = download.readall()

        file_client_write = directory_client_write.create_file("my_csv_write.csv")
        #file_client_write.append_data(file_contents, overwrite=True)
        # append_data needs an offset; flush_data then commits the appended bytes
        file_client_write.append_data(data=downloaded_bytes, offset=0, length=len(downloaded_bytes))
        file_client_write.flush_data(len(downloaded_bytes))

    except Exception as e:
        print(e)
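
As a possible variation on the download-then-write step, the copy can be wrapped in a small helper that takes the two file clients as arguments. This is only a sketch: copy_file_contents is a made-up name, and it assumes the target client's upload_data(..., overwrite=True) call is acceptable in place of separate append and flush calls. The _Fake classes are hypothetical in-memory stand-ins so the snippet runs without an Azure account.

```python
def copy_file_contents(source_file_client, target_file_client):
    """Copy one file's bytes between two DataLakeFileClient-like objects."""
    # download_file() returns a downloader; readall() loads the whole file into memory.
    downloaded_bytes = source_file_client.download_file().readall()
    # upload_data() with overwrite=True creates (or replaces) the target file
    # and writes the data in a single call.
    target_file_client.upload_data(downloaded_bytes, overwrite=True)
    return len(downloaded_bytes)


# Hypothetical in-memory stand-ins, only so the sketch runs without Azure.
class _FakeDownloader:
    def __init__(self, data):
        self._data = data

    def readall(self):
        return self._data


class _FakeFileClient:
    def __init__(self, data=b""):
        self.data = data

    def download_file(self):
        return _FakeDownloader(self.data)

    def upload_data(self, data, overwrite=False):
        self.data = data


source = _FakeFileClient(b"id,name\n1,alice\n")
target = _FakeFileClient()
copied = copy_file_contents(source, target)
print(copied, target.data == source.data)
```

With real clients, the arguments would be something like directory_client_read.get_file_client("my_csv_read.csv") and directory_client_write.get_file_client("my_csv_write.csv"). Because readall() holds the entire file in memory, this approach is best suited to files of modest size.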
