Django: saving Pillow images from a PDF to Google Cloud Server

muk1a3rh posted on 2023-03-31 in Go

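I am developing a Django web application that takes in PDF files and performs some image processing on each page of the PDF. I am given a PDF and need to save each page to my Google Cloud Storage. I am using pdf2image's convert_from_path() to generate a list of Pillow images, one per page of the PDF. Now I want to save these images to Google Cloud Storage, but I can't figure out how.
I have successfully saved these Pillow images locally, but I don't know how to do this in the cloud.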

fullURL = file.pdf.url
client = storage.Client()
bucket = client.get_bucket('name-of-my-bucket')
blob = bucket.blob(file.pdf.name[:-4] + '/')
blob.upload_from_string('', content_type='application/x-www-form-urlencoded;charset=UTF-8')
pages = convert_from_path(fullURL, 400)
for i,page in enumerate(pages):
    blob = bucket.blob(file.pdf.name[:-4] + '/' + str(i) + '.jpg')
    blob.upload_from_string('', content_type='image/jpeg')
    outfile = file.pdf.name[:-4] + '/' + str(i) + '.jpg'
    page.save(outfile)
    of = open(outfile, 'rb')
    blob.upload_from_file(of)

kqhtkvqz1#

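Start by not using Blobstore at all. Google is trying to phase it out and move people to Cloud Storage. First, set up Cloud Storage:
https://cloud.google.com/appengine/docs/standard/python/googlecloudstorageclient/setting-up-cloud-storage
I use webapp2 rather than Django, but I'm sure you can adapt this. I also don't use Pillow images, so you will have to open the image you want to upload yourself. Then do something like this (assuming you are trying to POST the data):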

# Python 2, App Engine standard environment
import os
import io
import StringIO
import cloudstorage as gcs
from google.appengine.api import app_identity

Before get and post, as its own method:

def create_file(self, filename, Dacontents):

    write_retry_params = gcs.RetryParams(backoff_factor=1.1)
    gcs_file = gcs.open(filename,
                        'w',
                        content_type='image/jpeg',
                        options={'x-goog-meta-foo': 'foo',
                                'x-goog-meta-bar': 'bar'},
                        retry_params=write_retry_params)
    gcs_file.write(Dacontents)
    gcs_file.close()

In get, the HTML:

<form action="/(whatever your URL is)" method="post" enctype="multipart/form-data">
  <input type="file" name="orders"/>
  <input type="submit"/>
</form>

In post:

orders = self.request.POST.get('orders')  # this is for webapp2

bucket_name = os.environ.get('BUCKET_NAME', app_identity.get_default_gcs_bucket_name())
bucket = '/' + bucket_name
OpenOrders = orders.file.read()
if OpenOrders:
    # The legacy cloudstorage library expects names of the form /bucket/object
    filename = bucket + '/whateverYouWantToCallIt'
    self.create_file(filename, OpenOrders)
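
For readers who are on the google-cloud-storage client (the library used in the question) rather than the legacy App Engine cloudstorage library, the create_file step above might be sketched roughly as follows. This is only a sketch under assumptions: the bucket argument is assumed to be a google.cloud.storage Bucket obtained elsewhere, and the metadata parameter is a name introduced here for illustration. Note that with this client the object name does not include the leading /bucket part.

```python
def create_file(bucket, object_name, contents, metadata=None):
    """Write `contents` (bytes) to `object_name` as a JPEG object.

    `bucket` is assumed to be a google.cloud.storage Bucket, for example
    storage.Client().bucket('name-of-my-bucket'). `metadata` is an optional
    dict of custom metadata (a parameter made up for this sketch).
    """
    blob = bucket.blob(object_name)
    if metadata:
        # Custom key/value metadata, similar in spirit to the
        # x-goog-meta-* options in the legacy example.
        blob.metadata = metadata
    # Upload the raw bytes and label them as a JPEG image.
    blob.upload_from_string(contents, content_type="image/jpeg")
    return blob
```

In a request handler this would be called with the bytes read from the uploaded file, for example create_file(bucket, 'whateverYouWantToCallIt', OpenOrders).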

yjghlzjz2#

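Since you have already saved the files locally, they are available in the local directory where the web app is running.
What you can do is simply iterate through the files in that directory and upload them to Google Cloud Storage one by one.

Here is sample code:

You will need this library:
google-cloud-storage
Python code: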

#Libraries
import os
from google.cloud import storage

#Public variable declarations:
bucket_name = "[BUCKET_NAME]"
local_directory = "local/directory/of/the/files/for/uploading/"
bucket_directory = "uploaded/files/" #Where the files will be uploaded in the bucket

#Upload file from source to destination
def upload_blob(source_file_name, destination_blob_name):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)

#Iterate through all files in that directory and upload one by one using the same filename
def upload_files():
    for filename in os.listdir(local_directory):
        upload_blob(local_directory + filename, bucket_directory + filename)
    return "File uploaded!"

#Call this function in your code:
upload_files()

Note: I tested this code in a Google App Engine web app and it worked for me. Take the idea of how it works and modify it to suit your needs. I hope this helps.
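
One possible refinement of the loop above, offered only as a sketch: os.listdir also returns subdirectories, and plain string concatenation depends on both directory names ending in a slash. The version below skips anything that is not a regular file and builds the names explicitly. The helper name plan_uploads and the choice to pass the bucket in as an argument are assumptions made for this example, not part of the answer's code.

```python
import os


def plan_uploads(local_directory, bucket_directory):
    """Return (local_path, blob_name) pairs for regular files only.

    Subdirectories are skipped, and names are sorted so the upload
    order is predictable.
    """
    prefix = bucket_directory.strip("/")
    pairs = []
    for filename in sorted(os.listdir(local_directory)):
        local_path = os.path.join(local_directory, filename)
        if not os.path.isfile(local_path):
            continue
        # Object names in Cloud Storage use forward slashes on every OS.
        blob_name = prefix + "/" + filename if prefix else filename
        pairs.append((local_path, blob_name))
    return pairs


def upload_files(bucket, local_directory, bucket_directory):
    """Upload every planned file and return the object names written.

    `bucket` is assumed to be a google.cloud.storage Bucket.
    """
    uploaded = []
    for local_path, blob_name in plan_uploads(local_directory, bucket_directory):
        bucket.blob(blob_name).upload_from_filename(local_path)
        uploaded.append(blob_name)
    return uploaded
```

Keeping the name planning separate from the upload makes it easy to check which objects would be written before touching the bucket.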


lf5gs5x23#

You don't need to save the images locally; you can write them directly to the GCS bucket, as shown below:

import io
from PIL import Image
from google.cloud import storage
from pdf2image import convert_from_bytes

storage_client = storage.Client()

def convert_pil_image_to_byte_array(img):
    img_byte_array = io.BytesIO()
    img.save(img_byte_array, format='JPEG', subsampling=0, quality=100)
    img_byte_array = img_byte_array.getvalue()
    return img_byte_array

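def write_to_gcs_bucket(bucket_name, source_prefix, target_prefix):
    bucket = storage_client.get_bucket(bucket_name)
    # source_prefix is the object name of the PDF inside the bucket
    blob = bucket.get_blob(source_prefix)
    contents = blob.download_as_bytes()
    # first_page=5 starts the conversion at page 5; omit it to convert every page
    images = convert_from_bytes(contents, first_page=5)
    for i, image in enumerate(images):
        object_byte = convert_pil_image_to_byte_array(image)
        file_name = 'slide' + str(i) + '.jpg'
        blob = bucket.blob(target_prefix + file_name)
        # Without content_type the object would be stored as text/plain
        blob.upload_from_string(object_byte, content_type='image/jpeg')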
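
Applied to the naming scheme in the question (one object per page under a prefix derived from the PDF name), the same in-memory approach could look roughly like the sketch below. The function names page_blob_name and upload_pages are made up for this example; bucket is assumed to behave like a google.cloud.storage Bucket and pages like the list of Pillow images that pdf2image returns. Because Cloud Storage has a flat namespace, there should be no need to create an empty "folder" object first.

```python
import io
import os


def page_blob_name(pdf_name, index):
    """Build 'name/0.jpg'-style object names from the PDF's storage name."""
    # os.path.splitext avoids slicing off the last four characters, which
    # silently assumes the name always ends in '.pdf'.
    stem, _ext = os.path.splitext(pdf_name)
    return "{}/{}.jpg".format(stem, index)


def upload_pages(bucket, pdf_name, pages):
    """Upload each page image as a JPEG object and return the object names.

    `bucket` is expected to behave like google.cloud.storage.Bucket and
    `pages` like a list of Pillow images (anything with a compatible save()).
    """
    names = []
    for index, page in enumerate(pages):
        buffer = io.BytesIO()
        # Encode the page in memory instead of writing a temporary file.
        page.save(buffer, format="JPEG")
        name = page_blob_name(pdf_name, index)
        bucket.blob(name).upload_from_string(
            buffer.getvalue(), content_type="image/jpeg"
        )
        names.append(name)
    return names
```

In the Django view from the question this might be called as upload_pages(bucket, file.pdf.name, pages), with pages produced by convert_from_bytes from the downloaded PDF contents as in the code above.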
