quivr [Bug]: An error occurred while processing the file: 'File' object has no attribute 'file'

lfapxunr  posted 2 months ago in Other

What happened?
Installed a Quivr instance on a fresh VM. Logged in and uploaded the attached document to a brain.
woocommerce-api-v3.md
Notification:

Processing File woocommerceapiv3.md
An error occurred while processing the file: 'File' object has no attribute 'file'

Relevant log output

worker        | [2024-06-29 07:34:21,772: INFO/MainProcess] Task process_file_and_notify[e7a3c704-c721-4c1f-9509-a36c2c71f9ad] received
backend-core  | INFO:     192.168.1.15:65181 - "POST /upload?brain_id=07121c8f-4d1c-42b6-b0c1-c78d0a6a0eec&chat_id=e742ea72-83ef-424a-b1d4-6ee49ffdedcf HTTP/1.1" 200 OK
worker        | [2024-06-29 07:34:21,795: INFO/ForkPoolWorker-22] HTTP Request: GET http://host.docker.internal:54321/storage/v1/object/quivr/07121c8f-4d1c-42b6-b0c1-c78d0a6a0eec/woocommerceapiv3.md "HTTP/1.1 200 OK"
backend-core  | INFO:     192.168.1.15:65181 - "DELETE /chat/e742ea72-83ef-424a-b1d4-6ee49ffdedcf HTTP/1.1" 200 OK
worker        | [2024-06-29 07:34:21,812: INFO/ForkPoolWorker-22] HTTP Request: GET http://host.docker.internal:54321/rest/v1/vectors?select=id&file_sha1=eq.None "HTTP/1.1 200 OK"
worker        | [2024-06-29 07:34:21,822: INFO/ForkPoolWorker-22] HTTP Request: GET http://host.docker.internal:54321/rest/v1/vectors?select=id&file_sha1=eq.None "HTTP/1.1 200 OK"
worker        | [2024-06-29 07:34:21,827: INFO/ForkPoolWorker-22] HTTP Request: GET http://host.docker.internal:54321/rest/v1/brains_vectors?select=brain_id%2C%20vector_id&brain_id=eq.07121c8f-4d1c-42b6-b0c1-c78d0a6a0eec&file_sha1=eq.None "HTTP/1.1 200 OK"
worker        | [2024-06-29 07:34:21,832: WARNING/ForkPoolWorker-22] Error processing file: 'File' object has no attribute 'file'
worker        | [ERROR] quivr_api.celery_worker [celery_worker.py:91]: 'File' object has no attribute 'file'
worker        | Traceback (most recent call last):
worker        |   File "/code/api/quivr_api/celery_worker.py", line 69, in process_file_and_notify
worker        |     filter_file(
worker        |   File "/code/api/quivr_api/packages/files/processors.py", line 103, in filter_file
worker        |     raise e
worker        |   File "/code/api/quivr_api/packages/files/processors.py", line 86, in filter_file
worker        |     result = file_processors[file.file_extension](
worker        |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker        |   File "/code/api/quivr_api/packages/files/parsers/markdown.py", line 10, in process_markdown
worker        |     return process_file(
worker        |            ^^^^^^^^^^^^^
worker        |   File "/code/api/quivr_api/packages/files/parsers/common.py", line 36, in process_file
worker        |     doc = file.file
worker        |           ^^^^^^^^^
worker        |   File "/usr/local/lib/python3.11/site-packages/pydantic/main.py", line 811, in __getattr__
worker        |     raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')
worker        | AttributeError: 'File' object has no attribute 'file'
worker        | [2024-06-29 07:34:21,841: INFO/ForkPoolWorker-22] HTTP Request: PATCH http://host.docker.internal:54321/rest/v1/notifications?id=eq.1d11cf53-d076-4ad7-ac3e-d410e1525954 "HTTP/1.1 200 OK"
worker        | [2024-06-29 07:34:21,843: INFO/ForkPoolWorker-22] Task process_file_and_notify[e7a3c704-c721-4c1f-9509-a36c2c71f9ad] succeeded in 0.06813362699995196s: None
backend-core  | INFO:     192.168.1.15:65181 - "GET /knowledge?brain_id=07121c8f-4d1c-42b6-b0c1-c78d0a6a0eec HTTP/1.1" 200 OK

Twitter / LinkedIn details

No response

0aydgbwb  #2

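To resolve the error "'File' object has no attribute 'file'" when processing an uploaded file in quivr, make sure the code does not try to access a nonexistent attribute `file`. The File object should have correctly defined attributes and methods.
Below are the corrected parts of the code that handle the File object: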

  1. Make sure the File class is correctly defined:
class File:
    def __init__(self, file_name, tmp_file_path, bytes_content, file_size, file_extension):
        self.file_name = file_name
        self.tmp_file_path = tmp_file_path
        self.bytes_content = bytes_content
        self.file_size = file_size
        self.file_extension = file_extension

    def file_already_exists(self):
        # Implementation
        pass

    def file_already_exists_in_brain(self, brain_id):
        # Implementation
        pass

    def file_is_empty(self):
        return self.file_size == 0

    def link_file_to_brain(self, brain_id):
        # Implementation
        pass
  2. Update the process_file_and_notify function:
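import os
from tempfile import NamedTemporaryFile

# The remaining names (celery, get_supabase_client, BrainVectorService, ...)
# are assumed to be imported from the Quivr codebase.
@celery.task(name="process_file_and_notify")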
def process_file_and_notify(
    file_name: str,
    file_original_name: str,
    brain_id,
    notification_id=None,
    integration=None,
    delete_file=False,
):
    try:
        supabase_client = get_supabase_client()
        tmp_name = file_name.replace("/", "_")
        base_file_name = os.path.basename(file_name)
        _, file_extension = os.path.splitext(base_file_name)

        with NamedTemporaryFile(
            suffix="_" + tmp_name,
        ) as tmp_file:
            res = supabase_client.storage.from_("quivr").download(file_name)
            tmp_file.write(res)
            tmp_file.flush()
            file_instance = File(
                file_name=base_file_name,
                tmp_file_path=tmp_file.name,
                bytes_content=res,
                file_size=len(res),
                file_extension=file_extension,
            )
            brain_vector_service = BrainVectorService(brain_id)
            if delete_file:
                brain_vector_service.delete_file_from_brain(
                    file_original_name, only_vectors=True
                )

            filter_file(
                file=file_instance,
                brain_id=brain_id,
                original_file_name=file_original_name,
            )

            if notification_id:
                notification_service.update_notification_by_id(
                    notification_id,
                    NotificationUpdatableProperties(
                        status=NotificationsStatusEnum.SUCCESS,
                        description="Your file has been properly uploaded!",
                    ),
                )
            brain_service.update_brain_last_update_time(brain_id)

            return True

    except TimeoutError:
        logger.error("TimeoutError")

    except Exception as e:
        logger.exception(e)
        notification_service.update_notification_by_id(
            notification_id,
            NotificationUpdatableProperties(
                status=NotificationsStatusEnum.ERROR,
                description=f"An error occurred while processing the file: {e}",
            ),
        )
  3. Make sure the filter_file function is correctly implemented:
def filter_file(
    file,
    brain_id,
    original_file_name=None,
):
    file_exists = file.file_already_exists()
    file_exists_in_brain = file.file_already_exists_in_brain(brain_id)
    using_file_name = file.file_name

    brain = brain_service.get_brain_by_id(brain_id)
    if brain is None:
        raise Exception("It seems like you're uploading knowledge to an unknown brain.")

    if file_exists_in_brain:
        return create_response(
            f"🤔 {using_file_name} already exists in brain {brain.name}.",
            "warning",
        )
    elif file.file_is_empty():
        return create_response(
            f"❌ {original_file_name} is empty.",
            "error",
        )
    elif file_exists:
        file.link_file_to_brain(brain_id)
        return create_response(
            f"✅ {using_file_name} has been uploaded to brain {brain.name}.",
            "success",
        )

    if file.file_extension in file_processors:
        try:
            result = file_processors[file.file_extension](
                file=file,
                brain_id=brain_id,
                original_file_name=original_file_name,
            )
            if result is None or result == 0:
                return create_response(
                    f"⚠️ {using_file_name} has been uploaded to brain. There might have been an error while reading it, please make sure the file is not illformed or just an image",
                    "warning",
                )
            return create_response(
                f"✅ {using_file_name} has been uploaded to brain {brain.name} in {result} chunks",
                "success",
            )
        except Exception as e:
            print(f"Error processing file: {e}")
            raise e

    return create_response(
        f"❌ {using_file_name} is not supported.",
        "error",
    )

By making sure the File class and its methods are correctly defined and used, you should be able to resolve the error. Sources: 1 (https://github.com/QuivrHQ/quivr/blob/main/backend/api/quivr_api/modules/sync/utils/upload.py), 2 (https://github.com/QuivrHQ/quivr/blob/main/backend/api/quivr_api/modules/upload/controller/upload_routes.py), 3 (https://github.com/QuivrHQ/quivr/blob/main/backend/api/quivr_api/modules/upload/service/upload_file.py), 4 (https://github.com/QuivrHQ/quivr/blob/main/backend/api/quivr_api/modules/assistant/ito/summary.py).
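A quick way to check which attributes a File-like object actually exposes is a small diagnostic like the one below. It is only a sketch: the attribute names are copied from the File class in this answer, `missing_attributes` is a hypothetical helper, and `SimpleNamespace` stands in for a real File instance.

```python
from types import SimpleNamespace

# Attribute names taken from the File class sketched in this answer.
EXPECTED_ATTRIBUTES = (
    "file_name",
    "tmp_file_path",
    "bytes_content",
    "file_size",
    "file_extension",
)


def missing_attributes(obj, names=EXPECTED_ATTRIBUTES):
    """Return the names from `names` that `obj` does not expose."""
    return [name for name in names if not hasattr(obj, name)]


if __name__ == "__main__":
    # Stand-in object; a real check would pass the File instance instead.
    candidate = SimpleNamespace(
        file_name="woocommerceapiv3.md",
        tmp_file_path="/tmp/example",
        bytes_content=b"# title",
        file_size=7,
        file_extension=".md",
    )
    print(missing_attributes(candidate))
    # Also ask about `file`, the attribute the traceback complains about.
    print(missing_attributes(candidate, EXPECTED_ATTRIBUTES + ("file",)))
```

If `file` shows up as missing while everything else is present, the object itself is probably fine and the code that reads `file.file` is the place to look.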

To continue talking to Dosu, mention @dosu.

vmjh9lq9  #3

Update: every file type I upload to Quivr produces the same error.
Running on the latest Ubuntu 22.04 with docker-compose.yml. Can't figure out what I am missing here...
For example, with a PDF:

worker        | [2024-06-29 09:37:48,388: INFO/ForkPoolWorker-22] HTTP Request: GET http://host.docker.internal:54321/rest/v1/brains_vectors?select=brain_id%2C%20vector_id&brain_id=eq.07121c8f-4d1c-42b6-b0c1-c78d0a6a0eec&file_sha1=eq.None "HTTP/1.1 200 OK"
worker        | [2024-06-29 09:37:48,393: WARNING/ForkPoolWorker-22] Error processing file: 'File' object has no attribute 'file'
worker        | [ERROR] quivr_api.celery_worker [celery_worker.py:91]: 'File' object has no attribute 'file'
worker        | Traceback (most recent call last):
worker        |   File "/code/api/quivr_api/celery_worker.py", line 69, in process_file_and_notify
worker        |     filter_file(
worker        |   File "/code/api/quivr_api/packages/files/processors.py", line 103, in filter_file
worker        |     raise e
worker        |   File "/code/api/quivr_api/packages/files/processors.py", line 86, in filter_file
worker        |     result = file_processors[file.file_extension](
worker        |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker        |   File "/code/api/quivr_api/packages/files/parsers/pdf.py", line 14, in process_pdf
worker        |     return process_file(
worker        |            ^^^^^^^^^^^^^
worker        |   File "/code/api/quivr_api/packages/files/parsers/common.py", line 36, in process_file
worker        |     doc = file.file
worker        |           ^^^^^^^^^
worker        |   File "/usr/local/lib/python3.11/site-packages/pydantic/main.py", line 811, in __getattr__
worker        |     raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')
worker        | AttributeError: 'File' object has no attribute 'file'
worker        | [2024-06-29 09:37:48,402: INFO/ForkPoolWorker-22] HTTP Request: PATCH http://host.docker.internal:54321/rest/v1/notifications?id=eq.85f784e8-43b4-4d3b-8fd1-1fa8e4eaa150 "HTTP/1.1 200 OK"
worker        | [2024-06-29 09:37:48,405: INFO/ForkPoolWorker-22] Task process_file_and_notify[86e3f8bf-e69f-44b5-9f83-75795cef444f] succeeded in 0.09197982099999535s: None
backend-core  | INFO:     192.168.1.15:63821 - "GET /knowledge?brain_id=07121c8f-4d1c-42b6-b0c1-c78d0a6a0eec HTTP/1.1" 200 OK

jtw3ybtb  #4

I figured it out:
Temporarily removing LLAMA_CLOUD_API_KEY solved the problem.
quivr/backend/api/quivr_api/packages/files/parsers/common.py
Lines 35 to 37 in 2e4b801

    if os.getenv("LLAMA_CLOUD_API_KEY"):
        doc = file.file
        document_ext = os.path.splitext(doc.filename)[1]
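The interaction between the environment variable and the missing attribute can be reproduced in isolation. The sketch below is not Quivr code: `File` is a hypothetical stand-in with no `file` field, `extension_as_quoted` mimics the shape of the quoted lines, the fallback to `file_extension` when the key is unset is an assumption, and `extension_with_fallback` is one possible defensive variant rather than the project's actual fix.

```python
import os
from dataclasses import dataclass


@dataclass
class File:
    """Hypothetical stand-in for Quivr's File model; it has no `file` field."""

    file_name: str
    file_extension: str


def extension_as_quoted(file: File) -> str:
    # Same shape as the quoted lines: this branch runs only when the key is set.
    if os.getenv("LLAMA_CLOUD_API_KEY"):
        doc = file.file  # raises AttributeError on this stand-in
        return os.path.splitext(doc.filename)[1]
    # Assumption: without the key, the extension comes from the File object.
    return file.file_extension


def extension_with_fallback(file: File) -> str:
    # Hypothetical defensive variant: use `file.file` only if it exists.
    doc = getattr(file, "file", None)
    if os.getenv("LLAMA_CLOUD_API_KEY") and doc is not None:
        return os.path.splitext(doc.filename)[1]
    return file.file_extension


if __name__ == "__main__":
    upload = File(file_name="woocommerceapiv3.md", file_extension=".md")

    os.environ["LLAMA_CLOUD_API_KEY"] = "dummy-value"  # placeholder, not a real key
    try:
        extension_as_quoted(upload)
    except AttributeError as exc:
        print(exc)
    print(extension_with_fallback(upload))

    # With the key removed, the failing branch is never entered.
    os.environ.pop("LLAMA_CLOUD_API_KEY", None)
    print(extension_as_quoted(upload))
```

This matches the observation above: the error is independent of file type because the failing branch is chosen by the environment variable, not by the parser.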
