langchain: ChatHuggingFace requires an HF token to access the Hugging Face endpoint when using a local TGI server

ilmyapht · posted 4 months ago

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar issue and did not find one.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace

# This part works as expected
llm = HuggingFaceEndpoint(endpoint_url="http://127.0.0.1:8080")

# This part raises huggingface_hub.errors.LocalTokenNotFoundError
chat_llm = ChatHuggingFace(llm=llm)

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File ".venv/lib/python3.10/site-packages/langchain_huggingface/chat_models/huggingface.py", line 320, in __init__
    self._resolve_model_id()
  File ".venv/lib/python3.10/site-packages/langchain_huggingface/chat_models/huggingface.py", line 458, in _resolve_model_id
    available_endpoints = list_inference_endpoints("*")
  File ".venv/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 7081, in list_inference_endpoints
    user = self.whoami(token=token)
  File ".venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File ".venv/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 1390, in whoami
    headers=self._build_hf_headers(
  File ".venv/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 8448, in _build_hf_headers
    return build_hf_headers(
  File ".venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File ".venv/lib/python3.10/site-packages/huggingface_hub/utils/_headers.py", line 124, in build_hf_headers
    token_to_send = get_token_to_send(token)
  File ".venv/lib/python3.10/site-packages/huggingface_hub/utils/_headers.py", line 158, in get_token_to_send
    raise LocalTokenNotFoundError(
huggingface_hub.errors.LocalTokenNotFoundError: Token is required (token=True), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.

Description

class ChatHuggingFace(BaseChatModel):
    """Hugging Face LLM's as ChatModels.
...
"""  # noqa: E501
    ...
    def __init__(self, **kwargs: Any):
        super().__init__(**kwargs)

        from transformers import AutoTokenizer  # type: ignore[import]

        self._resolve_model_id()  # ---> Even when providing the model_id it will enter here

        self.tokenizer = (
            AutoTokenizer.from_pretrained(self.model_id)
            if self.tokenizer is None
            else self.tokenizer
        )
    ...
    def _resolve_model_id(self) -> None:
        """Resolve the model_id from the LLM's inference_server_url"""

        from huggingface_hub import list_inference_endpoints  # type: ignore[import]

        if _is_huggingface_hub(self.llm) or (
            hasattr(self.llm, "repo_id") and self.llm.repo_id
        ):
            self.model_id = self.llm.repo_id
            return
        elif _is_huggingface_textgen_inference(self.llm):
            endpoint_url: Optional[str] = self.llm.inference_server_url
        elif _is_huggingface_pipeline(self.llm):
            self.model_id = self.llm.model_id
            return
        else: # This is the case we are in when _is_huggingface_endpoint() is True
            endpoint_url = self.llm.endpoint_url
        available_endpoints = list_inference_endpoints("*")  # ---> This line raises the error if we don't provide the hf token
        for endpoint in available_endpoints:
            if endpoint.url == endpoint_url:
                self.model_id = endpoint.repository

        if not self.model_id:
            raise ValueError(
                "Failed to resolve model_id:"
                f"Could not find model id for inference server: {endpoint_url}"
                "Make sure that your Hugging Face token has access to the endpoint."
            )

I worked around this by modifying the constructor so that it does not resolve the model_id when one is provided:

class ChatHuggingFace(BaseChatModel):
    """Hugging Face LLM's as ChatModels.
...
"""  # noqa: E501

    ...
    def __init__(self, **kwargs: Any):
        super().__init__(**kwargs)

        from transformers import AutoTokenizer  # type: ignore[import]

        self.model_id or self._resolve_model_id()  # ---> Not a good solution: if model_id is invalid, tokenizer instantiation will fail (only when no tokenizer is provided), and it also skips the other hf_hub inference cases

        self.tokenizer = (
            AutoTokenizer.from_pretrained(self.model_id)
            if self.tokenizer is None
            else self.tokenizer
        )
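
With that change, the chat model can be built against the local TGI server by passing the model_id explicitly (the repository name below is only an example, not taken from the issue):

llm = HuggingFaceEndpoint(endpoint_url="http://127.0.0.1:8080")
# model_id is given explicitly, so the patched constructor never calls list_inference_endpoints
chat_llm = ChatHuggingFace(llm=llm, model_id="meta-llama/Meta-Llama-3-8B-Instruct")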

I imagine there is a better way to fix this, for example by adding some logic that checks whether the endpoint_url is a valid IP to request, whether it is served with TGI, or simply whether it points to localhost, as sketched below:
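
A minimal sketch of that localhost check, assuming a loopback/local hostname is enough to decide that no Hugging Face Inference Endpoints lookup is needed (the helper name _is_local_url is my own illustration, not part of langchain_huggingface):

from urllib.parse import urlparse

def _is_local_url(url: str) -> bool:
    """Return True if the endpoint URL points at a local/loopback host."""
    host = (urlparse(url).hostname or "").lower()
    return host in {"localhost", "127.0.0.1", "::1", "0.0.0.0"}

# Inside ChatHuggingFace._resolve_model_id the Hub lookup could then be guarded:
#
#     if _is_local_url(endpoint_url):
#         # Local TGI server: there is no Inference Endpoint to enumerate,
#         # so require an explicit model_id instead of calling the Hub.
#         if not self.model_id:
#             raise ValueError("model_id must be provided for local inference servers.")
#         return
#     available_endpoints = list_inference_endpoints("*")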


flvlnr44 #1

I believe this will be fixed by #23821 - if @Jofthomas doesn't have time, I'll take a look!


fslejnso #2

> I believe this will be fixed by #23821 - if @Jofthomas doesn't have time, I'll take a look!

Hey @efriis, thanks for the answer! Looking at #23821, I don't think it fixes the problem: that PR improves how the huggingface token is managed inside HuggingFaceEndpoint, and as I mentioned in the description, HuggingFaceEndpoint works as expected against localhost.
I strongly believe the problem is inside ChatHuggingFace, because it should not call list_inference_endpoints("*") from huggingface_hub at line 458 when the inference endpoint is served on a local server with TGI.
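
For completeness, the token requirement the traceback points at can be satisfied as a stop-gap by authenticating before constructing ChatHuggingFace. This is only a workaround sketch that masks the symptom (it assumes a valid token is available in the HF_TOKEN environment variable); the local TGI server itself never needs the token:

import os
from huggingface_hub import login

# Stop-gap only: authenticate so that the list_inference_endpoints("*") call succeeds.
login(token=os.environ["HF_TOKEN"])

chat_llm = ChatHuggingFace(llm=llm)  # no longer raises LocalTokenNotFoundError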


kt06eoxx #3

You are right @avargasestay, my draft PR does not fix this issue. I will provide a fix in the next commit. Thanks for bringing it to my attention.
