text-generation-inference: a wrong tool choice causes the server to crash

Posted by 4ioopgfo 2 months ago in Other

System Info

Operating system:

PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Model used: mistralai/Mistral-7B-Instruct-v0.3
Hardware: 1 L4
Tried with the latest version of the Docker image.

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Start the server with the following command:

docker run --gpus all --shm-size 1g -p 8080:80 -e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN -v $PWD/data:/data ghcr.io/huggingface/text-generation-inference:latest --model-id mistralai/Mistral-7B-Instruct-v0.3

Then send the following request:

import requests

conversation = [
    {"role": "user", "content": "What's the weather like in Paris?"},
]

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }
]

response = requests.post(
    url="http://localhost:8080/v1/chat/completions",
    json={
        "messages": conversation,
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "temperature": 0.1,
        "tool_choice": "required",
        # "tool_prompt": "\"You will be presented with a JSON schema representing a set of tools.\nIf the user request lacks of sufficient information to make a precise tool selection: Do not invent any tool's properties, instead notify with an error message.\n\nJSON Schema:\n\"",
        "tools": tools,
        "max_tokens": 1000,
    },
)

Error:

(task, pid=12212) 2024-05-29T14:56:04.338119Z  INFO text_generation_router: router/src/main.rs:369: Connected
(task, pid=12212) 2024-05-29T14:56:04.338153Z  WARN text_generation_router: router/src/main.rs:383: Invalid hostname, defaulting to 0.0.0.0
(task, pid=12212) 2024-05-29T14:58:01.008313Z  INFO chat_completions{total_time="5.576392398s" validation_time="1.850855ms" queue_time="130.083µs" inference_time="5.574411606s" time_per_token="61.937906ms" seed="Some(14966871623831239824)"}: text_generation_router::server: router/src/server.rs:322: Success
(task, pid=12212) thread 'tokio-runtime-worker' panicked at router/src/infer.rs:407:44:
(task, pid=12212) Tool with name required not found
(task, pid=12212) note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
(task, pid=12212) 2024-05-29T14:58:07.628698Z ERROR text_generation_launcher: Webserver Crashed
(task, pid=12212) 2024-05-29T14:58:07.629433Z  INFO text_generation_launcher: Shutting down shards
(task, pid=12212) 2024-05-29T14:58:07.631861Z  INFO shard-manager: text_generation_launcher: Terminating shard rank=0
(task, pid=12212) 2024-05-29T14:58:07.631937Z  INFO shard-manager: text_generation_launcher: Waiting for shard to gracefully shutdown rank=0
(task, pid=12212) 2024-05-29T14:58:09.433647Z  INFO shard-manager: text_generation_launcher: shard terminated rank=0
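
The panic message ("Tool with name required not found") suggests that a string tool_choice is looked up as a tool name. Assuming that is the case, a client could check the value against the declared tools before sending the request. This is only a sketch of a possible workaround; the helper name unknown_tool_choice is made up for illustration and is not part of any library.

```python
def unknown_tool_choice(tools, tool_choice):
    """Return True if a string tool_choice does not name any declared tool.

    Hypothetical helper: assumes the server resolves a string tool_choice
    by function name, as the panic message above suggests.
    """
    names = {t["function"]["name"] for t in tools if t.get("type") == "function"}
    return isinstance(tool_choice, str) and tool_choice not in names


tools = [{"type": "function", "function": {"name": "get_current_weather"}}]

# "required" is not a declared tool name, so this request would be risky to send.
print(unknown_tool_choice(tools, "required"))
# A declared tool name passes the check.
print(unknown_tool_choice(tools, "get_current_weather"))
```

With the request from the reproduction above, this check would flag "required" before the call reaches the server.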

Expected behavior

I think the server should return an error message such as {'error': 'Input validation error: Tool with name required not found', 'error_type': 'validation'} rather than crash.
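
To illustrate the behavior I mean, here is a minimal Python sketch of turning a failed tool lookup into a validation response instead of an unhandled failure. It is not the router's actual code (that is written in Rust); find_tool and handle_request are hypothetical names, and the 422 status code is an assumption.

```python
def find_tool(tools, name):
    """Return the tool whose function name matches, or raise LookupError."""
    for tool in tools:
        if tool["function"]["name"] == name:
            return tool
    raise LookupError(f"Tool with name {name} not found")


def handle_request(payload):
    """Hypothetical handler: map a failed lookup to an error body, not a crash."""
    try:
        tool = find_tool(payload["tools"], payload["tool_choice"])
    except LookupError as exc:
        # 422 is an assumed status code for validation errors.
        body = {"error": f"Input validation error: {exc}", "error_type": "validation"}
        return 422, body
    return 200, {"selected_tool": tool["function"]["name"]}


payload = {
    "tools": [{"type": "function", "function": {"name": "get_current_weather"}}],
    "tool_choice": "required",
}
status, body = handle_request(payload)
print(status, body)
```

The point is only that the lookup failure is caught at the request boundary, so a single bad request yields an error body while the process keeps serving.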

9ceoxa92 (answer #1)

Hi, I can confirm this; I have run into the same issue.
