haystack experimental: OpenAIFunctionCaller string parsing

Posted by 9cbw7uwe 3 months ago in Other

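Describe the bug

When using OpenAIFunctionCaller from haystack-experimental, special characters in a function's return value are parsed incorrectly: they come back as \uXXXX escape sequences instead of the original characters. This matters for some languages and for models (such as gemma) that like to use emoji.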

Error message

Example chat:
Format:
ChatMessage.role
ChatMessage.content

From: ChatRole.USER
Qual o preço da taça de romã? Pesquise na base de dados, por favor

From: ChatRole.ASSISTANT
[{"id": "call_0rca", "function": {"arguments": "{\"query\":\"pre\\u00e7o da ta\\u00e7a de rom\\u00e3\"}", "name": "consulta_base_de_dados"}, "type": "function"}]

From: ChatRole.FUNCTION
{"reply": "O termo n\u00e3o resultou em nenhum resultado relevante"}

The model calls the RAG pipeline, but the returned result is parsed incorrectly.

Expected behavior

From: ChatRole.ASSISTANT
[{"id": "call_0rca", "function": {"arguments": "{\"query\":\"preço da taça de romã\"}", "name": "consulta_base_de_dados"}, "type": "function"}]

From: ChatRole.FUNCTION
{"reply": "O termo não resultou em nenhum resultado relevante"}

Additional context

I think this problem is related to the json.dumps call in the OpenAIFunctionCaller class.

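function_to_call = self.available_functions[function_name]
try:
    function_response = function_to_call(**function_args)
    messages.append(
        ChatMessage.from_function(
            # json.dumps defaults to ensure_ascii=True, so non-ASCII characters are escaped as \uXXXX
            content=json.dumps(function_response),
            name=function_name,
        )
    )
# (excerpt ends here; the rest of the try statement is not shown)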
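A minimal sketch of one possible fix, assuming the escaping comes from json.dumps running with its default ensure_ascii=True. The function consulta_base_de_dados below is a hypothetical stand-in for the RAG pipeline call, not the real one, and this is not a patch against the Haystack source:

```python
import json


def consulta_base_de_dados(query: str) -> dict:
    # Hypothetical stand-in for the RAG pipeline call from the chat above.
    return {"reply": "O termo não resultou em nenhum resultado relevante"}


function_response = consulta_base_de_dados(query="preço da taça de romã")

# Default behaviour (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(function_response)

# With ensure_ascii=False the characters are written out unchanged.
readable = json.dumps(function_response, ensure_ascii=False)

print(escaped)
print(readable)
```

If this assumption holds, passing ensure_ascii=False at the json.dumps call in the excerpt above would produce the content shown under "Expected behavior".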

Example:

import json

# JSON documents whose values contain accented letters, punctuation and emoji
strings = [
    '{"arguments": "çãõóòàáéêíôû"}',
    '{"arguments": "ßÜüöÖäÄëÉ"}',
    '{"arguments": "¡¿?¡!¿?!"}',
    '{"arguments": "😊👍"}',
]
for string in strings:
    parsed = json.loads(string)
    print("Input:")
    print(parsed)  # e.g. {'arguments': 'çãõóòàáéêíôû'}
    print(parsed["arguments"])
    # Default ensure_ascii=True escapes every non-ASCII character as \uXXXX
    output = json.dumps(parsed)
    print("Output:")
    print(output)

    print()

Output:

Input:
{'arguments': 'çãõóòàáéêíôû'}
çãõóòàáéêíôû
Output:
{"arguments": "\u00e7\u00e3\u00f5\u00f3\u00f2\u00e0\u00e1\u00e9\u00ea\u00ed\u00f4\u00fb"}

Input:
{'arguments': 'ßÜüöÖäÄëÉ'}
ßÜüöÖäÄëÉ
Output:
{"arguments": "\u00df\u00dc\u00fc\u00f6\u00d6\u00e4\u00c4\u00eb\u00c9"}

Input:
{'arguments': '¡¿?¡!¿?!'}
¡¿?¡!¿?!
Output:
{"arguments": "\u00a1\u00bf?\u00a1!\u00bf?!"}

Input:
{'arguments': '😊👍'}
😊👍
Output:
{"arguments": "\ud83d\ude0a\ud83d\udc4d"}
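For context, the escaped output is still valid JSON, and a parser recovers the original characters from it (emoji are written as UTF-16 surrogate pairs). A small sketch of that round trip follows; the concern in this report seems to be that the raw escaped string, rather than the decoded text, is what ends up in the message content:

```python
import json

original = {"arguments": "😊👍 çã"}

# ASCII-only JSON text: every non-ASCII character is written as a \uXXXX escape.
escaped = json.dumps(original)

# Parsing the escaped text gives back the original characters.
restored = json.loads(escaped)

print(escaped)
print(restored == original)
```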

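To reproduce

Create a function that returns a string containing special characters and process it with OpenAIFunctionCaller, or use the snippet above to test how json.dumps handles special characters. I believe the same thing happens inside this class.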

FAQ check

[ ] Have you had a look at our new FAQ page?
System:
OS: Ubuntu 22.04 (inside a devcontainer)
GPU/CPU: RTX 4070/I9 139000KS
Haystack version (commit or version number):

haystack-ai = "2.3.1"
haystack-experimental = "0.1.1"

Document store: Qdrant
Model backend: Ollama (OpenAI-compatible API)
Models: llama3.1 8b for function calling and gemma2 for RAG summarization

Source: https://github.com/deepset-ai/haystack/issues/8170

2 answers

cu6pst1q #1, 3 months ago

This may be related to #7674.

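bqjvbblv #2, 3 months ago

I also noticed that the current ChatMessage and ChatRole classes (and therefore OpenAIFunctionCaller as well) use "function" as the role for messages that return a function call result, but apparently the "function" role is deprecated in the OpenAI spec and may not be accepted by some providers.
See: ollama/ollama#6213 (comment)
That said, I see that you are already updating this with the Tool abstraction.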
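To illustrate the difference between the two roles, here is a sketch of the two message shapes as plain dictionaries, based on my understanding of the OpenAI Chat Completions format. The values are taken from the example chat in the question, and no Haystack classes are involved:

```python
import json

tool_call_id = "call_0rca"
function_name = "consulta_base_de_dados"
content = json.dumps(
    {"reply": "O termo não resultou em nenhum resultado relevante"},
    ensure_ascii=False,
)

# Deprecated shape: the result is linked to the function by its name.
function_message = {"role": "function", "name": function_name, "content": content}

# Newer shape: the result is linked to a specific tool call by its id.
tool_message = {"role": "tool", "tool_call_id": tool_call_id, "content": content}

print(function_message)
print(tool_message)
```

A provider that only accepts the newer shape would presumably reject the first message, which matches the behaviour described in the linked Ollama comment.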
