When sending a message to the OpenAI chat API, does it add JSON special characters (e.g. "{") to the final prompt_tokens count?

Posted by s4n0splo on 2023-05-29 in Other

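Hello, I am sending this message: What is the most beautiful country?
I send it as a JSON object: {"role": "user", "content": "What is the most beautiful country?"}
I expected the API to report 7 tokens for the prompt, but it does not.
It reports 15 prompt tokens. Is that correct, or should it report a different number? Even when I send only a single dot "." as the message, it reports 9 prompt tokens.
I am using gpt-3.5-turbo.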

gpt-3

Source: https://stackoverflow.com/questions/76347639/when-sending-a-message-to-openai-chat-api-does-it-add-json-special-characters-ex

1 Answer

rslzwgfq1#

If you send {"role": "user", "content": "What is the most beautiful country?"} as the messages parameter, what reaches the OpenAI API endpoint is not just What is the most beautiful country? but the whole "role": "user", "content": "What is the most beautiful country?".
I can confirm this using tiktoken.
If you run get_tokens_long_example.py, you get the following output:
14

get_tokens_long_example.py

import tiktoken

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

print(num_tokens_from_string("'role':'user','content':'What is the most beautiful country?'", "cl100k_base"))

If you run get_tokens_short_example.py, you get the following output:
8

get_tokens_short_example.py

import tiktoken

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

print(num_tokens_from_string("'role':'user','content':'.'", "cl100k_base"))

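You said the OpenAI API reported 15 tokens used in the first example and 9 tokens in the second. You may have noticed that I got 14 and 8 tokens with tiktoken (that is, 1 token fewer in both examples). This appears to be a known tiktoken problem that should have been solved.
In any case, I did not dig into why I am still 1 token short, but I was able to show that the whole "role": "user", "content": "What is the most beautiful country?" is sent to the OpenAI API endpoint, not just What is the most beautiful country?.
For more information about tiktoken, see this answer.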
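
The 1-token gap may come from how chat messages are counted rather than from tiktoken itself. Below is a minimal sketch, assuming the per-message accounting described in the OpenAI Cookbook for gpt-3.5-turbo-0301: 4 overhead tokens per message, plus 3 tokens that prime the reply. Those constants, the function name num_tokens_from_messages as written here, and the injectable encode parameter are assumptions for illustration; they are not returned or confirmed by the API response in the question.

```python
from typing import Callable, Dict, List


def num_tokens_from_messages(
    messages: List[Dict[str, str]],
    encode: Callable[[str], list],
    tokens_per_message: int = 4,  # assumed overhead per message (gpt-3.5-turbo-0301)
    tokens_per_name: int = -1,    # assumed: the role is omitted when a name is present
    reply_priming: int = 3,       # assumed: tokens that prime the assistant's reply
) -> int:
    """Estimate prompt tokens for a list of chat messages.

    `encode` is any callable that turns a string into a list of tokens,
    for example the `encode` method of a tiktoken encoding.
    """
    total = 0
    for message in messages:
        total += tokens_per_message
        for key, value in message.items():
            # Only the values (role name, content text) are tokenized;
            # JSON punctuation such as "{" or ":" is never counted here.
            total += len(encode(value))
            if key == "name":
                total += tokens_per_name
    return total + reply_priming


if __name__ == "__main__":
    # Stand-in encoder (splits on whitespace) so the sketch runs without
    # extra packages; it does NOT produce real cl100k_base token counts.
    def whitespace_encode(text: str) -> list:
        return text.split()

    demo = [{"role": "user", "content": "What is the most beautiful country?"}]
    print(num_tokens_from_messages(demo, whitespace_encode))
```

To get real counts, pass encode=tiktoken.get_encoding("cl100k_base").encode instead of the stand-in. Under these assumptions, the message in the question would count as 4 + 1 (role) + 7 (content, the figure the asker expected) + 3 = 15, and a lone "." as 4 + 1 + 1 + 3 = 9, which would match the reported prompt_tokens without any JSON punctuation being counted.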

Answered on 2023-05-29
