llama_index [问题]:complete、chat、stream_complete和stream_chat之间的不一致性？

7xzttuei 于 5个月前发布在其他

关注(0)|答案(6)|浏览(42)

问题验证

在文档和Discord中寻找答案。

问题

使用OpenAI,可以完成和聊天返回功能。但是stream_chat和stream_complete没有这个功能。
使用Ollama,可以完成、聊天和stream_complete返回诸如total_duration、prompt_eval_count等信息。但是，stream_chat没有这个功能。
OpenAI没有stream_chat: ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content="Ahoy matey! The name's Captain Rainbowbeard, the most colorful pirate on the seven seas! What can I do for ye today?", additional_kwargs={}), raw={'id': 'chatcmpl-8u54QByCw9wg2ZnG3S9zUJO40Y0T8', 'choices': [Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Ahoy matey! The name's Captain Rainbowbeard, the most colorful pirate on the seven seas! What can I do for ye today?", role='assistant', function_call=None, tool_calls=None))], 'created': 1708376550, 'model': 'gpt-3.5-turbo-0125', 'object': 'chat.completion', 'system_fingerprint': 'fp_6dd124df95', 'usage': CompletionUsage(completion_tokens=30, prompt_tokens=23, total_tokens=53)}, delta=None, additional_kwargs={})
OpenAI stream_chat: ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content="Ahoy matey! The name's Captain Rainbowbeard! Aye, I be a pirate with a love for all things colorful and bright. Me beard be as vibrant as a rainbow, and me ship be the most colorful vessel on the seven seas! What can I do for ye today, me hearty?", additional_kwargs={}), raw={'id': 'chatcmpl-8u559v4ifr04tuzbTJvNqlFdef6fe', 'choices': [Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], 'created': 1708376595, 'model': 'gpt-3.5-turbo-0125', 'object': 'chat.completion.chunk', 'system_fingerprint': 'fp_69829325d0'}, delta='', additional_kwargs={})
Ollama没有stream_chat: ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content=" My name is Captain Parrot Beak.

I'm an eccentric and jolly pirate who loves to live life on the seas, singing sea shanties, telling tall tales, and swabbing my wooden leg. You could say I'm a bit of a buccaneer with flair! Join me in adventures filled with gold, laughter, and a touch of the paranormal. Just remember, we're here to have fun. After all, life is better with pirates in it! Arrrr, matey!"}, 'done': True, 'total_duration': 20255928348, 'load_duration': 2862670561, 'prompt_eval_count': 31, 'prompt_eval_duration': 1548238000, 'eval_count': 117, 'eval_duration': 15844499000}, delta=None, additional_kwargs={'model': 'neural-chat', 'created_at': '2024-02-19T20:57:22.330826Z', 'done': True, 'total_duration': 20255928348, 'load_duration': 2862670561, 'prompt_eval_count': 31, 'prompt_eval_duration': 1548238000, 'eval_count': 117, 'eval_duration': 15844499000})
Ollama Stream chat: ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content=" My name is Captain Parrot Beard. I'm known for my love of rum, parrots and walking the plank. But don't let my light-hearted appearance fool you; I'm a fierce pirate leader with great strategic skills. Arrrr!", additional_kwargs={}), raw={'model': 'neural-chat', 'created_at': '2024-02-19T15:30:20.822078Z', 'message': {'role': 'assistant', 'content': '!'}, 'done': False}, delta='!', additional_kwargs={'model': 'neural-chat', 'created_at': '2024-02-19T15:30:20.822078Z', 'done': False})

llama_index

来源：https://github.com/run-llama/llama_index/issues/10992

6条答案

按热度按时间

gmxoilav1#

🤖*

Hey @chigkim,很高兴看到你重新回到代码世界和海盗的世界！ 🦜
你在 complete 、 chat 、 stream_complete 和 stream_chat 方法在 OpenAI 和 Ollama 之间的返回值不一致是由于这些方法在各自的库中以不同的方式实现。
在 OpenAI 中，complete 和 chat 方法返回使用信息，因为它们被设计成与 OpenAI API 交互并返回 API 的响应，其中包括生成的文本和其他关于 API 请求的信息。然而，stream_chat 和 stream_complete 方法不返回使用信息，因为它们被设计成从 API 返回一系列响应，而这些响应中的使用信息并不包含在每个单独的响应中。
在 Ollama 中，complete 、 chat 和 stream_complete 方法返回诸如 total_duration 、 prompt_eval_count 等的附加信息，因为它们是以处理 API 响应并提取这些附加信息的方式实现的。然而，stream_chat 不返回这种附加信息，因为它被设计成从 API 返回一系列响应，而这些响应中的附加信息并不包含在每个单独的响应中。
这不是 LlamaIndex 代码库中的不一致性，而是 OpenAI 和 Ollama 库中这些方法的设计和实现之间的差异。如果你需要 Ollama 中 complete 、 chat 和 stream_complete 方法为 stream_chat 返回的附加信息，你可能需要修改 Ollama 库中的 stream_chat 方法以提取并返回这些信息。
要了解这些方法如何在 LlamaIndex 中实现，你可以参考 LlamaIndex repository 中的源代码。具体来说，你可以查看 llama_index/llms/cohere/base.py 文件中的 complete 类，该类实现了 chat 、 stream_complete 、 stream_chat 等方法。
我希望这有助于澄清这些方法在 OpenAI 和 Ollama 之间返回值之间的差异。如果你有任何进一步的问题，请随时提问。

参考资料

关于 Dosu

这个回复是为了帮助你节省时间并且有用。它不是精确的解决方案,而是一个你自己研究的起点。
帮助我变得更有用吧！如果这个有帮助请给我一个👍，如果不相关就给一个👎。
如果你想继续对话，请用 @dosu-bot 作为你的回复开头。

赞(0）回复(0）举报 5个月前

kse8i1jr2#

Just commenting on @dosu-bot response which is partially correct.
OpenAI library doesn't seem to return token usage when streaming.
However Ollama returns total_duration, prompt_eval_count, etc. when using complete and chat api with streaming. However, llamaindex returns the info when using stream_complete, but not stream_chat for some reason.

赞(0）回复(0）举报 5个月前

c3frrgcw3#

🤖*

您在OpenAI和Ollama库之间观察到的不一致性可能是由于这些库的不同设计和实现所致。每个库都针对不同的用例进行了设计，因此它们可能提供不同的方法并返回不同类型的信息。
要修改Ollama库中的stream_chat方法以返回类似于LlamaIndex中的stream_complete方法的额外信息，如total_duration和prompt_eval_count,您可以将这些字段添加到由stream_chat方法生成的ChatResponse对象中。然而，这需要Ollama API服务器在其响应中返回这些信息。如果Ollama API服务器不返回这些信息，您需要修改服务器以实现这一点。
以下是修改stream_chat方法的示例：
在这个修改中，由stream_chat方法生成的ChatResponse对象包含两个新字段：total_duration和prompt_eval_count。这些字段使用从Ollama API服务器响应的每一行创建的chunk字典中的相应值填充。如果服务器不返回这些值，chunk.get("total_duration")和chunk.get("prompt_eval_count")将返回None。
请注意，此修改假定Ollama API服务器在其响应中返回total_duration和prompt_eval_count。如果没有，您需要修改服务器以实现这一点。
您可以在LlamaIndex代码库中的stream_chat方法中找到当前的结构。
希望这对您有所帮助！如果您有任何进一步的问题，请随时提问。

参考资料

关于Dosu

此响应旨在帮助您节省时间并提高效率。它不是精确解决方案,而是您自己研究的起点。
帮助我变得更有用！如果您觉得有帮助请留下一个👍，如果无关请留下一个👎。
如果您想继续对话，请以@dosu-bot开头回复。

赞(0）回复(0）举报 5个月前

rmbxnbpk4#

如果在llama_index/llms/ollama/base.py文件的第150行和151行注解掉stream_chat函数，它就可以正常工作。我能够获取到所有的统计信息。我想知道为什么这是故意被忽略的？如果没有危害，能否修复它？

赞(0）回复(0）举报 5个月前

6qftjkof5#

你好，@chigkim

我正在帮助LlamaIndex团队管理他们的积压任务，并将此问题标记为过时。从我的理解来看，你提出的问题突显了OpenAI和Llama API中不同方法返回的响应之间的不一致性。Dosubot提供了关于这些差异的见解，并建议修改Llama库中的stream_chat方法以返回更多信息。你确认在stream_chat方法中注解掉某些行后，可以获取响应中的所有统计信息，并质疑为什么故意选择忽略这一点。

请问这个问题是否仍然与LlamaIndex仓库的最新版本相关？如果是，请通过评论此问题告知LlamaIndex团队。否则，请随时关闭此问题，或者它将在7天后自动关闭。谢谢！

赞(0）回复(0）举报 5个月前

xqkwcwgp6#

据我所知，这仍然是个问题。

赞(0）回复(0）举报 5个月前

我来回答

llama_index [问题]:complete、chat、stream_complete和stream_chat之间的不一致性？

问题验证

问题

6条答案

参考资料

关于 Dosu

参考资料

关于Dosu

相关问题

热门标签

最新问答