Java: extract JSON from a long text (GPT answer) according to a JSON schema?

Posted by nnt7mjpx on 2023-06-28 in Java


I have some text that contains one JSON result, like this (this line may change):

{
  "foo":{ "bar" : ["1"] }
}

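When I get the text, I need to extract the JSON from it and make sure the JSON exactly matches the format I told GPT to return by giving an example in the prompt.
My current solution is to match the JSON with a greedy regular expression, then parse it and check it against a JSON schema.
Is there a better way?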

Java

Source: https://stackoverflow.com/questions/76553851/extract-json-from-a-long-text-gpt-answer-according-to-a-json-schema

1 Answer


ddrv8njm #1

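Your code must always be prepared for the case where ChatGPT does not follow your request. With proper prompting, however, this rarely happens.
You should consider giving up the mix of text and JSON output. Have ChatGPT respond with JSON only, and ask it to put any text into a string field inside the JSON.
To get the model to produce JSON, you should use one of the current models (gpt-4-0613 or gpt-3.5-turbo-0613) through the API. These models pay more attention to the system message, so you can ask the model to respond only in JSON. See the following example: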

curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
   "model": "gpt-3.5-turbo-0613",
   "messages": [
      {"role": "system", "content": "You are a friendly assistant. Your answers are JSON only."},
      {"role": "assistant", "content": "{\"message\": \"Understood. I will output my answers in JSON format.\" }" },
      {"role": "user", "content": "List three attractions in London." }
   ]
}'

ChatGPT will reply with nicely formatted JSON:

{
  "attractions": [
    {
      "name": "The British Museum",
      "description": "A world-famous museum containing a vast collection of art and artifacts from around the globe."
    },
    {
      "name": "The Tower of London",
      "description": "A historic castle that has served various purposes over the centuries, including a royal palace, prison, and treasury."
    },
    {
      "name": "The London Eye",
      "description": "A gigantic ferris wheel offering panoramic views of the city skyline."
    }
  ]
}
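
Even with a JSON-only prompt, the reply may occasionally carry extra text around the object, so a defensive extraction step can still be useful. The following is a minimal sketch, not part of the original answer, of one alternative to a greedy regular expression: scan for the first balanced `{...}` block while ignoring braces inside string literals. The class and method names (`JsonExtractor`, `extractFirstJsonObject`) are hypothetical, and the result is only a candidate: it still has to be parsed and checked against the JSON schema with a library of your choice.

```java
import java.util.Optional;

public class JsonExtractor {

    /**
     * Returns the first balanced {...} block found in the text, if any.
     * Braces that appear inside JSON string literals are ignored.
     * The result is a candidate only; it is not guaranteed to be valid JSON.
     */
    public static Optional<String> extractFirstJsonObject(String text) {
        int start = text.indexOf('{');
        while (start >= 0) {
            int end = findMatchingBrace(text, start);
            if (end >= 0) {
                return Optional.of(text.substring(start, end + 1));
            }
            // This '{' was never closed; try the next one.
            start = text.indexOf('{', start + 1);
        }
        return Optional.empty();
    }

    /** Returns the index of the '}' that closes the '{' at start, or -1. */
    private static int findMatchingBrace(String text, int start) {
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;          // character after a backslash
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;         // end of string literal
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String reply = "Sure, here is the result:\n"
                + "{ \"foo\": { \"bar\": [\"1\"] } }\n"
                + "Let me know if you need anything else.";
        System.out.println(extractFirstJsonObject(reply).orElse("no JSON object found"));
    }
}
```

Unlike a greedy pattern, this does not swallow a stray `}` that appears later in the surrounding prose, and it returns an empty `Optional` instead of a broken fragment when no balanced object exists.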

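However, an even better approach is to use ChatGPT's newly introduced function calling feature. It lets you define functions and their parameters, and ChatGPT will then produce JSON with the structure required to call those functions. Note that even with this approach ChatGPT may deviate from your request, but this is very rare.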

curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
  "model": "gpt-3.5-turbo-0613",
  "messages": [ 
    { "role": "system", "content": "You are a friendly assistant and you will always call one of the provided functions." },
    { "role": "user", "content": "List three attractions in London." }
  ],
  "functions": [{
      "name":"presentAttractions",
      "description": "Presents the attractions to the user.",
      "parameters": {
        "type": "object",
        "properties": {
          "message": {
            "type": "string",
            "description": "A message to display to the user."
          },
          "attractions": {
            "type": "array",
            "description": "A list of attractions.",
            "items": {
              "type": "string"
            }
          }
        },
        "required": [ "message","attractions" ]
      }
    }]
}'

This results in the following response:

"message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "presentAttractions",
          "arguments": "{\n  \"message\": \"Here are three attractions in London:\",\n  \"attractions\": [\"Big Ben\", \"Buckingham Palace\", \"Tower Bridge\"]\n}"
        }
      },

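This way you get JSON that is well prepared for further processing. Of course, you can also specify several different functions and let ChatGPT decide which one to call. You can restrict the allowed functions with the function_call attribute.
For a full description of the API, see the OpenAI API documentation. To learn more about function calling, read the OpenAI blog and the advanced examples in the OpenAI Cookbook.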
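
Since the question is about Java, the function calling request with a restricted function_call can be sketched with the JDK's built-in `java.net.http` client (Java 11+). This is a rough illustration under assumptions, not the answer's own code: the class and helper names (`FunctionCallRequest`, `buildBody`, `buildRequest`, `escape`) are hypothetical, the body is assembled by string concatenation only to avoid a JSON library, and the key is sent as a Bearer token rather than with the basic-auth form the curl examples use.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FunctionCallRequest {

    static final String ENDPOINT = "https://api.openai.com/v1/chat/completions";

    /** Minimal escaping for quotes and backslashes; control characters are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a request body that forces the model to call the named function. */
    public static String buildBody(String functionName, String userPrompt) {
        return "{"
            + "\"model\": \"gpt-3.5-turbo-0613\","
            + "\"messages\": [{\"role\": \"user\", \"content\": \"" + escape(userPrompt) + "\"}],"
            + "\"functions\": [{"
            +   "\"name\": \"presentAttractions\","
            +   "\"description\": \"Presents the attractions to the user.\","
            +   "\"parameters\": {"
            +     "\"type\": \"object\","
            +     "\"properties\": {"
            +       "\"message\": {\"type\": \"string\"},"
            +       "\"attractions\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}}"
            +     "},"
            +     "\"required\": [\"message\", \"attractions\"]"
            +   "}"
            + "}],"
            // Restrict the model to one specific function instead of letting it choose.
            + "\"function_call\": {\"name\": \"" + escape(functionName) + "\"}"
            + "}";
    }

    /** Wraps the body in a POST request with JSON content type and Bearer auth. */
    public static HttpRequest buildRequest(String apiKey, String body) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + apiKey)
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) throws Exception {
        String apiKey = System.getenv("OPENAI_API_KEY");
        if (apiKey == null) {
            System.err.println("Set OPENAI_API_KEY first.");
            return;
        }
        String body = buildBody("presentAttractions", "List three attractions in London.");
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(apiKey, body), HttpResponse.BodyHandlers.ofString());
        // The function_call.arguments field in the response is itself a JSON string;
        // it still needs to be parsed and validated against your schema.
        System.out.println(response.body());
    }
}
```

In real code a JSON library would be a safer way to build the body and to parse the `arguments` string before schema validation.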

Answered 2023-06-28
