Showing the keywords found in each search result document in Elasticsearch

elcex8rz  posted on 2023-03-22  in  ElasticSearch

I have an index with the following mapping:

"result": {
          "properties": {
            "duration": {
              "type": "float"
            },
            "endTime": {
              "type": "float"
            },
            "results": {
              "properties": {
                "offsets": {
                  "type": "long"
                },
                "output": {
                  "type": "text",
                  "fields": {
                    "keyword": {
                      "type": "keyword",
                      "ignore_above": 256
                    }
                  }
                }
              }
            },
            "startTime": {
              "type": "float"
            }
          }
        },
        "timestamp": {
          "type": "date"
        }

I search the data with the following request body:

{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "timestamp": {
              "gte": "now-5d"
            }
          }
        }
      ],
      "filter": [
        {
          "terms": {
            "result.results.output": [
              "keyword 1",
              "keyword 2"
            ]
          }
        }
      ]
    }
  }
}

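But I want each result to carry an additional field, computed at search time, that lists which of the searched keywords were found in that document.
I tried the highlight option, but it returns the keywords embedded in fragments of my result text, which is about two lines long, whereas I only want the list of keywords that occur in that result.
Edit: I would like results like the following (assuming I search for keyword-1 and keyword-2):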

{
  "hits": [
    {
      "_index": "indx",
      "_id": "V3bd34YBoGQu2_2fF-La",
      "_score": 23.308632,
      "_source": {
        "result": {
          "results": {
            "output": "Text for test keyword-6 and keyword-2",
            "offsets": [
              3,
              4,
              6,
              7
            ]
          },
          "startTime": 1678793770.9358897,
          "endTime": 1678793772.0446372,
          "duration": 1.1087474822998047
        },
        "timestamp": "2023-03-14T11:24:25.942390",
        "found_keywords": [
          "keyword-2"
        ]
      }
    },
    {
      "_index": "indx",
      "_id": "V3bN34YBoGQu2_2fJ-GA",
      "_score": 7.1683946,
      "_source": {
        "result": {
          "results": {
            "output": "Text for test keyword-1 and keyword-3",
            "offsets": [
              0,
              6,
              9
            ]
          },
          "startTime": 1678792726.6787088,
          "endTime": 1678792727.4770997,
          "duration": 0.7983908653259277
        },
        "timestamp": "2023-03-14T11:07:01.381388",
        "found_keywords": [
          "keyword-1"
        ]
      }
    },
    {
      "_index": "indx",
      "_id": "I3bZ34YBoGQu2_2f6-I7",
      "_score": 6.0239806,
      "_source": {
        "result": {
          "results": {
            "output": "Text for test keyword-3",
            "offsets": [
              0,
              5,
              9,
              12,
              14
            ]
          },
          "startTime": 1678793563.2363102,
          "endTime": 1678793564.0036342,
          "duration": 0.7673239707946777
        },
        "timestamp": "2023-03-14T11:20:57.909010",
        "found_keywords": []
      }
    },
    {
      "_index": "indx",
      "_id": "Jnba34YBoGQu2_2fFuKa",
      "_score": 5.947863,
      "_source": {
        "result": {
          "results": {
            "output": "Text for test keyword-2",
            "offsets": [
              0,
              5
            ]
          },
          "startTime": 1678793574.3921273,
          "endTime": 1678793575.1250415,
          "duration": 0.7329142093658447
        },
        "timestamp": "2023-03-14T11:21:09.015369",
        "found_keywords": [
          "keyword-2"
        ]
      }
    },
    {
      "_index": "indx",
      "_id": "VHbN34YBoGQu2_2fC-EM",
      "_score": 5.7249584,
      "_source": {
        "result": {
          "results": {
            "output": "Text for test keyword-1",
            "offsets": [
              0,
              5,
              11,
              12
            ]
          },
          "startTime": 1678792716.0380695,
          "endTime": 1678792719.9789963,
          "duration": 3.9409267902374268
        },
        "timestamp": "2023-03-14T11:06:54.097004",
        "found_keywords": [
          "keyword-1"
        ]
      }
    },
    {
      "_index": "indx",
      "_id": "pnbR34YBoGQu2_2f7-GV",
      "_score": 5.6651397,
      "_source": {
        "result": {
          "results": {
            "output": "Text for test keyword-1 and keyword-2",
            "offsets": [
              4
            ]
          },
          "startTime": 1678793040.0872865,
          "endTime": 1678793040.8126702,
          "duration": 0.7253837585449219
        },
        "timestamp": "2023-03-14T11:12:14.743225",
        "found_keywords": [
          "keyword-1",
          "keyword-2"
        ]
      }
    }
  ]
}

I do not want the keywords highlighted in the search results. I need something that works like scripted_fields, evaluated at search time.

7z5jn7bk #1

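If I understand correctly, you want to highlight the search terms in the results. You are running a terms query against the analyzed text field rather than its keyword sub-field; a terms query matches exact, unanalyzed values, so the keyword sub-field would be the right target for it. If you filter that way, you must pass the complete field value, not just "keyword 1". Below is an example in which the highlight is applied when the search term is "keyword 1".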

POST idx_index/_doc
{
  "result": {
    "results": {
      "output": "this is keyword 1"
    }
  }
}

POST idx_index/_search
{
  "query": {
    "match": {
      "result.results.output": "keyword 1"
    }
  },
  "highlight": {
    "fields": {
      "result.results.output": {}
    }
  }
}
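
If highlighted fragments are not what you want, one possible alternative (a different technique from the highlight example above) is Elasticsearch's named queries: each clause is given a `_name`, and every hit then reports the names of the clauses it matched in a `matched_queries` array. The following is only a minimal sketch in Python that builds such a request body and reads the names back from a hit. The helper names `build_named_query` and `found_keywords` and the `require_match` parameter are hypothetical, the field name comes from the mapping in the question, and `match_phrase` is an assumption chosen so that a keyword such as "keyword-1", which the default analyzer splits into several tokens, has to match as a whole phrase.

```python
from __future__ import annotations

from typing import Any

# Field name taken from the mapping in the question.
FIELD = "result.results.output"


def build_named_query(
    keywords: list[str],
    since: str = "now-5d",
    require_match: bool = True,
) -> dict[str, Any]:
    """Build a search body with one named match_phrase clause per keyword.

    Hypothetical helper: each clause is named after its keyword so that
    Elasticsearch can report it back in the hit's `matched_queries`.
    """
    should = [
        {"match_phrase": {FIELD: {"query": kw, "_name": kw}}}
        for kw in keywords
    ]
    return {
        "query": {
            "bool": {
                # Same time window as the original request.
                "filter": [{"range": {"timestamp": {"gte": since}}}],
                "should": should,
                # 1 keeps only documents containing at least one keyword;
                # 0 keeps every document in the time window.
                "minimum_should_match": 1 if require_match else 0,
            }
        }
    }


def found_keywords(hit: dict[str, Any]) -> list[str]:
    """Return the sorted names of the named clauses that matched this hit."""
    return sorted(hit.get("matched_queries", []))


if __name__ == "__main__":
    body = build_named_query(["keyword-1", "keyword-2"])
    print(body["query"]["bool"]["should"])

    # Hand-written stand-in for one hit of a search response.
    sample_hit = {"_id": "example", "matched_queries": ["keyword-2", "keyword-1"]}
    print(found_keywords(sample_hit))
```

The body returned by `build_named_query` can be sent to `idx_index/_search` with whatever client you already use. The list is not part of `_source`; it arrives next to it on each hit, so a `found_keywords` field like the one in the question would have to be attached on the client side.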
