ElasticSearch: Is it possible to query fields using a regular expression?

Posted by jum4pzuy on 2022-11-02 in ElasticSearch

I have indexed my data into ElasticSearch using the following index settings:

KNN_INDEX = {
    "settings": {
        "index.knn": True,
        "index.knn.space_type": "cosinesimil",
        "index.mapping.total_fields.limit": 10000,
        "analysis": {
          "analyzer": {
            "default": {
              "type": "standard",
              "stopwords": "_english_"
            }
          }
        }
    },
    "mappings": {
        "dynamic_templates": [
            {
                "sentence_vector_template": {
                    "match": "sent_vec*",
                    "mapping": {
                        "type": "knn_vector",
                        "dimension": 384,
                        "store": True
                    }
                }
            },
            {
                "sentence_template": {
                    "match": "sentence*",
                    "mapping": {
                        "type": "text",
                        "store": True
                    }
                }
            }
        ],
        "properties": {
            "metadata": {
                "type": "object"
            }
        }
    }
}

Here are a couple of example documents that I have indexed in ElasticSearch:

{
    # DOC 1
    "sentence_0": "Machine learning for aquatic plastic litter detection, classification and quantification (APLASTIC-Q)Large quantities of mismanaged plastic waste are polluting and threatening the health of the blue planet.",
    "sentence_1": "As such, vast amounts of this plastic waste found in the oceans originates from land.",
    "sentence_2": "It finds its way to the open ocean through rivers, waterways and estuarine systems."
},
{
    # DOC 2
    "sentence_0": "What predicts persistent early conduct problems?",
    "sentence_1": "Evidence from the Growing Up in Scotland cohortBackground There is a strong case for early identification of factors predicting life-course-persistent conduct disorder.",
    "sentence_2": "The authors aimed to identify factors associated with repeated parental reports of preschool conduct problems.",
    "sentence_3": "Method Nested caseecontrol study of Scottish children who had behavioural data reported by parents at 3, 4 and 5 years.",
    "sentence_4": "Results 79 children had abnormal conduct scores at all three time points ('persistent conduct problems') and 434 at one or two points ('inconsistent conduct problems')."
}

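The number of sentences can differ from one indexed document to the next. When querying, I want to search across all sentences in all documents. I can search a specific "sentence number" across all documents with the following query: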

query_body = {
    "query": {
        "match": {
            "sentence_0": "persistent"
        }
    }
}
result = client.search(index=INDEX_NAME, body=query_body)
print(result)

But what I am looking for is something like the following:

query_body = {
    "query": {
        "match": {
            "sentence_*": "persistent"
        }
    }
}
result = client.search(index=INDEX_NAME, body=query_body)
print(result)

The query above does not work. Is it possible to perform such a query? Thanks.

slmsl1lt #1

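Use query_string. Its fields parameter accepts wildcard patterns in field names (such as sentence*), though not full regular expressions: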

{
  "query": {
    "query_string": {
      "fields": ["sentence*"],
      "query": "persistent"
    }
  }
}
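
For completeness, here is a minimal Python sketch of how the query above could be built and passed to the client from the question. The helper names build_query_string_body and build_multi_match_body are hypothetical, introduced only for illustration; client and INDEX_NAME are assumed to be the ones defined in the question. The second helper uses multi_match, which also accepts wildcard patterns in its fields list and does not parse the search text as query syntax, so it may be the safer choice when the text comes straight from a user.

```python
from typing import Any, Dict


def build_query_string_body(term: str, field_pattern: str = "sentence*") -> Dict[str, Any]:
    """Build a query_string body that searches every field matching field_pattern.

    Hypothetical helper: the name and defaults are assumptions for this sketch.
    """
    return {
        "query": {
            "query_string": {
                "fields": [field_pattern],  # wildcard pattern, not a regex
                "query": term,
            }
        }
    }


def build_multi_match_body(term: str, field_pattern: str = "sentence*") -> Dict[str, Any]:
    """Alternative body using multi_match, which treats term as plain text."""
    return {
        "query": {
            "multi_match": {
                "query": term,
                "fields": [field_pattern],
            }
        }
    }


query_body = build_query_string_body("persistent")
alt_query_body = build_multi_match_body("persistent")

# With a live cluster, the body is passed exactly as in the question
# (client and INDEX_NAME are assumed to exist already):
# result = client.search(index=INDEX_NAME, body=query_body)
# for hit in result["hits"]["hits"]:
#     print(hit["_id"], hit["_score"])
```

Both bodies target the same set of fields; the only difference is how the search text is interpreted.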
