Index OpenXML structured documents with Elasticsearch

Posted by 332nm8kg on 2021-06-10 in ElasticSearch


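We have a set of structured documents. The structure is heavily inspired by the OpenXML data model. In short, a document consists of an ordered list of paragraphs; each paragraph has an id and an ordered list of runs, and each run has a text content and some metadata.
For example, the sample document below contains two paragraphs, ["Lorem ipsum", "dolor sit amet"].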

{
    title: "De finibus",
    paragraphs: [
        {
            id: 1,
            runs: [
                {text: "Lorem i", metadata: {} },
                {text: "psu", metadata: {bold: true} },
                {text: "m", metadata: {} },
            ]
        },
        {
            id: 2,
            runs: [
                {text: "dolor sit amet", metadata: {} },
            ]
        },
    ]
}
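To make the data model concrete, here is a minimal Python sketch of it. The class names `Run`, `Paragraph`, and `Document` are hypothetical (they come from neither OpenXML nor Elasticsearch), and the sketch assumes the title belongs to the document while each paragraph carries an id. Joining the run texts of each paragraph in order yields the two paragraph strings mentioned above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Run:
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Paragraph:
    id: int
    runs: List[Run]

    def text(self) -> str:
        # A paragraph's visible text is its run texts joined in order.
        return "".join(run.text for run in self.runs)


@dataclass
class Document:
    title: str
    paragraphs: List[Paragraph]


# The sample document, expressed with the hypothetical classes above.
doc = Document(
    title="De finibus",
    paragraphs=[
        Paragraph(id=1, runs=[Run("Lorem i"), Run("psu", {"bold": True}), Run("m")]),
        Paragraph(id=2, runs=[Run("dolor sit amet")]),
    ],
)

paragraph_texts = [p.text() for p in doc.paragraphs]
print(paragraph_texts)  # ['Lorem ipsum', 'dolor sit amet']
```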

Naturally, we would like to index the documents with Elasticsearch so that it can answer the following queries:
Query: dolor sit
Expected answer: in the document with title="De finibus", in the paragraph with id=2, from the 1st character of the 1st run to the 9th character of the 1st run

Query: ipsum
Expected answer: in the document with title="De finibus", in the paragraph with id=1, from the 7th character of the 1st run to the 1st character of the 3rd run

Query: ipsum dolor
Expected answer: in the document with title="De finibus", from the 7th character of the 1st run of the paragraph with id=1 to the 5th character of the 1st run of the paragraph with id=2

I am familiar with nested fields in Elasticsearch. They might satisfy the first query. But how should we map the documents so that consecutive runs and paragraphs are joined together and the latter two queries can be answered flexibly?
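To pin down what the expected answers require, here is a rough Python sketch of one possible direction: flatten all runs into a single text and keep a side table of run start offsets, so that a match in the flattened text can be translated back to (paragraph id, run number, character number). This is only an illustration under assumptions, not an Elasticsearch feature: the helpers `flatten`, `locate`, and `find` are hypothetical, paragraphs are assumed to be separated by a single space, and `str.find` stands in for whatever match offsets the search engine would report.

```python
from bisect import bisect_right

# Same sample document as above, as plain dicts (metadata omitted for brevity).
doc = {
    "title": "De finibus",
    "paragraphs": [
        {"id": 1, "runs": [{"text": "Lorem i"}, {"text": "psu"}, {"text": "m"}]},
        {"id": 2, "runs": [{"text": "dolor sit amet"}]},
    ],
}


def flatten(doc, sep=" "):
    """Return (full_text, starts, coords).

    starts[i] is the offset in full_text where the i-th non-empty run begins;
    coords[i] is (paragraph id, 1-based run number) for that run.
    """
    parts, starts, coords = [], [], []
    pos = 0
    for p_index, paragraph in enumerate(doc["paragraphs"]):
        if p_index > 0:
            # Assumption: paragraphs are joined with a separator so that
            # a phrase such as "ipsum dolor" can match across them.
            parts.append(sep)
            pos += len(sep)
        for run_number, run in enumerate(paragraph["runs"], start=1):
            if not run["text"]:
                continue  # empty runs cannot contain a match boundary
            starts.append(pos)
            coords.append((paragraph["id"], run_number))
            parts.append(run["text"])
            pos += len(run["text"])
    return "".join(parts), starts, coords


def locate(offset, starts, coords):
    """Map a 0-based offset in the flattened text to 1-based run coordinates."""
    i = bisect_right(starts, offset) - 1  # last run starting at or before offset
    paragraph_id, run_number = coords[i]
    return paragraph_id, run_number, offset - starts[i] + 1


def find(query, doc):
    """Return ((paragraph, run, char) of first char, same for last char) or None."""
    full_text, starts, coords = flatten(doc)
    begin = full_text.find(query)  # stand-in for offsets from the search engine
    if begin < 0:
        return None
    end = begin + len(query) - 1
    return locate(begin, starts, coords), locate(end, starts, coords)


for query in ("dolor sit", "ipsum", "ipsum dolor"):
    print(query, "->", find(query, doc))
```

On the sample document this reproduces the three expected answers: `find("ipsum", doc)` gives `((1, 1, 7), (1, 3, 1))`, that is, from the 7th character of the 1st run to the 1st character of the 3rd run of paragraph 1. The open question remains how to obtain such offsets from Elasticsearch itself rather than from a client-side string search.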

elasticsearch elasticsearch-mapping openxml elasticsearch-highlight

Source: https://stackoverflow.com/questions/64697745/index-openxml-structured-documents-with-elasticsearch
