I am trying to set up a suggester for Solr. I have multiple fields that contain information. Here is an example document (field: value):
gene: EGFR
cac:约4c 3C
CCD:文件夹.dlgfX
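subject: EGFR - this change
id: 7390
Now I would like Solr to suggest the document while the user is still typing, regardless of whether they start with the gene name, the ID, or any of the other fields.
The suggester in solrconfig.xml looks like this (more or less copy/pasted from the example):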
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">suggest_muripedia</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
<!-- Suggester component -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_muripedia</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">_text_cf</str>
    <str name="weightField">subject</str>
    <str name="suggestAnalyzerFieldType">string</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
_text_cf is a field populated by copy rules from the fields above. It is defined as follows:
{
  "name":"_text_cf",
  "type":"mytext",
  "multiValued":true,
  "indexed":true,
  "stored":true},
The field type mytext looks like this:
{
  "name":"mytext",
  "class":"solr.TextField",
  "positionIncrementGap":"100",
  "multiValued":true,
  "indexAnalyzer":{
    "tokenizer":{
      "class":"solr.PatternTokenizerFactory",
      "pattern":"-"},
    "filters":[{
        "class":"solr.TrimFilterFactory"},
      {
        "class":"solr.StopFilterFactory",
        "words":"stopwords.txt",
        "ignoreCase":"true"},
      {
        "class":"solr.LowerCaseFilterFactory"}]},
  "queryAnalyzer":{
    "tokenizer":{
      "class":"solr.PatternTokenizerFactory",
      "pattern":"-"},
    "filters":[{
        "class":"solr.TrimFilterFactory"},
      {
        "class":"solr.StopFilterFactory",
        "words":"stopwords.txt",
        "ignoreCase":"true"},
      {
        "class":"solr.SynonymGraphFilterFactory",
        "expand":"true",
        "ignoreCase":"true",
        "synonyms":"synonyms.txt"},
      {
        "class":"solr.LowerCaseFilterFactory"}]}},
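The query I tried returned no results:
suggest?q=egfr
I don't know how to troubleshoot this, and I suspect I have not fully understood what happens during a suggest request.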
1 Answer
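The suggester was in fact returning results, but because suggestAnalyzerFieldType was set to string, the lookup was case-sensitive, so the query egfr did not match the stored value EGFR. Changing suggestAnalyzerFieldType to my own field type, mytext, solved the problem.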
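For reference, the changed suggester definition might look like the sketch below. Only suggestAnalyzerFieldType differs from the configuration in the question. Everything else is copied unchanged, and it assumes mytext is defined in the schema as shown there.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_muripedia</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">_text_cf</str>
    <str name="weightField">subject</str>
    <!-- Was "string", which applies no analysis and therefore matches case-sensitively.
         "mytext" lowercases both the indexed suggestions and the typed query. -->
    <str name="suggestAnalyzerFieldType">mytext</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
```

Because buildOnStartup is false, the suggester most likely has to be rebuilt after this change, for example by sending one request with suggest.build=true, before suggest?q=egfr returns the case-insensitive matches.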