Usage and code examples of org.apache.lucene.analysis.Analyzer.normalize()

x33g5p2x · reposted 2022-01-15, category: Other

This article collects code examples of the Java method org.apache.lucene.analysis.Analyzer.normalize() and shows how it is used in practice. The examples were extracted from selected projects on GitHub, Stack Overflow, Maven, and similar platforms, and should serve as useful references. Details of Analyzer.normalize():
Package: org.apache.lucene.analysis
Class: Analyzer
Method: normalize

About Analyzer.normalize

Normalizes a string down to the representation it would have in the index.

This is typically used by query parsers to generate a query on a given term without tokenizing or stemming, which would be undesirable when the string to analyze is a partial word (e.g. in a wildcard or fuzzy query).

The method first calls #initReaderForNormalization(String, Reader) to apply any necessary character-level normalization, and then #normalize(String, TokenStream) to apply the normalizing token filters.
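As a sketch of how the two hooks fit together, the following standalone example builds an analyzer whose normalization chain lowercases, then calls the public `normalize(String, String)` entry point on a wildcard fragment. It is written against a recent Lucene API (7.x/8.x style; package locations such as `LowerCaseFilter`'s vary between versions), and the field name `"body"` is only illustrative:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.util.BytesRef;

public class NormalizeExample {
    public static void main(String[] args) throws Exception {
        Analyzer analyzer = new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName) {
                // Full analysis chain, used at index time.
                Tokenizer tok = new WhitespaceTokenizer();
                return new TokenStreamComponents(tok, new LowerCaseFilter(tok));
            }

            @Override
            protected TokenStream normalize(String fieldName, TokenStream in) {
                // Normalization chain: only character-level filters,
                // no tokenizing or stemming.
                return new LowerCaseFilter(in);
            }
        };
        // The wildcard fragment stays a single term because normalize()
        // never tokenizes; only the LowerCaseFilter is applied.
        BytesRef term = analyzer.normalize("body", "QuIcK*");
        System.out.println(term.utf8ToString()); // "quick*"
        analyzer.close();
    }
}
```

Note that the base `Analyzer.normalize(String, TokenStream)` is an identity by default, so analyzers that do not override it return the input text unchanged; that is why the query-parser code below can call `normalize` unconditionally on whatever analyzer a field is mapped to.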

Code examples

Example source: org.apache.lucene/lucene-core

@Override
protected final TokenStream normalize(String fieldName, TokenStream in) {
 return wrapTokenStreamForNormalization(fieldName, getWrappedAnalyzer(fieldName).normalize(fieldName, in));
}

Example source: org.apache.lucene/lucene-core

// Excerpt from Analyzer.normalize(String, String): the character-filtered text
// is wrapped in a single-token stream and run through the normalization chain.
try (TokenStream ts = normalize(fieldName,
    new StringTokenStream(attributeFactory, filteredText, text.length()))) {
  final TermToBytesRefAttribute termAtt = ts.addAttribute(TermToBytesRefAttribute.class);
  // ... consume the stream and return the normalized term bytes ...
}

Example source: org.elasticsearch/elasticsearch

@Override
public Query newFuzzyQuery(String text, int fuzziness) {
  List<Query> disjuncts = new ArrayList<>();
  for (Map.Entry<String,Float> entry : weights.entrySet()) {
    final String fieldName = entry.getKey();
    final MappedFieldType ft = context.fieldMapper(fieldName);
    if (ft == null) {
      disjuncts.add(newUnmappedFieldQuery(fieldName));
      continue;
    }
    try {
      final BytesRef term = getAnalyzer(ft).normalize(fieldName, text);
      Query query = ft.fuzzyQuery(term, Fuzziness.fromEdits(fuzziness), settings.fuzzyPrefixLength,
        settings.fuzzyMaxExpansions, settings.fuzzyTranspositions);
      disjuncts.add(wrapWithBoost(query, entry.getValue()));
    } catch (RuntimeException e) {
      disjuncts.add(rethrowUnlessLenient(e));
    }
  }
  if (disjuncts.size() == 1) {
    return disjuncts.get(0);
  }
  return new DisjunctionMaxQuery(disjuncts, 1.0f);
}

Example source: org.elasticsearch/elasticsearch

private Query getRangeQuerySingle(String field, String part1, String part2,
                 boolean startInclusive, boolean endInclusive, QueryShardContext context) {
  currentFieldType = context.fieldMapper(field);
  if (currentFieldType == null) {
    return newUnmappedFieldQuery(field);
  }
  try {
    Analyzer normalizer = forceAnalyzer == null ? queryBuilder.context.getSearchAnalyzer(currentFieldType) : forceAnalyzer;
    BytesRef part1Binary = part1 == null ? null : normalizer.normalize(field, part1);
    BytesRef part2Binary = part2 == null ? null : normalizer.normalize(field, part2);
    Query rangeQuery = currentFieldType.rangeQuery(part1Binary, part2Binary,
      startInclusive, endInclusive, null, timeZone, null, context);
    return rangeQuery;
  } catch (RuntimeException e) {
    if (lenient) {
      return newLenientFieldQuery(field, e);
    }
    throw e;
  }
}

Example source: org.elasticsearch/elasticsearch

@Override
public Query newPrefixQuery(String text) {
  List<Query> disjuncts = new ArrayList<>();
  for (Map.Entry<String,Float> entry : weights.entrySet()) {
    final String fieldName = entry.getKey();
    final MappedFieldType ft = context.fieldMapper(fieldName);
    if (ft == null) {
      disjuncts.add(newUnmappedFieldQuery(fieldName));
      continue;
    }
    try {
      if (settings.analyzeWildcard()) {
        Query analyzedQuery = newPossiblyAnalyzedQuery(fieldName, text, getAnalyzer(ft));
        if (analyzedQuery != null) {
          disjuncts.add(wrapWithBoost(analyzedQuery, entry.getValue()));
        }
      } else {
        BytesRef term = getAnalyzer(ft).normalize(fieldName, text);
        Query query = ft.prefixQuery(term.utf8ToString(), null, context);
        disjuncts.add(wrapWithBoost(query, entry.getValue()));
      }
    } catch (RuntimeException e) {
      disjuncts.add(rethrowUnlessLenient(e));
    }
  }
  if (disjuncts.size() == 1) {
    return disjuncts.get(0);
  }
  return new DisjunctionMaxQuery(disjuncts, 1.0f);
}

Example source: org.elasticsearch/elasticsearch

// Excerpt: while consuming an analyzed token stream, each token is
// individually normalized and collected for the current position.
currentPos = new ArrayList<>();
final BytesRef term = analyzer.normalize(field, termAtt.toString());
currentPos.add(term);
hasMoreTokens = source.incrementToken();

Example source: org.elasticsearch/elasticsearch

private Query getFuzzyQuerySingle(String field, String termStr, float minSimilarity) throws ParseException {
  currentFieldType = context.fieldMapper(field);
  if (currentFieldType == null) {
    return newUnmappedFieldQuery(field);
  }
  try {
    Analyzer normalizer = forceAnalyzer == null ? queryBuilder.context.getSearchAnalyzer(currentFieldType) : forceAnalyzer;
    BytesRef term = termStr == null ? null : normalizer.normalize(field, termStr);
    return currentFieldType.fuzzyQuery(term, Fuzziness.fromEdits((int) minSimilarity),
      getFuzzyPrefixLength(), fuzzyMaxExpansions, fuzzyTranspositions);
  } catch (RuntimeException e) {
    if (lenient) {
      return newLenientFieldQuery(field, e);
    }
    throw e;
  }
}

Example source: com.strapdata.elasticsearch/elasticsearch

/**
 * Dispatches to Lucene's SimpleQueryParser's newFuzzyQuery, optionally
 * lowercasing the term first
 */
@Override
public Query newFuzzyQuery(String text, int fuzziness) {
  BooleanQuery.Builder bq = new BooleanQuery.Builder();
  bq.setDisableCoord(true);
  for (Map.Entry<String,Float> entry : weights.entrySet()) {
    final String fieldName = entry.getKey();
    try {
      final BytesRef term = getAnalyzer().normalize(fieldName, text);
      Query query = new FuzzyQuery(new Term(fieldName, term), fuzziness);
      bq.add(wrapWithBoost(query, entry.getValue()), BooleanClause.Occur.SHOULD);
    } catch (RuntimeException e) {
      rethrowUnlessLenient(e);
    }
  }
  return super.simplify(bq.build());
}

Example source: com.strapdata.elasticsearch/elasticsearch

/**
 * Dispatches to Lucene's SimpleQueryParser's newPrefixQuery, optionally
 * lowercasing the term first or trying to analyze terms
 */
@Override
public Query newPrefixQuery(String text) {
  BooleanQuery.Builder bq = new BooleanQuery.Builder();
  bq.setDisableCoord(true);
  for (Map.Entry<String,Float> entry : weights.entrySet()) {
    final String fieldName = entry.getKey();
    try {
      if (settings.analyzeWildcard()) {
        Query analyzedQuery = newPossiblyAnalyzedQuery(fieldName, text);
        if (analyzedQuery != null) {
          bq.add(wrapWithBoost(analyzedQuery, entry.getValue()), BooleanClause.Occur.SHOULD);
        }
      } else {
        Term term = new Term(fieldName, getAnalyzer().normalize(fieldName, text));
        Query query = new PrefixQuery(term);
        bq.add(wrapWithBoost(query, entry.getValue()), BooleanClause.Occur.SHOULD);
      }
    } catch (RuntimeException e) {
      return rethrowUnlessLenient(e);
    }
  }
  return super.simplify(bq.build());
}

Example source: com.strapdata.elasticsearch/elasticsearch

private Query getFuzzyQuerySingle(String field, String termStr, String minSimilarity) throws ParseException {
  currentFieldType = context.fieldMapper(field);
  if (currentFieldType != null) {
    try {
      BytesRef term = termStr == null ? null : getAnalyzer().normalize(field, termStr);
      return currentFieldType.fuzzyQuery(term, Fuzziness.build(minSimilarity),
        getFuzzyPrefixLength(), settings.fuzzyMaxExpansions(), FuzzyQuery.defaultTranspositions);
    } catch (RuntimeException e) {
      if (settings.lenient()) {
        return null;
      }
      throw e;
    }
  }
  return super.getFuzzyQuery(field, termStr, Float.parseFloat(minSimilarity));
}

Example source: com.strapdata.elasticsearch/elasticsearch

private Query getRangeQuerySingle(String field, String part1, String part2,
    boolean startInclusive, boolean endInclusive, QueryShardContext context) {
  currentFieldType = context.fieldMapper(field);
  if (currentFieldType != null) {
    try {
      BytesRef part1Binary = part1 == null ? null : getAnalyzer().normalize(field, part1);
      BytesRef part2Binary = part2 == null ? null : getAnalyzer().normalize(field, part2);
      Query rangeQuery;
      if (currentFieldType instanceof LegacyDateFieldMapper.DateFieldType && settings.timeZone() != null) {
        LegacyDateFieldMapper.DateFieldType dateFieldType = (LegacyDateFieldMapper.DateFieldType) this.currentFieldType;
        rangeQuery = dateFieldType.rangeQuery(part1Binary, part2Binary,
            startInclusive, endInclusive, settings.timeZone(), null, context);
      } else if (currentFieldType instanceof DateFieldMapper.DateFieldType && settings.timeZone() != null) {
        DateFieldMapper.DateFieldType dateFieldType = (DateFieldMapper.DateFieldType) this.currentFieldType;
        rangeQuery = dateFieldType.rangeQuery(part1Binary, part2Binary,
            startInclusive, endInclusive, settings.timeZone(), null, context);
      } else {
        rangeQuery = currentFieldType.rangeQuery(part1Binary, part2Binary, startInclusive, endInclusive, context);
      }
      return rangeQuery;
    } catch (RuntimeException e) {
      if (settings.lenient()) {
        return null;
      }
      throw e;
    }
  }
  return newRangeQuery(field, part1, part2, startInclusive, endInclusive);
}
