Usage and code examples for the zemberek.core.collections.Histogram.set() method

Reposted by x33g5p2x on 2022-01-20 in Other

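This article collects code examples for the Java method zemberek.core.collections.Histogram.set() and shows how Histogram.set() is used in practice. The examples are taken from selected projects found on platforms such as GitHub, Stack Overflow, and Maven, and are meant as a practical reference. Details of the Histogram.set() method are as follows:
Package path: zemberek.core.collections.Histogram
Class name: Histogram
Method name: set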

About Histogram.set

Inserts the element and its value. It overrides the current count.
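
To make "overrides the current count" concrete, the following is a minimal sketch that uses a plain HashMap as a stand-in for the histogram. The class SimpleHistogram and its add, set, and getCount methods are hypothetical and written only for illustration; they mirror the behavior described above and are not Zemberek's implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class SetVsAddDemo {

  // Hypothetical stand-in for a counting histogram; not Zemberek's implementation.
  static class SimpleHistogram<T> {
    private final Map<T, Integer> counts = new HashMap<>();

    // Increments the count of the key by one and returns the new count.
    int add(T key) {
      return counts.merge(key, 1, Integer::sum);
    }

    // Stores the given count for the key, replacing any previous count.
    void set(T key, int count) {
      counts.put(key, count);
    }

    // Returns the stored count, or 0 if the key was never inserted.
    int getCount(T key) {
      return counts.getOrDefault(key, 0);
    }
  }

  public static void main(String[] args) {
    SimpleHistogram<String> h = new SimpleHistogram<>();
    h.add("elma");
    h.add("elma");
    System.out.println(h.getCount("elma")); // 2

    h.set("elma", 10); // replaces 2, does not add to it
    System.out.println(h.getCount("elma")); // 10

    h.add("elma"); // counting continues from the overridden value
    System.out.println(h.getCount("elma")); // 11
  }
}
```

The point of the sketch is the difference between accumulating (add) and assigning (set): after set, the earlier count is gone, which is why set suits cases where the final count is already known.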

Code examples

Code example source: ahmetaa/zemberek-nlp

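import java.io.DataInputStream;
import java.io.IOException;
import zemberek.core.collections.Histogram;

// Reads a histogram stored as an entry count followed by (UTF string, int count) pairs.
public static Histogram<String> deserializeStringHistogram(DataInputStream dis)
    throws IOException {
  int size = dis.readInt();
  if (size < 0) {
    throw new IllegalStateException(
        "Cannot deserialize String histogram. Count value is negative : " + size);
  }
  Histogram<String> result = new Histogram<>(size);
  for (int i = 0; i < size; i++) {
    // set() stores the count read from the stream as-is, replacing any existing count.
    result.set(dis.readUTF(), dis.readInt());
  }
  return result;
}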
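
The deserializer above implies a simple binary layout: an int entry count, then one UTF string and one int count per entry. The sketch below writes and reads that layout using only java.io and a LinkedHashMap in place of Histogram. The class and method names (HistogramLayoutDemo, write, read, roundTrip) are assumptions made for illustration, and the writer is inferred from the reader rather than copied from Zemberek.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class HistogramLayoutDemo {

  // Writes the entry count, then a (UTF key, int count) pair for each entry.
  static byte[] write(Map<String, Integer> counts) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream dos = new DataOutputStream(bytes)) {
      dos.writeInt(counts.size());
      for (Map.Entry<String, Integer> e : counts.entrySet()) {
        dos.writeUTF(e.getKey());
        dos.writeInt(e.getValue());
      }
    }
    return bytes.toByteArray();
  }

  // Reads the same layout back; put() plays the role of Histogram.set() here.
  static Map<String, Integer> read(byte[] data) throws IOException {
    try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
      int size = dis.readInt();
      if (size < 0) {
        throw new IllegalStateException("Negative entry count: " + size);
      }
      Map<String, Integer> result = new LinkedHashMap<>();
      for (int i = 0; i < size; i++) {
        String key = dis.readUTF();
        int count = dis.readInt();
        result.put(key, count);
      }
      return result;
    }
  }

  // Convenience wrapper that serializes and immediately deserializes a map.
  static Map<String, Integer> roundTrip(Map<String, Integer> counts) {
    try {
      return read(write(counts));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    counts.put("kitap", 12);
    counts.put("kalem", 3);
    System.out.println(roundTrip(counts)); // {kitap=12, kalem=3}
  }
}
```

Because each stored count is already final, the reader assigns it directly instead of incrementing, which is the same reason the original snippet calls set() rather than an accumulating method.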

Code example source: ahmetaa/zemberek-nlp

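import java.io.IOException;
import java.nio.file.Path;
import zemberek.core.collections.Histogram;

// Copies words that contain a run of repeated letters, with their counts, to a new histogram.
static void multipleLetterRepetitionWords(Path in, Path out) throws IOException {
  Histogram<String> noisyWords = Histogram.loadFromUtf8File(in, ' ');
  Histogram<String> repetitionWords = new Histogram<>();
  for (String w : noisyWords) {
    if (w.length() < 2) {
      continue;
    }
    int maxRepetitionCount = 1;
    int repetitionCount = 1;
    char lastChar = w.charAt(0);
    for (int i = 1; i < w.length(); i++) {
      char c = w.charAt(i);
      if (c == lastChar) {
        repetitionCount++;
      } else {
        if (repetitionCount > maxRepetitionCount) {
          maxRepetitionCount = repetitionCount;
        }
        // A new run starts with the current character, so its length is 1.
        repetitionCount = 1;
      }
      lastChar = c;
    }
    // Account for a run that extends to the end of the word.
    if (repetitionCount > maxRepetitionCount) {
      maxRepetitionCount = repetitionCount;
    }
    if (maxRepetitionCount > 1) {
      // Carry over the word's count from the source histogram unchanged.
      repetitionWords.set(w, noisyWords.getCount(w));
    }
  }
  repetitionWords.saveSortedByCounts(out, " ");
}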
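
The run-length check at the core of the example above can be tried in isolation. The following is a small standalone sketch with no Zemberek dependency; the class LongestRunDemo and the helper longestRun are hypothetical names introduced here for illustration.

```java
public class LongestRunDemo {

  // Returns the length of the longest run of identical consecutive characters.
  static int longestRun(String w) {
    if (w.isEmpty()) {
      return 0;
    }
    int max = 1;
    int run = 1;
    for (int i = 1; i < w.length(); i++) {
      if (w.charAt(i) == w.charAt(i - 1)) {
        run++;
        // Updating inside the loop also covers runs that end the word.
        if (run > max) {
          max = run;
        }
      } else {
        // The current character starts a new run of length 1.
        run = 1;
      }
    }
    return max;
  }

  public static void main(String[] args) {
    System.out.println(longestRun("merhaba")); // 1
    System.out.println(longestRun("saat"));    // 2
    System.out.println(longestRun("cooook"));  // 4
    System.out.println(longestRun("evett"));   // 2
  }
}
```

A word would qualify for the repetition histogram when longestRun(w) is greater than 1. Updating the maximum inside the loop is one way to avoid a separate check after the loop.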
