Usage and code examples for the de.tudarmstadt.ukp.wikipedia.api.Wikipedia.getArticles() method

x33g5p2x, reposted on 2022-02-03 under "Other"

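This article collects code examples for the Java method de.tudarmstadt.ukp.wikipedia.api.Wikipedia.getArticles() and shows how Wikipedia.getArticles() is used in practice. The examples come mainly from GitHub, Stack Overflow, Maven, and similar platforms, and were extracted from selected projects, so they should serve as useful references. Details of the method:
Fully qualified class name: de.tudarmstadt.ukp.wikipedia.api.Wikipedia
Class name: Wikipedia
Method name: getArticles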

About Wikipedia.getArticles

Gets all articles (pages minus disambiguation pages minus redirects). Returns only an Iterable, because a collection might not fit into memory for a large Wikipedia.
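
The reason for returning an Iterable rather than a collection can be illustrated without a Wikipedia database. The sketch below is not JWPL code: `LazyIterableSketch` and its `lazyIds` method are hypothetical stand-ins that produce page IDs one at a time, roughly the way an iterable backed by a database might, so that a caller can visit millions of items while holding only one at a time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a large article set; NOT part of JWPL.
class LazyIterableSketch {

    // Returns an Iterable that generates the IDs 0..count-1 on demand.
    // Nothing is stored, so memory use does not grow with count.
    static Iterable<Integer> lazyIds(final int count) {
        return () -> new Iterator<Integer>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < count;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }
        };
    }

    // Consumes the iterable in a single streaming pass and counts even IDs.
    static long countEven(Iterable<Integer> ids) {
        long evens = 0;
        for (int id : ids) {
            if (id % 2 == 0) {
                evens++;
            }
        }
        return evens;
    }

    public static void main(String[] args) {
        // Ten million IDs are visited, but never held in memory together.
        System.out.println(countEven(lazyIds(10_000_000)));
    }
}
```

Collecting the same IDs into a list first would require memory proportional to the number of items, which is the cost the method description says an Iterable avoids.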

Code examples

Code example source: origin: de.tudarmstadt.ukp.dkpro.lexsemresource/de.tudarmstadt.ukp.dkpro.lexsemresource.wikipedia-asl

public WikipediaArticleEntityIterable(Wikipedia wiki, boolean isCaseSensitive) {
  this.wikiPageIterable = wiki.getArticles();
  this.wiki = wiki;
  this.isCaseSensitive = isCaseSensitive;
}

Code example source: origin: dkpro/dkpro-similarity

public void fillInLinkCache() throws WikiApiException, FileNotFoundException, IOException, ClassNotFoundException {
  logger.info("Filling InlinkCache ...");
  // Nothing to do if the in-memory cache is already populated.
  if (!isCacheEmpty()) {
    return;
  }
  // Reuse a previously serialized cache from disk if one exists.
  File serializedCacheFile = getSerializedCacheFile(wiki);
  if (serializedCacheFile.exists()) {
    cachedInLinks = (TIntObjectHashMap<int[]>) deserializeObject(serializedCacheFile);
    return;
  }
  // Otherwise build the cache in a single pass over all articles.
  int i = 0;
  for (Page article : wiki.getArticles()) {
    int[] inlinkIdIntArray = WikiLinkComparator.getInlinkIds(article);
    int pageId = article.getPageId();
    cachedInLinks.put(pageId, inlinkIdIntArray);
    // Print a progress dot every 10,000 articles.
    if (i % 10000 == 0) {
      System.out.print(".");
    }
    i++;
  }
  System.out.println();
  // Persist the cache so later runs can skip the full pass.
  serializeObject(cachedInLinks, serializedCacheFile);
}
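
The method above follows a load-or-build pattern: reuse a serialized cache file if one exists, otherwise compute the map in one pass and write it to disk. A minimal sketch of that pattern using only the JDK follows. Assumptions: `HashMap<Integer, int[]>` stands in for Trove's `TIntObjectHashMap<int[]>`, the inlink lookup is replaced by a placeholder, and `loadOrBuild`, `serializeObject`, and `deserializeObject` are hypothetical helpers written for illustration. The helpers in the original project are not shown in the source and may differ.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative only; not the dkpro-similarity implementation.
class InlinkCacheSketch {

    // Writes any Serializable object to a file using Java serialization.
    static void serializeObject(Object obj, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(obj);
        }
    }

    // Reads back an object written by serializeObject.
    static Object deserializeObject(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return in.readObject();
        }
    }

    // Loads the cache from disk if present; otherwise builds and saves it.
    @SuppressWarnings("unchecked")
    static Map<Integer, int[]> loadOrBuild(File cacheFile, Iterable<Integer> pageIds)
            throws IOException, ClassNotFoundException {
        if (cacheFile.exists()) {
            return (Map<Integer, int[]>) deserializeObject(cacheFile);
        }
        HashMap<Integer, int[]> cache = new HashMap<>();
        for (int pageId : pageIds) {
            // Placeholder for the real inlink lookup: fake a single inlink ID.
            cache.put(pageId, new int[] { pageId + 1 });
        }
        serializeObject(cache, cacheFile);
        return cache;
    }

    public static void main(String[] args) throws Exception {
        File cacheFile = File.createTempFile("inlinks", ".ser");
        cacheFile.delete(); // start without a cache so the first call builds it
        Map<Integer, int[]> built = loadOrBuild(cacheFile, Arrays.asList(1, 2, 3));
        // The second call ignores its (empty) input and reads the file instead.
        Map<Integer, int[]> loaded = loadOrBuild(cacheFile, Collections.<Integer>emptyList());
        System.out.println(built.size() + " built, " + loaded.size() + " loaded");
        cacheFile.delete();
    }
}
```

The trade-off is the usual one for on-disk caches: the expensive pass runs once, but a stale file is silently reused until it is deleted.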

Code example source: origin: de.tudarmstadt.ukp.similarity.algorithms/de.tudarmstadt.ukp.similarity.algorithms.wikipedia-asl

The fillInLinkCache() method in this artifact is identical to the dkpro/dkpro-similarity example above.

Code example source: origin: dkpro/dkpro-jwpl

Iterator<Page> pageIt = wiki.getArticles().iterator();

Code example source: origin: hltfbk/Excitement-Open-Platform

StopWatch allStopwatch = new StopWatch();
allStopwatch.start();
Iterable<Page> pagesIterable = wikipedia.getArticles();
int doneCount = 0;
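
The fragment above is truncated in the source, so what it does with `pagesIterable` and `doneCount` is not shown. One plausible reading is a timed pass that counts processed pages. The sketch below illustrates that idea only: it uses `System.nanoTime()` in place of the `StopWatch` class, and `TimedPassSketch` and `countWithProgress` are hypothetical names that work on any `Iterable`.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only; not taken from Excitement-Open-Platform.
class TimedPassSketch {

    // Iterates once and returns the number of items seen.
    // Prints a progress line every reportEvery items (reportEvery must be > 0).
    static int countWithProgress(Iterable<?> items, int reportEvery) {
        int doneCount = 0;
        for (Object item : items) {
            doneCount++;
            if (doneCount % reportEvery == 0) {
                System.out.println("processed " + doneCount + " items");
            }
        }
        return doneCount;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("A", "B", "C", "D", "E");
        long start = System.nanoTime();
        int done = countWithProgress(titles, 2);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " items in " + elapsedMs + " ms");
    }
}
```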
