java - Equivalent of gensim.models.keyedvectors in deeplearning4j's Word2Vec implementation?

Posted by nbysray5 on 2021-07-03 in Java

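gensim's KeyedVectors is essentially a mapping between keys and vectors. Each vector is identified by its lookup key, most often a short string token, so it is usually a mapping of {str => 1D numpy array}.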
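
To make the key-to-vector mapping concrete, here is a minimal plain-Java sketch of the idea. It is only an illustration: the class name `KeyedVectorsSketch` and its methods are hypothetical, it uses nothing from gensim or deeplearning4j, and it stores only keys and vectors with no training state.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; NOT part of gensim or deeplearning4j.
// It holds only the key -> vector mapping, with no training state.
final class KeyedVectorsSketch {
    private final int dimension;
    private final Map<String, float[]> vectors = new HashMap<>();

    KeyedVectorsSketch(int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive");
        }
        this.dimension = dimension;
    }

    // Adds or overwrites the vector stored under the given key.
    void put(String key, float[] vector) {
        if (vector.length != dimension) {
            throw new IllegalArgumentException(
                "expected length " + dimension + ", got " + vector.length);
        }
        vectors.put(key, Arrays.copyOf(vector, dimension));
    }

    boolean contains(String key) {
        return vectors.containsKey(key);
    }

    // Returns a defensive copy of the vector for the given key.
    float[] get(String key) {
        float[] v = vectors.get(key);
        if (v == null) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        return Arrays.copyOf(v, dimension);
    }

    // Cosine similarity between two stored vectors.
    // The result is NaN if either vector is all zeros.
    double similarity(String a, String b) {
        float[] x = get(a);
        float[] y = get(b);
        double dot = 0.0, normX = 0.0, normY = 0.0;
        for (int i = 0; i < dimension; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        return dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }
}
```

A caller would create `new KeyedVectorsSketch(n)`, fill it with `put`, and then query it with `get` or `similarity`. Whether deeplearning4j ships a comparable lightweight structure is exactly what the question below asks.
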
Besides providing access to this mapping, KeyedVectors also has a smaller RAM footprint. See the comparison here:

CAPABILITY                          |   KeyedVectors    |   FULL MODEL  |   NOTE
continue training vectors           |   ❌               |   ✅           |   You need the full model to train or update vectors.
smaller objects                     |   ✅               |   ❌           |   KeyedVectors are smaller and need less RAM, because they don’t need to store the model state that enables training.
save/load fasttext/word2vec fmt     |   ✅               |   ❌           |   Vectors exported by the Facebook and Google tools do not support further training, but you can still load them into KeyedVectors.
append new vectors                  |   ✅               |   ✅           |   Add new-vector entries to the mapping dynamically.
concurrency                         |   ✅               |   ✅           |   Thread-safe, allows concurrent vector queries.
shared RAM                          |   ✅               |   ✅           |   Multiple processes can re-use the same data, keeping only a single copy in RAM using mmap.
fast load                           |   ✅               |   ✅           |   Supports mmap to load data from disk instantaneously.

Is there an equivalent structure in the Java deeplearning4j Word2Vec implementation? I couldn't find anything similar in the documentation.
