Sentence and string similarity comparison in Spring Boot (Java)

dkqlctbz  posted on 2022-11-05  in  Spring

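I have a question about word similarity between two sentences. I am looking for code that computes a similarity defined as the number of words the two sentences have in common divided by the number of words in the longer sentence. Which library should I use for this? Thanks.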

There are many kinds of similarity measures. I would like to know what this particular measure is called.

irtuqstp1#

You can use document-similarity techniques such as cosine similarity.
Here is a solution I implemented based on your description.

// Required imports when this method is placed in a class:
import java.util.HashMap;
import java.util.Map;

// Returns (number of shared words, counting repeats) / (word count of the longer sentence).
double findSimilarityRatio(String sentence1, String sentence2) {

    // Split on runs of whitespace so repeated spaces do not produce empty "words".
    String[] firstSentenceWords = sentence1.trim().split("\\s+");
    String[] secondSentenceWords = sentence2.trim().split("\\s+");

    // Count how many times each word occurs in each sentence.
    Map<String, Integer> firstSentenceMap = new HashMap<>();
    for (String word : firstSentenceWords) {
        firstSentenceMap.merge(word, 1, Integer::sum);
    }

    Map<String, Integer> secondSentenceMap = new HashMap<>();
    for (String word : secondSentenceWords) {
        secondSentenceMap.merge(word, 1, Integer::sum);
    }

    // A word present in both sentences contributes the smaller of its two counts.
    // This sum is symmetric, so it does not matter which map is iterated.
    double totalHits = 0;
    for (Map.Entry<String, Integer> entry : firstSentenceMap.entrySet()) {
        Integer otherCount = secondSentenceMap.get(entry.getKey());
        if (otherCount != null) {
            totalHits += Math.min(entry.getValue(), otherCount);
        }
    }

    // Divide by the word count of the longer sentence.
    double totalWords = Math.max(firstSentenceWords.length, secondSentenceWords.length);

    return totalHits / totalWords;
}
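
For comparison, the cosine similarity mentioned at the top of this answer can be sketched in the same style. This is only a minimal illustration, assuming plain word counts as the vector components and whitespace tokenization. The class name CosineSimilaritySketch and the helper wordCounts are made up for this example and do not come from any library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration only; not part of any library.
class CosineSimilaritySketch {

    // Count how often each whitespace-separated word occurs in the sentence.
    static Map<String, Integer> wordCounts(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine of the angle between the two word-count vectors:
    // dot(a, b) / (|a| * |b|). Returns 0 if either sentence has no words.
    static double cosineSimilarity(String sentence1, String sentence2) {
        Map<String, Integer> a = wordCounts(sentence1);
        Map<String, Integer> b = wordCounts(sentence2);

        // Only words present in both sentences contribute to the dot product.
        double dot = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }

        double normA = 0;
        for (int count : a.values()) {
            normA += count * count;
        }
        double normB = 0;
        for (int count : b.values()) {
            normB += count * count;
        }

        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Two of three words are shared, so the result is close to 2/3.
        System.out.println(cosineSimilarity("the cat sat", "the cat ran"));
    }
}
```

Unlike the ratio above, which divides by the length of the longer sentence, this measure normalizes by the lengths of both count vectors, so the two can give different values for the same pair of sentences.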

Hope this helps, cheers!
