How to find the most common character or word in a string in JavaScript

amrnrhlw  posted on 2023-05-05  in Java

I was given the following exercise:

  • Clean the following text and find the most frequent word: const sentence = '%I $am@% a %tea@cher%, &and& I lo%#ve %te@a@ching%;. The@re $is no@th@ing; &as& mo@re rewarding as educa@ting &and& @emp%o@weri@ng peo@ple. ;I found tea@ching m%o@re interesting tha@n any ot#her %jo@bs. %Do@es thi%s mo@tiv#ate yo@u to be a tea@cher!? %Th#is 30#Days&OfJavaScript &is al@so $the $resu@lt of &love& of tea&ching'

I cleaned the string with a regular expression as follows:

const sentence = '%I $am@% a %tea@cher%, &and& I lo%#ve %te@a@ching%;. The@re $is no@th@ing; && mo@re rewarding than educa@ting &and& @emp%o@weri@ng peo@ple. ;I find tea@ching m%o@re interesting tha@n any ot#her %jo@bs. %Do@es thi%s mo@tiv#ate yo@u to be a tea@cher!? %Th#is 30#Days&OfJavaScript &is al@so $the $resu@lt of &love& of tea&ching';
const sentReg = /\W(?<!1)/g;
let sent = sentence.replace(/ /g, "1");

let finalSent = sent.replace(sentReg, ""), finalfinalSent = finalSent.replace(/1/g, " ");

I realized I don't know how (or maybe there is no way) to use match() to search the string word by word, so I tried splitting it into an array:

let senArr = finalfinalSent.split(" "), wordOccur = [];

for (const x of senArr) {
    var re = new RegExp(x, "g");
    var y = finalfinalSent.match(re);

    wordOccur = wordOccur.concat([y.length]);
};

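...and now I'm stuck, because I don't know how to search an array in JavaScript, only in Python, and I suspect there is a much easier way to search the string than this. I'd appreciate any pointers.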

iqih9akk1#

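I wouldn't change the spaces to "1". Instead, use a regular expression that doesn't remove spaces when cleaning.
Then you can call match on the cleaned string and use reduce to count the words while keeping a reference to the most frequent one: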

const sentence = '%I $am@% a %tea@cher%, &and& I lo%#ve %te@a@ching%;. The@re $is no@th@ing; && mo@re rewarding than educa@ting &and& @emp%o@weri@ng peo@ple. ;I find tea@ching m%o@re interesting tha@n any ot#her %jo@bs. %Do@es thi%s mo@tiv#ate yo@u to be a tea@cher!? %Th#is 30#Days&OfJavaScript &is al@so $the $resu@lt of &love& of tea&ching';

let word = sentence.replace(/[^\w\s]/g, "")
            .match(/\w+/g)
            .reduce((acc, word) => {
                acc[word] = (acc[word] || 0) + 1;
                if (!(acc[word] < acc[acc.$])) acc.$ = word;
                return acc;
            }, {}).$;
            
console.log(word);

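Note that acc will be a "dictionary" of words whose corresponding values are the counts. A special $ entry is created in that same dictionary to hold the most frequent word.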
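
To make the shape of that accumulator concrete, here is a minimal sketch of the same counting idea with the counts and the current leader kept in separate variables instead of a $ entry. The name mostFrequentWord and the use of a Map are assumptions made for illustration, not part of the answer; a Map also sidesteps words such as "constructor" that would collide with properties inherited by a plain object.

```javascript
// Hypothetical helper (not from the answer): returns the leading word and all counts.
function mostFrequentWord(text) {
    // Same cleaning step as the answer: drop everything that is neither a word character nor whitespace.
    const words = text.replace(/[^\w\s]/g, "").match(/\w+/g) || [];
    const counts = new Map();
    let best; // stays undefined when there are no words
    for (const w of words) {
        const n = (counts.get(w) || 0) + 1;
        counts.set(w, n);
        // ">=" mirrors the answer's tie rule: the latest word to reach the top count wins.
        if (best === undefined || n >= counts.get(best)) best = w;
    }
    return { best, counts };
}

const demo = mostFrequentWord("%tea@ching &is& fun, te@aching is re#warding; teaching!");
console.log(demo.best);             // teaching
console.log(demo.counts.get("is")); // 2
```
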
If more than one word has the maximum frequency and you want to get all of them, not just one, then return an array instead of a string:

const sentence = '%I $am@% a %tea@cher%, &and& I lo%#ve %te@a@ching%;. The@re $is no@th@ing; && mo@re rewarding than educa@ting &and& @emp%o@weri@ng peo@ple. ;I find tea@ching m%o@re interesting tha@n any ot#her %jo@bs. %Do@es thi%s mo@tiv#ate yo@u to be a tea@cher!? %Th#is 30#Days&OfJavaScript &is al@so $the $resu@lt of &love& of tea&ching';

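let word = sentence.replace(/[^\w\s]/g, "")
            .match(/\w+/g)
            .reduce((acc, word) => {
                acc[word] = (acc[word] || 0) + 1;
                // acc.$ is an array of the current leaders; compare against the first one's count
                const top = acc[acc.$[0]];
                // a word that was already leading, or that beats the leaders, now leads alone
                if (acc.$.includes(word) || !(acc[word] <= top)) acc.$ = [word];
                // a word that catches up with the leaders joins them
                else if (acc[word] === top) acc.$.push(word);
                return acc;
            }, { $: [] }).$;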
            
console.log(word);
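
As an alternative way to collect every tied word, the counting and the selection can be split into two passes, which avoids having to update a list of leaders while counting. This is only a sketch under assumptions: the name mostFrequentWords and the Map-based counting are illustrative and do not come from the answer.

```javascript
// Hypothetical helper (not from the answer): returns every word that has the highest count.
function mostFrequentWords(text) {
    const words = text.replace(/[^\w\s]/g, "").match(/\w+/g) || [];
    // First pass: count each word.
    const counts = new Map();
    for (const w of words) counts.set(w, (counts.get(w) || 0) + 1);
    // Second pass: find the highest count, then keep the words that reach it.
    const max = Math.max(0, ...counts.values());
    return [...counts].filter(([, n]) => n === max).map(([w]) => w);
}

console.log(mostFrequentWords("%to be, &or& n@ot to be")); // [ 'to', 'be' ]
```

Words are returned in the order of their first appearance, because a Map iterates in insertion order.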

pgvzfuti2#

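// the character class lists every special character that occurs in the sentence, including , . and ?
console.log(sentence.replace(/[%$@&#;!,.?]/g, ''))

For a short version, simply list the special characters that occur in the sentence in a character class, since the set is neither complex nor long. You can of course store the regular expression in a variable.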
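
A minimal sketch of that variant with the regular expression held in a variable follows. The names specialChars and clean are assumptions for illustration, and the class here also lists `,`, `.` and `?`, which occur in the sentence. An explicit list has to name every symbol that can appear in the input, which is why the negated class /[^\w\s]/g from the first answer is the more general choice.

```javascript
// Hypothetical names; the character class enumerates the symbols to strip.
const specialChars = /[%$@&#;!,.?]/g;

function clean(text) {
    // replace() with a global regex removes every occurrence, not just the first one.
    return text.replace(specialChars, "");
}

console.log(clean("%I $am@% a %tea@cher%, &and& I lo%#ve %te@a@ching%;."));
// I am a teacher and I love teaching
```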
