regex: Given a list of *arbitrary* words, how do I remove all words with double letters?

Posted by nhjlsmyf on 2023-08-08 in Other

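This is similar to this post, but I want to perform the task on arbitrary lists of words supplied by an external source, which may change depending on user input. For example, I might have: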

input = ['annotate','color','october','cellular','wingding','appeasement','sorta']

The output should be:

output = ['color','october','wingding','sorta']

Any help would be greatly appreciated!

regex

Source: https://stackoverflow.com/questions/76696452/given-a-list-of-arbitrary-words-how-to-remove-all-words-with-double-letters

6 Answers

xyhw6mcr1#

You can use a simple regular expression like the one below. This answer extends the post you mentioned to meet your requirements.

import re

arr = ['annotate','color','october','cellular','wingding','appeasement','sorta']
result = [w for w in arr if not re.search(r'(\w)\1', w)]

print(result)

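It works on word characters (\w), which include a-z, A-Z, 0-9, and _. In Python 3, \w also matches non-ASCII letters and digits in str patterns unless the re.ASCII flag is set.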
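
Because \w also covers digits and underscores, and the match is case-sensitive, the choice of character class changes which words are dropped. The sketch below compares the pattern from this answer with a letters-only, case-insensitive variant. The names WORD_CHAR_DOUBLE, LETTER_DOUBLE, and the sample list are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical names for illustration; only the first pattern comes from the answer.
WORD_CHAR_DOUBLE = re.compile(r'(\w)\1')                  # any doubled word character
LETTER_DOUBLE = re.compile(r'([a-z])\1', re.IGNORECASE)   # doubled ASCII letters, ignoring case

samples = ['Aardvark', 'snake_case', 'dunder__name', 'route66', 'color']

# Keep only the words in which the pattern finds no doubled character.
kept_word_chars = [w for w in samples if not WORD_CHAR_DOUBLE.search(w)]
kept_letters = [w for w in samples if not LETTER_DOUBLE.search(w)]

print(kept_word_chars)  # ['Aardvark', 'snake_case', 'color']
print(kept_letters)     # ['snake_case', 'dunder__name', 'route66', 'color']
```

The first pattern treats "Aa" as two different characters but rejects "__" and "66"; the second does the opposite. Which behavior is wanted depends on what counts as a "letter" in the incoming data.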

rqcrx0a62#

You can use a regular expression to do this.

import re
list(filter(lambda x: not re.search(r'(\w)\1', x), input))  # `input` is the list from the question

m528fe3b3#

Use nested loops to check each word for double letters:

This solution may be easier to understand than the regex-based ones.

words = ['annotate','color','october','cellular','wingding','appeasement','sorta']

for word in list(words):  # Iterate over a copy so that removing items is safe.
    last = None
    for letter in word:
        if letter == last:
            words.remove(word)  # Double letters, so delete from the list.
            break  # Stop at the first double so the word is removed only once.
        last = letter
print(words)

Output:

['color', 'october', 'wingding', 'sorta']

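This code simply checks each word for double letters and, if it finds any, removes the word from the words list. Note that naming your variable input would cause problems, because it shadows Python's built-in input() function.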
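
To make the warning about the name input concrete, here is a minimal sketch of what goes wrong once a list is bound to that name. The function shadowing_demo is a hypothetical helper used only to keep the shadowing local; it is not part of the answer.

```python
def shadowing_demo():
    """Show what happens after a list is assigned to the name `input`."""
    input = ['annotate', 'color']      # shadows the built-in input() inside this function
    try:
        input('Enter a word: ')        # this now tries to call the list, not the built-in
    except TypeError as exc:
        return str(exc)

print(shadowing_demo())  # 'list' object is not callable
```

A neutral name such as words avoids the issue entirely.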

50pmv0ei4#

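You can compare adjacent characters by zipping the word with itself, offset by one. This makes for a fairly concise list comprehension: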

words = ['annotate','color','october','cellular','wingding','appeasement','sorta']

[w for w in words if all(a != b for a, b in zip(w, w[1:]))]    
# ['color', 'october', 'wingding', 'sorta']

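zip(w, w[1:]) produces tuples of adjacent letters such as [('a', 'n'), ('n', 'n'), ('n', 'o'), ...], which you can then compare; any pair of equal letters means the same letter appears twice in a row.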
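
The pairing described above can be inspected directly. The following is a small sketch; has_double is a hypothetical helper name introduced here, not something from the answer.

```python
def has_double(word):
    """Return True if any two adjacent characters in `word` are equal."""
    return any(a == b for a, b in zip(word, word[1:]))

# zip stops at the shorter input, so the last character has no partner.
pairs = list(zip('annotate', 'annotate'[1:]))
print(pairs)
# [('a', 'n'), ('n', 'n'), ('n', 'o'), ('o', 't'), ('t', 'a'), ('a', 't'), ('t', 'e')]

print(has_double('annotate'), has_double('color'), has_double('a'))  # True False False
```

Single-character and empty strings yield no pairs at all, so they are never reported as having a double letter.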

798qvoo85#

import re

def filter_words(input_list):
    output_list = []
    pattern = r'(.)\1'  # Regex pattern to match consecutive identical characters
    
    for word in input_list:
        if not re.search(pattern, word):
            output_list.append(word)
    
    return output_list

# Example usage:
input = ['annotate', 'color', 'october', 'cellular', 'wingding', 'appeasement', 'sorta']
output = filter_words(input)
print(output)

Output:

['color', 'october', 'wingding', 'sorta']

tv6aics16#

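Even without regex, you can build a variation of the answer using nested loops by relying on a Python language feature: the else clause on loops.
In Python, a loop statement can have an else clause, which runs when the loop ends by exhausting the iterable, but not when the loop is terminated by a break statement. This allows the following algorithm: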

input = ['annotate','color','october','cellular','wingding','appeasement','sorta', 'watt']
output = []

for word in input:                            
    for idx, char in enumerate(word[1:]):     # Iterate over the enumeration from the second to the last character of the word
        if word[idx] == char:                 # Check if the char is equal to the previous character in the word 
            break                             # It interrupts the loop and the `else` clause is not executed
    else:
        output.append(word)
      
print(output)
#['color', 'october', 'wingding', 'sorta']

See also the Python built-in function enumerate().
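
The for/else behavior used in this answer can be seen in isolation in the short sketch below. The function first_double is a hypothetical helper written for illustration; it pairs characters with zip rather than enumerate, which is a different choice from the answer's code.

```python
def first_double(word):
    """Return the first doubled character in `word`, or None if there is none."""
    for prev, char in zip(word, word[1:]):
        if prev == char:
            found = char
            break            # leaving via break skips the else clause
    else:
        found = None         # runs only when the loop finished without break
    return found

print(first_double('cellular'))  # l
print(first_double('color'))     # None
print(first_double(''))          # None (the loop body never runs, so else runs)
```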
