regex: Extract words from a string of a specific pattern

Posted by k5ifujac on 2023-08-08 in Other


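From a given string, I want to extract only the alphanumeric words, excluding any word enclosed between colons (':...:'). It should work regardless of whether there is a space between a colon and the neighboring alphanumeric text. Here is a code sample: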

import re

message = "ass :gifs_e4VLc8f2_galabingo: ass dof:stickers_t3B0l2J7_galabingo:dor"
message1 = ":gifs_e4VLc8f2_galabingo::stickers_t3B0l2J7_galabingo:"
# Regex pattern to extract words that do not start and end with colons
pattern = r'(?<!:)(?::[^:]+:)*([^:]+)(?::[^:]+:)*(?!:)'

# Find all occurrences of words in the message that do not start and end with colons
words_without_colons = re.findall(pattern, message)
words_without_colons1 = re.findall(pattern, message1)
print(words_without_colons)
print(words_without_colons1)

Actual output:
['ass ', 'ass dof', 'or']
['ifs_e4VLc8f2_galabing', 'tickers_t3B0l2J7_galabing']
Expected output:
op1: ['ass ', 'ass dof', 'dor']
op2: []  # empty list

regex

Source: https://stackoverflow.com/questions/76771942/extract-words-from-string-of-specific-pattern

1 Answer


klh5stk11#

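It may be easier to use re.split with a separator made up of an unbroken run of characters (no spaces or colons) between two colons, with an optional leading/trailing space: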

import re

pattern  = r" ?:[^ :]*?: ?"

message  = "ass :gifs_e4VLc8f2_galabingo: ass dof:stickers_t3B0l2J7_galabingo:dor"
message1 = ":gifs_e4VLc8f2_galabingo::stickers_t3B0l2J7_galabingo:"

*words,  = filter(None,re.split(pattern,message))
*words1, = filter(None,re.split(pattern,message1))

print(words)  # ['ass', 'ass dof', 'dor']
print(words1) # []
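
The same idea can be wrapped in a small reusable function. The sketch below is only an illustration: the names TOKEN and words_outside_colons are hypothetical, and the pattern is a slight variation on the one above. It assumes that any amount of whitespace around a token should be swallowed (\s* instead of a single optional space) and that a token holds at least one character between its colons.

```python
import re

# Hypothetical variation on the answer's pattern (an assumption, not the original):
# optional whitespace, a colon, one or more characters that are neither
# whitespace nor a colon, a closing colon, optional whitespace.
TOKEN = re.compile(r"\s*:[^\s:]+:\s*")


def words_outside_colons(text):
    """Return the fragments of text that are not inside :token: markers."""
    # re.split yields empty strings where tokens touch each other or sit at
    # the start/end of the string, so those are filtered out.
    return [part for part in TOKEN.split(text) if part]


message = "ass :gifs_e4VLc8f2_galabingo: ass dof:stickers_t3B0l2J7_galabingo:dor"
message1 = ":gifs_e4VLc8f2_galabingo::stickers_t3B0l2J7_galabingo:"

print(words_outside_colons(message))   # ['ass', 'ass dof', 'dor']
print(words_outside_colons(message1))  # []
```

Splitting on the tokens avoids the lookaround bookkeeping in the original pattern, which is what clipped characters such as the leading 'g' and the trailing 'o' from the results.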


Answered 2023-08-08
