Regex to match items from a list followed by N trailing digits (Python)

0h4hbjxa  posted on 2023-01-14 in Python

I have prepared a list of animals:

expectedAnimals = ['cat-', 'snake-', 'hedgehog-']

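Then I have a user input (a string) that contains some or all of the expected animals from the list above, each followed by N digits. The animals are separated by random separator characters (non-digits):
Examples: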
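inputString1 = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'
inputString2 = 'hedgehog-2>cat-1|snake-22#cat-2<$dog-55 snake-93242522. cat-3 .rat-2 snake-22 cat-8844'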
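My goal (which I am struggling to achieve) is to write a function filterAnimals that returns the following correct results:
approvedAnimals1 = filterAnimals(inputString1)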

['cat-235', 'snake-1', 'snake-22', 'cat-8844']

approvedAnimals2 = filterAnimals(inputString2)

['hedgehog-2', 'cat-1', 'snake-22', 'cat-2', 'snake-93242522', 'cat-3', 'snake-22', 'cat-8844']

My current implementation partially works, but honestly I would like to rewrite it from scratch:

def filterAnimals(inputString):
    expectedAnimals = ['cat-', 'snake-', 'hedgehog-']
    start_indexes = []
    end_indexes = []
    for animal in expectedAnimals:
        temp_start_indexes = [i for i in range(len(inputString)) if inputString.startswith(animal, i)]
        if len(temp_start_indexes) > 0:
            start_indexes.append(temp_start_indexes)
            for start_ind in temp_start_indexes:
                for i in range(start_ind + len(animal), len(inputString)):
                    if inputString[i].isdigit() and i == len(inputString) - 1:
                        end_indexes.append(i + 1)
                        break
                    if not inputString[i].isdigit():
                        end_indexes.append(i)
                        break
        start_indexes_flat = [item for sublist in start_indexes for item in sublist]
        list_size = min(len(start_indexes_flat), len(end_indexes))
        approvedAnimals = []
        if list_size > 0:
            for x in range(list_size):
                approvedAnimals.append(inputString[start_indexes_flat[x]:end_indexes[x]])
    return approvedAnimals

iszxjhcz  #1

You can build an alternation pattern from expectedAnimals and use re.findall to get all matches as a list:

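import re

expectedAnimals = ['cat-', 'snake-', 'hedgehog-']

def filterAnimals(inputString):
    # Join the prefixes into a non-capturing alternation, then require one or more digits.
    return re.findall(rf"(?:{'|'.join(expectedAnimals)})\d+", inputString)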

Demo: https://replit.com/@blhsing/OffensiveEveryWebportal
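
If the prefixes could ever contain regex metacharacters (for example '.' or '+'), it may be safer to escape each entry before joining. The following is only a minimal sketch of that idea, not the answerer's code: the function name filter_animals_escaped and the prefixes parameter are hypothetical, and the sample string is the first input used in this thread.

```python
import re

def filter_animals_escaped(input_string, prefixes=('cat-', 'snake-', 'hedgehog-')):
    # Escape each prefix so characters such as '.' or '+' are matched literally.
    alternation = '|'.join(re.escape(p) for p in prefixes)
    # A non-capturing group keeps re.findall returning the whole match.
    return re.findall(rf"(?:{alternation})\d+", input_string)

sample = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'
print(filter_animals_escaped(sample))
# ['cat-235', 'snake-1', 'snake-22', 'cat-8844']
```

Passing the prefixes as a parameter instead of reading a global also makes the function easier to test in isolation.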


mqxuamgl  #2

import re

# matches expected animals followed by N numbers
pattern = re.compile(r"(cat|snake|hedgehog)-\d+")

inputString1 = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'
inputString2 = 'hedgehog-2>cat-1|snake-22#cat-2<$dog-55 snake-93242522. cat-3 .rat-2 snake-22 cat-8844'

animals_1 = [i.group() for i in pattern.finditer(inputString1)]
# will return ['cat-235', 'snake-1', 'snake-22', 'cat-8844']

animals_2 = [i.group() for i in pattern.finditer(inputString2)]
# will return ['hedgehog-2', 'cat-1', 'snake-22', 'cat-2', 'snake-93242522', 'cat-3', 'snake-22', 'cat-8844']
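
One detail worth noting about this pattern: because it contains a capturing group, calling re.findall on it would return only the captured animal names, whereas finditer with group() returns each full match. The sketch below illustrates the difference on the first sample input; the variable names are illustrative and not part of the answer above.

```python
import re

text = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'

capturing = re.compile(r"(cat|snake|hedgehog)-\d+")
non_capturing = re.compile(r"(?:cat|snake|hedgehog)-\d+")

# With a capturing group, findall returns only what the group captured.
names_only = capturing.findall(text)
# With a non-capturing group, findall returns each full match.
full_matches = non_capturing.findall(text)
# finditer + group() returns full matches regardless of grouping.
via_finditer = [m.group() for m in capturing.finditer(text)]

print(names_only)    # ['cat', 'snake', 'snake', 'cat']
print(full_matches)  # ['cat-235', 'snake-1', 'snake-22', 'cat-8844']
print(via_finditer)  # ['cat-235', 'snake-1', 'snake-22', 'cat-8844']
```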
