regex: How can I save the previous "output", or the state of the previous output, across recursive function calls in Python?

I am using a recursive function to generate text through RegEx matching. It looks for word patterns made of synonym groups inside square brackets (pattern = '\[.*?\]'), where the synonyms within a group are separated by a string separator (I defined _STRING_SEPARATOR = #lkmkmksdmf###).
The initial sentence argument passed to the function looks something like this:
[decreasing#lkmkmksdmf###shrinking#lkmkmksdmf###falling#lkmkmksdmf###contracting#lkmkmksdmf###faltering#lkmkmksdmf###the contraction in] exports of services will drive national economy to a 0.3% real GDP [decline#lkmkmksdmf###decrease#lkmkmksdmf###contraction] in 2023 from an estimated 5.0% [decline#lkmkmksdmf###decrease#lkmkmksdmf###contraction] in 2022
The function is as follows:

import re
from copy import deepcopy

def all_combinations(self, sentence, sentence_list: list):
    pattern = r'\[.*?\]'

    if not re.findall(pattern, sentence, flags=re.IGNORECASE):
        # No bracketed group left: the sentence is fully expanded
        if sentence not in sentence_list:
            sentence_list.append(sentence)
    else:
        for single_match in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            # Strip the surrounding brackets from the matched group
            repl = single_match.group(0)[1:-1]
            start_span = single_match.span()[0]
            end_span = single_match.span()[1]
            for candidate_word in repl.split(self._STRING_SEPARATOR):
                tmp_sentence = (
                    sentence[0:start_span] +
                    candidate_word +
                    sentence[end_span:]
                )
                new_sentence = deepcopy(tmp_sentence)
                self.all_combinations(new_sentence, sentence_list)

As a result, the sentence_list variable keeps appending sentences like a DFS tree, and consecutive sentences in sentence_list look like this:

0: "decreasing exports in services will drive national economy to a 0.5% real GDP decline in 2023 from an estimated 5.0% decline in 2022"

1: "decreasing exports in services will drive national economy to a 0.5% real GDP decline in 2023 from an estimated 5.0% decrease in 2022"

2: "decreasing exports in services will drive national economy to a 0.5% real GDP decline in 2023 from an estimated 5.0% contraction in 2022"

3: "decreasing exports in services will drive national economy to a 0.5% real GDP decrease in 2023 from an estimated 5.0% decline in 2022"

4: "decreasing exports in services will drive national economy to a 0.5% real GDP decrease in 2023 from an estimated 5.0% decrease in 2022"

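and so on...
I want to avoid using the same word twice. For example, if I have used the word "decline", it should not be used again when the next group of words is chosen in the inner for loop after the recursive call. While the words in the second square-bracket pattern are being parsed, is there a way to "store" the word that was used from the first square bracket, and so on?

It is like a DFS tree in which every node has to store the state of its parent node. How can I modify the function so that the same word is not used more than once within a single sentence of sentence_list?

I tried adding a parameter called avoid_words: list to all_combinations, which would store the list of the parent nodes' words. But how do I remove a word from it when I have to move on to the next word in the first square bracket (that is, start from a different "root")?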
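One possible way around the removal problem is to never mutate the shared state at all: each recursive call receives its own immutable copy of the words used so far, so sibling branches cannot see each other's choices. The sketch below assumes this approach. The names expand_unique, SEP, GROUP and used are hypothetical and not part of the original code, and the function is written as a plain generator rather than a method.

```python
import re

SEP = "#lkmkmksdmf###"             # separator taken from the question
GROUP = re.compile(r"\[(.*?)\]")   # one bracketed synonym group

def expand_unique(sentence, used=frozenset()):
    """Yield every expansion in which no synonym appears twice.

    `used` holds the words chosen by the ancestors of this call (hypothetical
    parameter, playing the role of avoid_words in the question).
    """
    match = GROUP.search(sentence)   # only the first remaining group
    if match is None:
        yield sentence               # nothing left to expand
        return
    for word in match.group(1).split(SEP):
        if word in used:
            continue                 # already chosen higher up in this branch
        expanded = sentence[:match.start()] + word + sentence[match.end():]
        # `used | {word}` builds a new frozenset, so nothing has to be
        # removed again when the loop moves on to the next candidate.
        yield from expand_unique(expanded, used | {word})

demo = f"[a{SEP}b] x [a{SEP}b] y"
print(list(expand_unique(demo)))     # ['a x b y', 'b x a y']
```

Because each call expands only the first remaining group, every sentence is produced once, so the `not in sentence_list` check is no longer needed in this variant.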

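As Tim pointed out, if there really is no other way to pass in the string and its parameters (which I doubt), you should use the split() function to separate the initial sentence into the words (synonyms) and the plain sentence.
Below is the commented code I would use if I had to solve this situation.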

import itertools
import re

def all_combinations(sentence) -> list:
    pattern = r'\[(.*?)\]'
    synonyms = []
    resulting_sentences = []

    # Collect the contents of every bracketed synonym group
    list_of_synonyms = re.findall(pattern, sentence, flags=re.IGNORECASE)
    # Replace each group in the original sentence with an empty '[]' placeholder
    sentence = re.sub(pattern, '[]', sentence)

    # Split each group into a tuple of candidate words
    for group in list_of_synonyms:
        synonyms.append(tuple(group.split('#lkmkmksdmf###')))

    # Build every combination (one word per group) and keep only those whose
    # words are all distinct. A set holds only unique elements, so a combination
    # with a repeated word yields a set shorter than the combination itself.
    # The combinations stay tuples so the word order matches the group order.
    synonym_combinations = [
        combination for combination in itertools.product(*synonyms)
        if len(set(combination)) == len(combination)
    ]

    # Iterate over the combinations
    for combination in synonym_combinations:
        formatted_sentence = sentence
        # Fill the placeholders from left to right, one word per placeholder
        for synonym in combination:
            formatted_sentence = formatted_sentence.replace('[]', synonym, 1)
        # Append the formatted sentence to the resulting sentences
        resulting_sentences.append(formatted_sentence)
    return resulting_sentences
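The split() idea mentioned above could also be taken one step further with re.split, which returns the fixed text and the bracket contents in alternating positions when the pattern contains one capturing group. The following is only a sketch under that assumption; expand_with_split and SEP are hypothetical names, not part of the answer's code.

```python
import itertools
import re

SEP = "#lkmkmksdmf###"  # separator taken from the question

def expand_with_split(sentence):
    # With one capturing group, re.split alternates:
    # fixed text, group contents, fixed text, ..., fixed text
    parts = re.split(r"\[(.*?)\]", sentence)
    fixed = parts[0::2]
    groups = [p.split(SEP) for p in parts[1::2]]

    results = []
    for combo in itertools.product(*groups):
        if len(set(combo)) != len(combo):
            continue  # some synonym is repeated in this combination
        # Interleave the fixed text with the chosen words
        pieces = [fixed[0]]
        for word, text in zip(combo, fixed[1:]):
            pieces += [word, text]
        results.append("".join(pieces))
    return results

print(expand_with_split(f"[a{SEP}b] x [a{SEP}b] y"))  # ['a x b y', 'b x a y']
```

This avoids the '[]' placeholder entirely, so a sentence that happens to contain a literal '[]' would not be misread as a slot.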
