regex: replace newlines only when they are not preceded or followed by punctuation

jpfvwuh4  posted on 2023-03-31  in Other

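I want to replace every \n in a text with a period, but only when there is no punctuation mark directly before or after it. If there is a punctuation mark before or after the \n, it should be replaced with '' (the empty string) instead.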

text = "This is a sample sentence with bullets:\n    Item 1\n    Item 2\nThis is the next sentence. This is the third sentence\n."

Expected output:

"This is a sample sentence with bullets:    Item 1.    Item 2.This is the next sentence. This is the third sentence."

Any help with this is much appreciated.
Thanks, everyone!

6ovsh4lw1#

Try this:

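import re

text = "This is a sample sentence with bullets:\n    Item 1\n    Item 2\nThis is the next sentence. This is the third sentence\n."
# Characters treated as punctuation; adjust to your needs
punct = set('.,;!?\'":-')

def replace(g):
    # g[1] is the character before the newline(s), g[2] the character after
    if g[1] in punct or g[2] in punct:
        # Punctuation on either side: just drop the newline(s)
        return f'{g[1]}{g[2]}'
    else:
        # No punctuation on either side: put a period in their place
        return f'{g[1]}.{g[2]}'

# strip() removes leading/trailing whitespace, including newlines at the edges
text = re.sub(r'(.)\n+(.)', replace, text.strip())
print(text)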

Prints:

This is a sample sentence with bullets:    Item 1.    Item 2.This is the next sentence. This is the third sentence.
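
One thing to watch with the pattern (.)\n+(.): both neighbouring characters are consumed by the match, so a one-character line such as the b in "a\nb\nc" cannot be the right neighbour of one newline and the left neighbour of the next, and the second newline is left untouched. A minimal sketch of a variant that moves the second group into a lookahead, so the following character is inspected but not consumed. The name join_lines is only illustrative and not part of the answer above:

```python
import re

PUNCT = set('.,;!?\'":-')  # same punctuation set as in the answer above

def join_lines(text):
    """Replace runs of newlines with '.', unless punctuation is adjacent."""
    def repl(m):
        before, after = m[1], m[2]
        if before in PUNCT or after in PUNCT:
            return before        # punctuation nearby: drop the newline(s)
        return before + '.'      # otherwise end the line with a period

    # (?=(.)) captures the next character without consuming it,
    # so it is still available as the left neighbour of the next match
    return re.sub(r'(.)\n+(?=(.))', repl, text.strip())

sample = "This is a sample sentence with bullets:\n    Item 1\n    Item 2\nThis is the next sentence. This is the third sentence\n."
print(join_lines(sample))
print(join_lines("a\nb\nc"))
```

For the sample text this should give the same result as the original answer; the difference only shows up with single-character lines.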

j1dl9f462#

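import re

text = "This is a sample sentence with bullets:\n    Item 1\n    Item 2\nThis is the next sentence. This is the third sentence\n."
# Inner sub: '\n' between two non-punctuation characters becomes '.'; outer sub: drop the rest
result = re.sub(r'\n', '', re.sub(r'(?<=[^.,:;!?])\n(?=[^.,:;!?])', '.', text))
print(result)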

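Here we first replace every \n that is surrounded by non-punctuation characters with a ., and then replace all remaining \n with the empty string.
You will need to review the list of punctuation characters in the two character classes and adjust it to suit your needs.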
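
Since the punctuation list is the part most likely to change, here is a hedged sketch that builds both lookarounds from a single string. It assumes that Python's string.punctuation (ASCII punctuation only) is an acceptable default; replace_newlines and its punct parameter are illustrative names, not part of the answer. It also excludes \n itself from the "non-punctuation" class and matches a whole run of newlines at once, so that a blank line does not turn into two periods:

```python
import re
import string

def replace_newlines(text, punct=string.punctuation):
    # Escape each character so that ']', '^', '-' and '\' stay literal inside [...]
    cls = ''.join(re.escape(c) for c in punct)
    # A run of newlines whose neighbours exist and are neither punctuation nor newline
    pattern = rf'(?<=[^{cls}\n])\n+(?=[^{cls}\n])'
    text = re.sub(pattern, '.', text)
    # Any newline still present belongs to a run that touches punctuation
    # or the start/end of the text: remove it
    return text.replace('\n', '')

sample = "This is a sample sentence with bullets:\n    Item 1\n    Item 2\nThis is the next sentence. This is the third sentence\n."
print(replace_newlines(sample))
print(replace_newlines("a-\nb", punct=".,"))  # '-' is not treated as punctuation here
```

Passing a different punct string (for example one that includes non-ASCII punctuation) changes both lookarounds together, which avoids the two character classes drifting apart.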
