Python: remove number patterns from a string

Posted by p3rjfoxz on 2023-01-16 in Python


I have a conversation that looks like this:

s = "1) Person Alpha:\nHello, how are you doing?\n\n1) Human:\nGreat, thank you.\n\n2) Person Alpha:\nHow is the weather?\n\n2) Human:\nThe weather is good."

1) Person Alpha:
Hello, how are you doing?

1) Human:
Great, thank you.

2) Person Alpha:
How is the weather?

2) Human:
The weather is good.

I want to remove the enumeration at the start of each turn to get the following result:

s = "Person Alpha:\nHello, how are you doing?\n\nHuman:\nGreat, thank you.\n\nPerson Alpha:\nHow is the weather?\n\nHuman:\nThe weather is good."

Person Alpha:
Hello, how are you doing?

Human:
Great, thank you.

Person Alpha:
How is the weather?

Human:
The weather is good.

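My idea is to search the text for 1), 2), 3), ... and replace each with an empty string. That might work, but it is inefficient and fragile (for example, it becomes a problem if "1)" also appears inside the dialogue text itself).
Is there a better or more elegant way to do this?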

python

Source: https://stackoverflow.com/questions/75122503/remove-number-patterns-from-string

5 answers


czq61nw11#

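One approach is to split the input string on newline characters with the split() method. You can then iterate over the resulting list of lines and check whether each line starts with a number followed by a closing parenthesis and a space. If it does, remove that prefix. Finally, join the modified lines with newline characters to get the final output.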

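import re

s = "1) Person Alpha:\nHello, how are you doing?\n\n1) Human:\nGreat, thank you.\n\n2) Person Alpha:\nHow is the weather?\n\n2) Human:\nThe weather is good."

lines = s.split("\n")
for i in range(len(lines)):
    # re.match anchors at the start of the line, so no "^" is needed
    m = re.match(r"\d+\) ", lines[i])
    if m:
        # Slice off exactly the matched prefix, however many digits it has
        lines[i] = lines[i][m.end():]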

s = "\n".join(lines)
print(s)
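The same split, check, and join steps can also be sketched without the re module. This is only an illustration: the helper name drop_prefix and the sample string are made up for this example and are not part of the answer.

```python
def drop_prefix(line: str) -> str:
    """Return line without a leading '<digits>) ' prefix, if it has one."""
    # Split at the first ") "; head is whatever precedes it.
    head, sep, rest = line.partition(") ")
    # Strip only when the separator was found and head is all ASCII digits.
    if sep and head.isascii() and head.isdigit():
        return rest
    return line


# Hypothetical sample with a two-digit number and a ") " inside a sentence.
s = "1) Person Alpha:\nHello (again) there\n\n10) Human:\nFine."
result = "\n".join(drop_prefix(line) for line in s.split("\n"))
print(result)
```

Because the check looks at everything before the first ") ", a line such as "Hello (again) there" is left alone: its head is not purely digits.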

Posted 2023-01-16

8qgya5xd2#

Use a regular expression to replace every number that is followed by a closing parenthesis:

import re
# Raw string avoids the invalid "\)" escape; "+" also covers multi-digit numbers
s = re.sub(r"[0-9]+\) ", "", s)

This outputs:

Person Alpha:
Hello, how are you doing?

Human:
Great, thank you.

Person Alpha:
How is the weather?

Human:
The weather is good.

Alternatively, if you do not want to risk replacing something inside the dialogue, you can require a \n in front of each number pattern:

import re
# [3:] drops the leading "1) ", which has no newline in front of it
s = re.sub(r"\n[0-9]+\) ", "\n", s)[3:]

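Note that the string does not start with \n, so the first pattern is removed manually by slicing off the first 3 characters.
The output is the same as above.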
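To make the risk concrete, here is a small sketch using a made-up dialogue (the string text is hypothetical, not taken from the question) in which "1) " also appears in the middle of a sentence. It compares the unanchored pattern with one anchored to the start of each line through the inline (?m) flag, which is a different way of anchoring than the \n prefix used above.

```python
import re

# Hypothetical dialogue: "1) " and "2) " also occur inside a spoken line.
text = (
    "1) Human:\nI have two options: 1) stay or 2) leave.\n\n"
    "2) Person Alpha:\nI see."
)

# Unanchored: removes every "<digits>) ", including those inside the sentence.
unanchored = re.sub(r"[0-9]+\) ", "", text)

# Anchored with (?m)^: removes the pattern only at the start of a line.
anchored = re.sub(r"(?m)^[0-9]+\) ", "", text)

print(unanchored)
print("---")
print(anchored)
```

With the anchored version there is also no need to slice off the first match by hand, since ^ matches at the very start of the string as well.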

Posted 2023-01-16

5rgfhyps3#

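What do you mean by inefficient?
Do you want to avoid a loop for performance reasons? Please describe in more detail what you have tried, and what you do and do not want to do.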

Posted 2023-01-16

d7v8vwbk4#

I would suggest a version similar to @Always Sunny's, but using re.sub, which is easier to read and works for any number of digits before the parenthesis:

import re

s = "1) Person Alpha:\nHello, how are you doing?\n\n1) Human:\nGreat, thank you.\n\n2) Person Alpha:\nHow is the weather?\n\n2) Human:\nThe weather is good."

lines = s.split("\n")
for i in range(len(lines)):
    # Remove a leading "<digits>) " from this line, if present
    lines[i] = re.sub(r"^[0-9]+\) ", "", lines[i])

s = "\n".join(lines)
print(s)

Posted 2023-01-16

iezvtpos5#

This can be done with a regular expression from the re module, as follows:

import re
s = re.sub(r'^\d+\)\s*', '', s, flags=re.M)

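This line uses the multiline regex flag so that ^, which normally matches only at the start of the string, also matches right after every newline. The regex first looks for one or more digits (\d+), then a closing parenthesis (\)), then zero or more whitespace characters (\s*). Every occurrence of that pattern is then replaced with an empty string.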
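As a rough sketch, the same idea can be wrapped in a small reusable helper. The names _ENUM_PREFIX and strip_enumeration and the sample string are assumptions made for illustration only. The sketch also swaps \s* for [ \t]*, because \s matches newlines too, so a line containing only "1)" would otherwise be merged into the next line.

```python
import re

# Hypothetical names. The pattern mirrors the one in the answer, except that
# [ \t]* replaces \s* so the substitution never consumes a newline.
_ENUM_PREFIX = re.compile(r"^\d+\)[ \t]*", flags=re.MULTILINE)


def strip_enumeration(text: str) -> str:
    """Remove a leading '<digits>)' marker from every line of text."""
    return _ENUM_PREFIX.sub("", text)


# Made-up sample: a multi-digit marker and a "3) " in the middle of a line.
sample = "1) Person Alpha:\nHello\n\n12) Human:\nSee item 3) below."
cleaned = strip_enumeration(sample)
print(cleaned)
```

Markers that do not start a line, such as the "3) " in the last sentence, are left untouched because of the ^ anchor.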

Posted 2023-01-16
