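So I tried using join() after splitting a string into words and punctuation, but it joins the pieces with a space between each word and the punctuation that follows it.
b = ['Hello', ',', 'who', 'are', 'you', '?']
c = " ".join(b)
But that gives:
c = 'Hello , who are you ?'
What I want is:
c = 'Hello, who are you?'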
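def join_punctuation(seq, characters='.,;?!'):
    characters = set(characters)
    seq = iter(seq)
    try:
        current = next(seq)
    except StopIteration:
        # Empty input: nothing to yield.
        return
    for nxt in seq:
        if nxt in characters:
            # Glue the punctuation token onto the preceding token.
            current += nxt
        else:
            yield current
            current = nxt
    yield current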
c = ' '.join(join_punctuation(b))
The join_punctuation generator yields strings with any following punctuation already joined on:
>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> list(join_punctuation(b))
['Hello,', 'who', 'are', 'you?']
>>> ' '.join(join_punctuation(b))
'Hello, who are you?'
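To see the generator in a full round trip, here is a minimal sketch. The `tokenize` helper and its regular expression are assumptions made for illustration; they only stand in for whatever splitting step produced `b` in the question.

```python
import re


def tokenize(text):
    # Hypothetical helper (not from the question): match runs of word
    # characters, or any single character that is neither word nor space.
    return re.findall(r"\w+|[^\w\s]", text)


def join_punctuation(seq, characters='.,;?!'):
    # Same logic as the generator above, repeated so this block runs alone.
    characters = set(characters)
    seq = iter(seq)
    try:
        current = next(seq)
    except StopIteration:
        return
    for nxt in seq:
        if nxt in characters:
            current += nxt
        else:
            yield current
            current = nxt
    yield current


text = 'Hello, who are you?'
tokens = tokenize(text)
rebuilt = ' '.join(join_punctuation(tokens))
print(tokens)   # ['Hello', ',', 'who', 'are', 'you', '?']
print(rebuilt)  # Hello, who are you?
```

Only the characters listed in `characters` are attached to the previous token, so anything else (an opening quote, for example) would still be separated by a space.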
>>> from string import punctuation
>>> punc = set(punctuation) # or whatever special chars you want
>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> ''.join(w if set(w) <= punc else ' '+w for w in b).lstrip()
'Hello, who are you?'
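The `set(w) <= punc` test in the one-liner is a subset check: it is true only when every character of `w` is punctuation. The sketch below spells the same idea out as a function; the name `join_tokens` is made up for this example.

```python
from string import punctuation

punc = set(punctuation)


def join_tokens(tokens):
    # Hypothetical name; same idea as the one-liner above.
    parts = []
    for w in tokens:
        if set(w) <= punc:
            # Every character is punctuation: attach it directly.
            parts.append(w)
        else:
            # Otherwise put a space in front of the word.
            parts.append(' ' + w)
    # Drop the space that was added before the first word.
    return ''.join(parts).lstrip()


print(set('?!') <= punc)     # True
print(set("don't") <= punc)  # False
print(join_tokens(['Wait', '...', 'what', '?!']))  # Wait... what?!
```

One difference worth noting: multi-character tokens such as '...' pass the subset test, whereas the generator in the first answer compares each whole token against a set of single characters.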
from string import punctuation

def join_punctuation(
    seq,
    characters_after=punctuation,
    characters_before="¡¿"
):
    characters_after = set(characters_after)
    characters_before = set(characters_before)
    seq = iter(seq)
    try:
        current = next(seq)
    except StopIteration:
        # Empty input: nothing to yield.
        return
    for nxt in seq:
        if current in characters_before:
            # Opening punctuation such as "¿" attaches to the next token.
            current += nxt
        elif nxt in characters_after:
            # Trailing punctuation attaches to the preceding token.
            current += nxt
        else:
            yield current
            current = nxt
    yield current
5 answers
aemubtdh1#
You could join up the punctuation first:
4ioopgfo2#
Maybe something like this:
This adds a space before each word in b that is not made up entirely of punctuation.
swvgeqrz3#
This gets you the result. It is not thorough, but it works...
Output:
vsmadaxz4#
Building on the answer of Martijn Pieters♦, I generalized it a little for languages in which punctuation can also appear at the start of a word.
It works the same way:
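As an illustration only, here is a short sketch with a made-up Spanish token list. The function body follows the generalized version in this answer, repeated so the block runs on its own; the sample tokens are an assumption, not taken from the original answer.

```python
from string import punctuation


def join_punctuation(seq, characters_after=punctuation, characters_before="¡¿"):
    # Repeated from the answer so this block is self-contained.
    characters_after = set(characters_after)
    characters_before = set(characters_before)
    seq = iter(seq)
    try:
        current = next(seq)
    except StopIteration:
        return
    for nxt in seq:
        if current in characters_before:
            # "¿" or "¡" on its own: attach the following word to it.
            current += nxt
        elif nxt in characters_after:
            # Trailing punctuation: attach it to the current word.
            current += nxt
        else:
            yield current
            current = nxt
    yield current


# Made-up sample tokens for illustration.
b = ['¿', 'Quién', 'eres', '?', '¡', 'Hola', '!']
print(list(join_punctuation(b)))      # ['¿Quién', 'eres?', '¡Hola!']
print(' '.join(join_punctuation(b)))  # ¿Quién eres? ¡Hola!
```

Because `string.punctuation` only covers ASCII punctuation, the inverted marks are not treated as trailing characters, which is why they need the separate `characters_before` set.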
pkln4tw65#
How about