Python: joining split words and punctuation with the punctuation in the right place

Posted by v1l68za4 on 2023-02-21 in Python

I am trying to use join() after splitting a string into words and punctuation, but it puts a space between every word and punctuation mark.
b = ['Hello', ',', 'who', 'are', 'you', '?']
c = " ".join(b)
But this gives:
c = 'Hello , who are you ?'
I want:
c = 'Hello, who are you?'

aemubtdh #1

You can attach the punctuation to the preceding word first:

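def join_punctuation(seq, characters='.,;?!'):
    # Yield tokens with any following punctuation attached to them.
    characters = set(characters)
    seq = iter(seq)
    try:
        current = next(seq)
    except StopIteration:
        # Empty input: yield nothing. A bare next() here would surface
        # as RuntimeError inside a generator on Python 3.7+ (PEP 479).
        return

    for nxt in seq:
        if nxt in characters:
            # Punctuation token: glue it onto the current word.
            current += nxt
        else:
            yield current
            current = nxt

    yield current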

c = ' '.join(join_punctuation(b))

The join_punctuation generator yields strings that already have any following punctuation attached:

>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> list(join_punctuation(b))
['Hello,', 'who', 'are', 'you?']
>>> ' '.join(join_punctuation(b))
'Hello, who are you?'
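To see the approach in a full round trip, here is a minimal sketch that first splits a sentence into word and punctuation tokens and then rejoins them. The tokenizing pattern and the helper names `tokenize` and `rejoin` are assumptions for illustration, not part of the answer. The joining logic mirrors the generator above, written with a list so that empty input needs no special handling.

```python
import re

def tokenize(text):
    # Hypothetical splitter: runs of word characters, or single
    # characters that are neither word characters nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)

def rejoin(tokens, characters=".,;?!"):
    # Attach each punctuation token to the piece before it,
    # then join the resulting pieces with single spaces.
    characters = set(characters)
    pieces = []
    for token in tokens:
        if pieces and token in characters:
            pieces[-1] += token
        else:
            pieces.append(token)
    return " ".join(pieces)

tokens = tokenize("Hello, who are you?")
print(tokens)
print(rejoin(tokens))
```

Because each punctuation token is appended to the last piece, consecutive marks such as `.`, `.`, `.` collapse onto the same word.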

4ioopgfo #2

Maybe something like this:

>>> from string import punctuation
>>> punc = set(punctuation) # or whatever special chars you want
>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> ''.join(w if set(w) <= punc else ' '+w for w in b).lstrip()
'Hello, who are you?'

This adds a space before every word in b that does not consist entirely of punctuation.
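The subset test `set(w) <= punc` is what decides whether a token is glued to the previous one. The sketch below unrolls the one-liner into a function to make that visible; the name `join_tokens` is an assumption for illustration. Multi-character punctuation such as `...` passes the test too, and so does an empty string, because the empty set is a subset of every set.

```python
from string import punctuation

PUNC = set(punctuation)

def join_tokens(tokens, punc=PUNC):
    parts = []
    for w in tokens:
        if set(w) <= punc:
            # Every character of w is punctuation: no space in front.
            parts.append(w)
        else:
            parts.append(" " + w)
    # The first word also received a leading space; strip it.
    return "".join(parts).lstrip()

print(join_tokens(['Hello', ',', 'who', 'are', 'you', '?']))
print(join_tokens(['Wait', '...', 'what', '?!']))
```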

swvgeqrz #3

Do this after you have the joined result. It is not a complete solution, but it works...

import re

c = re.sub(r' ([^A-Za-z0-9])', r'\1', c)

Output:

>>> import re
>>> c = 'Hello , who are you ?'
>>> c = re.sub(r' ([^A-Za-z0-9])', r'\1', c)
>>> c
'Hello, who are you?'
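The character class `[^A-Za-z0-9]` matches anything that is not an ASCII letter or digit, so it also removes the space before opening brackets and before words that start with an accented letter. A narrower variant, sketched below, lists the trailing punctuation explicitly. The function name `tighten` and the chosen set `.,;:?!` are assumptions, not part of the answer.

```python
import re

def tighten(text, closing=".,;:?!"):
    # Build a character class from the given marks; re.escape keeps
    # characters such as '.' and '?' literal inside the class.
    pattern = r"\s+([" + re.escape(closing) + r"])"
    return re.sub(pattern, r"\1", text)

print(tighten("Hello , who are you ?"))
print(tighten("a ( b ) , c"))
# For comparison, the broader pattern also closes up the brackets:
print(re.sub(r" ([^A-Za-z0-9])", r"\1", "a ( b ) , c"))
```

The trade-off is that every mark to be tightened has to be listed in `closing`, while the broader pattern needs no list but touches more than trailing punctuation.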

vsmadaxz #4

Building on the answer of Martijn Pieters, I generalized it slightly to cover languages in which punctuation can also appear at the start of a word.

from string import punctuation

def join_punctuation(
    seq,
    characters_after=punctuation,
    characters_before="¡¿"
):
    characters_after = set(characters_after)
    characters_before = set(characters_before)
    seq = iter(seq)
    try:
        current = next(seq)
    except StopIteration:
        return  # empty input: yield nothing

    for nxt in seq:
        if current in characters_before:
            # Opening mark: attach the next token to it.
            current += nxt
        elif nxt in characters_after:
            # Trailing mark: attach it to the current token.
            current += nxt
        else:
            yield current
            current = nxt

    yield current

It works the same way:

>>> b = ["Hola", ",", "¿", "Qué", "tal", "?"]
>>> list(join_punctuation(b))
['Hola,', '¿Qué', 'tal?']
>>> " ".join(join_punctuation(b))
'Hola, ¿Qué tal?'

pkln4tw6 #5

How about:

c = " ".join(b).replace(" ,", ",")
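As written, this only fixes commas, so the question's example would still end in `you ?`. A minimal sketch that extends the same idea to several marks with a loop follows. The helper name `join_with_replace` and the default marks are assumptions for illustration.

```python
def join_with_replace(tokens, marks=".,;?!"):
    text = " ".join(tokens)
    for mark in marks:
        # Remove the single space that join() put before this mark.
        text = text.replace(" " + mark, mark)
    return text

b = ['Hello', ',', 'who', 'are', 'you', '?']
print(" ".join(b).replace(" ,", ","))  # comma only
print(join_with_replace(b))
```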
