Regex in Python: swap a quoted word at a random position with the last word

Posted by wz1wpwve on 2022-11-18 in Python

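I have a txt file containing lines of text like the ones below, and I want to swap the word in quotes with the last word, which is separated from the rest of the line by a tab.

It looks like this: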

This "is" a person    are
She was not "here"    right
"The" pencil is not sharpened    a

Desired output:

This "are" a person   is
She was not "right"   here

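Some ideas:

#1: Use numpy

1. Split all the words on whitespace with numpy -> ['This','"is"','a','person',\t,'are']

Problems:

1. How do I tell Python the position of the word in quotes?
2. How do I convert the list back into normal text? Do I join everything?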
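For the two questions above, here is a minimal sketch that needs no numpy: `enumerate()` gives the position of the quoted word, and `" ".join()` turns the list back into text. The helper name `swap_quoted_with_last` is made up for illustration, and the sketch assumes every line contains a tab before its last word.

```python
def swap_quoted_with_last(line):
    # Hypothetical helper; assumes the line contains at least one tab.
    sentence, last = line.rstrip("\n").rsplit("\t", 1)
    words = sentence.split(" ")
    for i, word in enumerate(words):
        # enumerate() yields the position i alongside each word
        if len(word) >= 2 and word[0] == '"' and word[-1] == '"':
            # Put the last word inside quotes, keep the unquoted word for the end
            words[i], last = f'"{last}"', word[1:-1]
            break
    # " ".join() converts the list back into ordinary text
    return " ".join(words) + "\t" + last


print(repr(swap_quoted_with_last('This "is" a person\tare')))
# 'This "are" a person\tis'
```

A line with no quoted word comes back unchanged, because the loop never swaps anything.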

#2: Use regex

1. Use a regex to find the word inside ""

import re

with open('readme.txt', 'r') as x:
    lines = x.readlines()
swap = lines[-1]
re.findall(r'"(\w+)"', swap)

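Problems:

1. I don't know how to apply a regex to the contents of a txt file. Most of the examples I have seen here assign the whole sentence to a variable. Is it something like this?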

with open('readme.txt') as f:
    lines = f.readlines()

    lines.findall(....)

Thank you all.

vmdwslir #1

Try:

import re

pat = re.compile(r'"([^"]*)"(.*\t)(.*)')

with open("your_file.txt", "r") as f_in:
    for line in f_in:
        print(pat.sub(r'"\3"\2\1', line.rstrip()))

Prints:

This "are" a person     is
She was not "right"     here
"a" pencil is not sharpened     The
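To see why the pattern `"([^"]*)"(.*\t)(.*)` does the swap, it can help to look at what each group captures. This is a minimal sketch on an in-memory string (the sample line comes from the question, so no file is needed). Each line read from a file is an ordinary `str`, so the same calls work inside a `for line in f` loop.

```python
import re

pat = re.compile(r'"([^"]*)"(.*\t)(.*)')

line = 'This "is" a person\tare'

# Group 1: the word inside the quotes
# Group 2: everything after the closing quote up to and including the last tab
# Group 3: whatever follows that tab (the last word)
print(pat.search(line).groups())  # ('is', ' a person\t', 'are')

# The replacement puts group 3 inside the quotes and group 1 at the end.
swapped = pat.sub(r'"\3"\2\1', line)
print(repr(swapped))  # 'This "are" a person\tis'
```

Text before the opening quote is not part of the match, so `sub` leaves it alone. A line with no quoted word or no tab does not match and is returned unchanged.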

8fsztsew #2

You really don't need re for something this small.
Assuming you want to rewrite the file in place:

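with open('foo.txt', 'r+') as txt:
    lines = txt.readlines()
    for k, line in enumerate(lines):
        words = line.split()
        if not words:
            lines[k] = ''  # keep blank lines blank instead of failing on words[-1]
            continue
        for i, word in enumerate(words[:-1]):
            # A quoted word starts and ends with a double quote
            if word[0] == '"' and word[-1] == '"':
                words[i] = f'"{words[-1]}"'
                words[-1] = word[1:-1]
                break
        # Rebuild the line: words joined by spaces, last word after a tab
        lines[k] = ' '.join(words[:-1]) + f'\t{words[-1]}'
    # Overwrite the file from the start and cut off any leftover content
    txt.seek(0)
    print(*lines, sep='\n', file=txt)
    txt.truncate()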

kx1ctssn #3

This is my solution:
regex = r

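import re

# Matches the quoted word together with its quotes
regex = r'"[^"]*"'

with open('test.txt', 'r') as file1:
    # Iterating over the file yields one line at a time until end of file
    for line in file1:
        line = line.strip()
        if '\t' not in line:
            continue  # nothing to swap without a tab-separated last word
        sentence, get_tab = line.rsplit('\t', 1)
        match = re.search(regex, sentence)
        if match is None:
            continue  # no quoted word on this line
        # Put the last word inside the quotes and move the quoted word to the end
        modified = re.sub(regex, lambda m: f'"{get_tab}"', sentence, count=1) + '\t' + match.group()[1:-1]
        print("original: {} mod ----> {}".format(line, modified))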

mqkwyuun #4

I guess this is another way to solve it.
Contents of the input file readme.txt:

This "is" a person  are
She was not "here"  right
"The" pencil is not sharpened   a

Code:

import re

changed_text = []
with open('readme.txt') as x:
    for line in x:
        splitted_text = line.strip().split("\t")  # ['This "is" a person', 'are'] etc.
        match = re.search(r'"(.*)"', splitted_text[0])
        if match and len(splitted_text) > 1:  # If a quote is found
            quoted_text = match.group(1)
            # Replace the word together with its quotes so that the same letters
            # inside other words (e.g. the "is" in "This") are left alone
            swapped = splitted_text[0].replace(f'"{quoted_text}"', f'"{splitted_text[1]}"')
            changed_text.append(swapped + "\t" + quoted_text)
with open('readme.txt.modified', 'w') as x:
    for line in changed_text:
        print(line)
        x.write(line + "\n")

Result (readme.txt.modified):

This "are" a person    is
She was not "right"    here
"a" pencil is not sharpened    The

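One thing to watch when doing the substitution with `str.replace`: it replaces every occurrence of the substring, not only the quoted one. The short sketch below uses the first sample line from the question to show the difference; the variable names are made up for illustration.

```python
sentence, last = 'This "is" a person', 'are'
quoted = 'is'

# Replacing the bare word also hits the "is" inside "This".
naive = sentence.replace(quoted, last)
print(naive)  # Thare "are" a person

# Including the quotes in the search string limits the match to the quoted word.
safe = sentence.replace(f'"{quoted}"', f'"{last}"')
print(safe)   # This "are" a person
```

If the same quoted word could appear more than once on a line, passing a count (`sentence.replace(old, new, 1)`) limits the change to the first occurrence.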