Python: check whether a string matches any line of a txt file and, if it does not, add it to that file

Posted by mznpcxlj on 2023-02-11 in Python
import os

data_memory_file_path = 'data_file.txt'

# Create an empty data file if it does not exist yet
if not os.path.isfile(data_memory_file_path):
    open(data_memory_file_path, 'w').close()

#Example input list with info in sublists
reordered_input_info_lists = [
    [['corre'], ['en el patio'], ['2023-02-05 00:00 am']], 
    [['corre'], ['en el patio'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']], 
    [['salta'], ['en el bosque'], ['2023-02-05 00:00 am']], 
    [['salta'], ['en el patio'], ['2023-02-05 00:00 am']], 
    [['dibuja'], ['en el bosque'], ['2023-02-05 00:00 am']], 
    [['dibuja'], ['en el patio'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]]

# Split the main list into its sublists; each sublist is converted to a string
# that is then checked against the lines that already exist in the .txt file
for info_list in reordered_input_info_lists:
    #I convert the list to string to have it ready to compare it with the lines of the txt file
    info_list_str = repr(info_list)

    #THIS IS WHERE I HAVE THE PROBLEM, AND IT IS WHERE THE CHECK OF THE TXT LINES SHOULD BE

This is the text content of data_file.txt (assume for this example that the file has already been created):

[['analiza'], ['en la oficina'], ['2022-02-05 00:00 am']]
[['corre'], ['en el bosque'], ['2023-02-05 00:00 am']]
[['corre'], ['en el bosque'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['corre'], ['en el patio'], ['2023-02-05 00:00 am']]
[['corre'], ['en el patio'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['dibuja'], ['en el estudio de animación'], ['2023-02-05 00:00 am']]
[['dibuja'], ['en el estudio de animación'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['dibuja'], ['en la escuela'], ['2023-02-05 00:00 am']]
[['dibuja'], ['en la escuela'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]

After adding all the lines that are not already in data_file.txt, the file's contents would look like this:

[['analiza'], ['en la oficina'], ['2022-02-05 00:00 am']]
[['corre'], ['en el bosque'], ['2023-02-05 00:00 am']]
[['corre'], ['en el bosque'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['corre'], ['en el patio'], ['2023-02-05 00:00 am']]
[['corre'], ['en el patio'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['dibuja'], ['en el bosque'], ['2023-02-05 00:00 am']]
[['dibuja'], ['en el estudio de animación'], ['2023-02-05 00:00 am']]
[['dibuja'], ['en el estudio de animación'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['dibuja'], ['en el patio'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['dibuja'], ['en la escuela'], ['2023-02-05 00:00 am']]
[['dibuja'], ['en la escuela'], ['2022-12-29 12:33 am _--_ 2023-01-25 19:13 pm']]
[['salta'], ['en el bosque'], ['2023-02-05 00:00 am']]
[['salta'], ['en el patio'], ['2023-02-05 00:00 am']]

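One important point: the lines in the file must be in alphabetical order. For speed, I do not know whether it is better to sort the lines once at the end (that is, after adding all the necessary lines), or to have the program insert each line at its alphabetical position as it goes, assuming the lines already in the file are sorted.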

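# Print the file's lines in sorted order; the with block closes the file afterwards
with open(data_memory_file_path) as data_memory_file:
    for line in sorted(data_memory_file.readlines()):
        print(line, end='')  # each line already ends with a newline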

Answer #1 (wwwo4jvm)

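Convert the file's contents into a set, union it with the repr of each sublist in the list so that every missing line is added, and finally write the result back to the file in alphabetical order.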

with open(data_memory_file_path) as f:
    file_contents = set(map(str.strip, f))  # str.strip removes the newlines before merging

new_file_contents = file_contents.union(map(repr, reordered_input_info_lists))

with open(data_memory_file_path, 'w') as f:
    for line in sorted(new_file_contents):
        f.write(line + '\n')
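
For anyone who wants to try this without the rest of the script, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: the function name `merge_into_sorted_file`, the UTF-8 encoding, the skipping of blank lines, and the shortened demo data are all hypothetical and not part of the question. It sorts once after merging, which is one of the two options the question asks about.

```python
import os
import tempfile


def merge_into_sorted_file(path, new_lines):
    """Add the lines missing from `path`, then rewrite it in sorted order.

    Hypothetical helper: the name and the UTF-8 encoding are assumptions.
    """
    existing = set()
    if os.path.isfile(path):
        with open(path, encoding="utf-8") as f:
            # Strip newlines and skip blank lines before comparing
            existing = {line.strip() for line in f if line.strip()}

    # The set removes duplicates; sort once, after merging
    merged = sorted(existing.union(new_lines))

    with open(path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in merged)
    return merged


# Small demonstration on a temporary file (shortened, made-up data)
demo_path = os.path.join(tempfile.mkdtemp(), "data_file.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("[['corre'], ['en el patio']]\n[['analiza'], ['en la oficina']]\n")

demo_items = [[['corre'], ['en el patio']], [['salta'], ['en el bosque']]]
demo_result = merge_into_sorted_file(demo_path, map(repr, demo_items))
print(demo_result)
```

Note that `sorted` compares strings by Unicode code point, so the result can differ from a locale-aware alphabetical order when lines contain uppercase letters or accented characters.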
