numpy: How can I access dictionary values faster from a list of arrays in Python?

Posted by sxissh06 11 months ago in Python

I would like to know how to quickly look up dictionary values for the elements of a list of arrays.
Here is my toy example:

import numpy as np

my_list = [np.array([1, 2, 3, 4, 5, 6, 8, 9, 10]), np.array([1, 3, 5, 6, 7, 10]), np.array([1, 2, 3, 4, 6, 8, 9, 10]), np.array([1, 3, 4, 7, 15]), np.array([1, 2, 4, 5, 10, 16]), np.array([6, 10, 15])]

my_dict = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 2, 11: 2, 12: 2, 13: 2, 14: 2, 15: 3, 16: 3}


  • Each key in my_dict corresponds to a value that appears in the arrays of the list my_list.

I use the following code to get the desired output set:

unique_string_parents = {' '.join(map(str, set(map(my_dict.get, sublist)))) for sublist in my_list}
# output: {'0 1 2', '0 1 3', '0 1 2 3', '1 2 3'}


Suppose I have a much larger my_list and my_dict, which can be found here: https://gitlab.com/Schrodinger168/practice/-/tree/master/practice_dictionary
Here is the code that reads the real files:

import ast

import numpy as np

file_list = 'list_array.txt' 
with open(file_list, 'r') as file:
    lines = file.readlines()

my_list= [np.array(list(map(int, line.split()))) for line in lines]

file_dictionary = "dictionary_example.txt"
with open(file_dictionary, 'r') as file_dict:
    content = file_dict.read()

my_dict = ast.literal_eval(content)


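I used the one-liner shown above; it takes about 16.70 seconds to produce the desired output. How can I speed this up, or is there another algorithm that would give me the result faster in this case?
Any help or suggestions would be appreciated. Thank you very much!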

Answer #1 (by 9w11ddsr)

Looking at your code and my_dict, I think you should use an np.array instead of a dictionary. Also, convert the numbers to strings only at the very end:

# convert my_dict to np.array
arr = np.array([my_dict[i] for i in range(1, max(my_dict) + 1)])

# create the final set:
out = {frozenset(arr[l - 1]) for l in my_list}

# optionally, convert to string:
out = [" ".join(map(str, s)) for s in out]
print(out)

Prints:

['1 2 3', '0 1 3', '0 1 2', '0 1 2 3']
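The lookup array above relies on the keys of my_dict being exactly 1, 2, ..., max(my_dict). As a minimal sketch for the case where some keys are missing, the array can be filled key by key instead. The helper name `build_lookup` and the `missing` sentinel are hypothetical and not part of the original answer, and the sketch assumes all keys are non-negative integers:

```python
import numpy as np

my_list = [
    np.array([1, 2, 3, 4, 5, 6, 8, 9, 10]),
    np.array([1, 3, 5, 6, 7, 10]),
    np.array([1, 2, 3, 4, 6, 8, 9, 10]),
    np.array([1, 3, 4, 7, 15]),
    np.array([1, 2, 4, 5, 10, 16]),
    np.array([6, 10, 15]),
]
my_dict = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1,
           10: 2, 11: 2, 12: 2, 13: 2, 14: 2, 15: 3, 16: 3}


def build_lookup(mapping, missing=-1):
    """Hypothetical helper: array where lut[key] == mapping[key].

    Positions with no corresponding key hold `missing`.
    Assumes non-negative integer keys and integer values.
    """
    lut = np.full(max(mapping) + 1, missing, dtype=np.int64)
    keys = np.fromiter(mapping.keys(), dtype=np.int64, count=len(mapping))
    vals = np.fromiter(mapping.values(), dtype=np.int64, count=len(mapping))
    lut[keys] = vals  # integer-array indexing assigns all pairs at once
    return lut


lut = build_lookup(my_dict)

# lut[a] maps a whole array in one vectorized step (no "- 1" offset needed,
# because position 0 is simply left unused here).
groups = {frozenset(lut[a].tolist()) for a in my_list}

# Sort inside each group and across groups so the output order is stable.
labels = sorted(" ".join(map(str, sorted(g))) for g in groups)
print(labels)  # ['0 1 2', '0 1 2 3', '0 1 3', '1 2 3']
```

Sorting before joining is a design choice: set iteration order is not guaranteed, so the unsorted version can print the same groups in a different order from run to run.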


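EDIT: With the files you posted in the question, you can try the following (on my machine this takes 11 seconds versus 19 seconds for the original):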

import ast

file_list = "list_array.txt"
file_dictionary = "dictionary_example.txt"

with open(file_dictionary, "r") as file_dict:
    my_dict = ast.literal_eval(file_dict.read())

# convert keys back to string (to not convert the lines to int)
my_dict = {str(k): v for k, v in my_dict.items()}

out = set()
with open(file_list, "r") as file:
    for line in file:
        out.add(frozenset(my_dict[i] for i in line.split()))

out = [" ".join(map(str, s)) for s in out]
print(out)
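To try the string-key idea without downloading the files, here is a small sketch that uses io.StringIO as a stand-in for list_array.txt. The three sample lines are taken from the toy example in the question, not from the real data, and the name `str_dict` is only illustrative:

```python
import io

my_dict = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1,
           10: 2, 11: 2, 12: 2, 13: 2, 14: 2, 15: 3, 16: 3}

# String keys let each token from line.split() be looked up directly,
# so no int() conversion is needed per number.
str_dict = {str(k): v for k, v in my_dict.items()}

# Stand-in for the real file: one whitespace-separated array per line.
sample = io.StringIO("1 2 3 4 5 6 8 9 10\n1 3 5 6 7 10\n6 10 15\n")

out = set()
for line in sample:
    out.add(frozenset(str_dict[tok] for tok in line.split()))

labels = sorted(" ".join(map(str, sorted(s))) for s in out)
print(labels)  # ['0 1 2', '1 2 3']
```

The first two sample lines map to the same group, so the set keeps only two entries.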


EDIT 2: Comparison of the two methods:

import ast
from timeit import timeit

import numpy as np

# read sample data:

file_list = "list_array.txt"
file_dictionary = "dictionary_example.txt"

with open(file_dictionary, "r") as file_dict:
    my_dict = ast.literal_eval(file_dict.read())

my_list = []
with open(file_list, "r") as file:
    for line in file:
        my_list.append(np.array(list(map(int, line.split()))))

def fn1(my_list, my_dict):
    arr = np.array([my_dict[i] for i in range(1, max(my_dict) + 1)])
    out = {frozenset(np.unique(arr[l - 1])) for l in my_list}
    return [" ".join(map(str, s)) for s in out]

def fn2(my_list, my_dict):
    out = {frozenset(np.unique(np.vectorize(my_dict.get)(l))) for l in my_list}
    return [" ".join(map(str, s)) for s in out]

assert len(fn1(my_list, my_dict)) == len(fn2(my_list, my_dict))

t1 = timeit("fn1(my_list, my_dict)", number=1, globals=globals())
t2 = timeit("fn2(my_list, my_dict)", number=1, globals=globals())

print(t1)
print(t2)


Prints (time in seconds for fn1, then fn2):

1.7291587160434574
11.316183750052005
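If the per-array Python loop in fn1 is still too slow, one further variation is to do a single lookup over all arrays at once and encode each group as a bitmask. This is only a sketch and is not something that was timed above. It assumes the dictionary values are integers between 0 and 63, that the keys are contiguous from 1 (as in fn1), and that no array in the list is empty. The function name `unique_group_masks` is hypothetical:

```python
import numpy as np

my_list = [
    np.array([1, 2, 3, 4, 5, 6, 8, 9, 10]),
    np.array([1, 3, 5, 6, 7, 10]),
    np.array([1, 2, 3, 4, 6, 8, 9, 10]),
    np.array([1, 3, 4, 7, 15]),
    np.array([1, 2, 4, 5, 10, 16]),
    np.array([6, 10, 15]),
]
my_dict = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1,
           10: 2, 11: 2, 12: 2, 13: 2, 14: 2, 15: 3, 16: 3}


def unique_group_masks(arrays, mapping):
    """Hypothetical helper: one bitmask per distinct set of mapped values.

    Assumes keys are 1..max(mapping), values are in 0..63,
    and every array has at least one element.
    """
    arr = np.array([mapping[i] for i in range(1, max(mapping) + 1)])
    lengths = np.array([len(a) for a in arrays])
    flat = np.concatenate(arrays)

    # One vectorized lookup for every element of every array.
    mapped = arr[flat - 1].astype(np.uint64)
    bits = np.left_shift(np.uint64(1), mapped)  # value v -> bit v

    # OR the bits of each original array together; starts[i] is the
    # offset of array i inside the flattened data.
    starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))
    masks = np.bitwise_or.reduceat(bits, starts)
    return np.unique(masks)


masks = unique_group_masks(my_list, my_dict)

# Decode each mask back into the "0 1 2" style strings.
labels = [" ".join(str(b) for b in range(64) if (int(m) >> b) & 1)
          for m in masks]
print(labels)  # ['0 1 2', '0 1 3', '1 2 3', '0 1 2 3']
```

Two arrays yield the same mask exactly when they map to the same set of values, so np.unique on the masks plays the role of the set of frozensets. Whether this is actually faster on the real files would need to be measured, for example with timeit as above.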
