I am trying to write a one-hot encoding function using numpy:
import numpy as np

def one_hot(indices):
    mapping = dict([(value, key) for key, value in dict(enumerate([y for x in np.unique(np.vstack({tuple(row) for row in indices}), axis=0).tolist() for y in x])).items()])
    for key in mapping.keys():
        indices[indices == key] = mapping[key]
    print(indices)
However, I get the following error:
machine-learning% python3 driver.py
Shape of train set is (216, 13)
Shape of test set is (54, 13)
Shape of train label is (216, 1)
Shape of test labels is (54, 1)
Traceback (most recent call last):
  File "/home/user/Documents/IKP-HomomorphicEncryption/driver.py", line 1109, in <module>
    main()
  File "/home/user/Documents/IKP-HomomorphicEncryption/driver.py", line 1082, in main
    one_hot(X)
  File "/home/user/Documents/IKP-HomomorphicEncryption/driver.py", line 52, in one_hot
    mapping_reversed = dict(enumerate([y for x in np.unique(np.vstack({tuple(row) for row in indices}), axis=0).tolist() for y in x]))
  File "<__array_function__ internals>", line 200, in vstack
  File "/home/user/.local/lib/python3.9/site-packages/numpy/core/shape_base.py", line 296, in vstack
    return _nx.concatenate(arrs, 0, dtype=dtype, casting=casting)
  File "<__array_function__ internals>", line 200, in concatenate
ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 23 and the array at index 1 has size 4
I understand that this means the dimensions do not match, but when I print the data, all rows appear to have the same length.
1 Answer
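You can pass return_inverse=True to np.unique as a starting point. In addition to the sorted unique values, it returns for each input element the index of its value in that sorted array, which can be used directly as a category index.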
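The answer's original code and output are not preserved here, so the following is only a minimal sketch of how the return_inverse approach might look. The helper name one_hot, its return values, and the sample labels are assumptions made for illustration. It encodes a single column of labels; a feature matrix such as X would presumably need this applied column by column.

```python
import numpy as np

def one_hot(values):
    """One-hot encode a single column of labels (illustrative helper, not the answer's original code)."""
    values = np.asarray(values).ravel()
    # categories: sorted unique values
    # inverse: for each element, the index of its value in categories
    categories, inverse = np.unique(values, return_inverse=True)
    inverse = inverse.ravel()  # keep the index array 1-D regardless of numpy version
    encoded = np.zeros((values.size, categories.size), dtype=int)
    # Set exactly one 1 per row, in the column of that row's category
    encoded[np.arange(values.size), inverse] = 1
    return categories, encoded

# Hypothetical sample data
labels = np.array(["b", "a", "c", "a"])
categories, encoded = one_hot(labels)
print(categories)
print(encoded)
```

Returning categories alongside the encoded matrix keeps the mapping from column position back to the original label, which avoids building and inverting a dictionary by hand.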