numpy: split an np array into n randomly sized parts

yx2lnoni  posted on 2023-03-30  in Other

I have tried several ways to split an np array or list into n randomly sized parts, without success. For example:

x = np.arange(1, 100, 1)
n = 10

How can x be split into n arrays of unequal size?
I tried a Bernoulli distribution. It can work, but it takes too much manual tuning.

b4lqfgs4 #1

You can sample n-1 random split points with np.random.choice (without replacement), sort them, and pass them to np.split:

x = np.arange(1, 100, 1)
n = 10

out = np.split(x, np.sort(np.random.choice(np.arange(1, len(x)),
                                           size=n-1, replace=False)))

Example (seed = 0):

[array([1, 2, 3]),
 array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17]),
 array([18, 19, 20, 21, 22, 23, 24, 25, 26, 27]),
 array([28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,
        45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]),
 array([56]),
 array([57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69]),
 array([70, 71, 72, 73, 74, 75, 76, 77, 78, 79]),
 array([80, 81, 82, 83]),
 array([84, 85, 86]),
 array([87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])]
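
For repeatable results, the same idea can be wrapped in a function that takes a seed. This is only a minimal sketch, assuming NumPy 1.17 or later for np.random.default_rng; the name random_split is made up for illustration and is not part of NumPy. Because the cut positions are drawn without replacement from 1..len(x)-1, every part ends up with at least one element. The sizes will not match the example above, since default_rng uses a different generator than np.random.seed.

```python
import numpy as np

def random_split(x, n, seed=None):
    """Split x into n non-empty contiguous parts of random size (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # n-1 distinct cut positions strictly inside the array, in ascending order
    cuts = np.sort(rng.choice(np.arange(1, len(x)), size=n - 1, replace=False))
    return np.split(x, cuts)

x = np.arange(1, 100)
parts = random_split(x, 10, seed=0)
print([len(p) for p in parts])
```

This requires n <= len(x); otherwise there are not enough distinct cut positions and rng.choice raises a ValueError.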
bkhjykvo #2

I'm fairly sure this script does the job:

import numpy as np

x = np.arange(1, 100, 1)
n = 10

# random weights, scaled so they sum to len(x)
sizes = np.random.rand(n)
sizes /= np.sum(sizes)
sizes *= len(x)

# split points: rounded running totals, bracketed by 0 and len(x)
idx = [0] + [int(np.round(np.sum(sizes[:i+1]))) for i in range(n-1)] + [len(x)]

parts = [x[idx[i]:idx[i+1]] for i in range(n)]

print(parts)

Sample output:

[array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]),
 array([12, 13, 14, 15, 16, 17, 18, 19, 20]),
 array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]),
 array([36, 37]),
 array([38, 39, 40, 41, 42, 43, 44, 45]),
 array([46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58]),
 array([59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69]),
 array([70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81]),
 array([82, 83, 84, 85, 86, 87]),
 array([88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])]
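
The same proportional approach can also be written without the Python loop. The sketch below is an assumed variant, not the answer's original code: it uses np.cumsum for the running totals and np.split for the slicing, with a seeded generator only so the run is repeatable. Note that rounding can make two neighbouring split points coincide, so with this approach some parts may come out empty.

```python
import numpy as np

x = np.arange(1, 100)
n = 10

rng = np.random.default_rng(0)            # seeded only for repeatability
weights = rng.random(n)
sizes = weights / weights.sum() * len(x)  # real-valued sizes that sum to len(x)

# rounded running totals give the n-1 interior split points
cuts = np.round(np.cumsum(sizes)[:-1]).astype(int)
parts = np.split(x, cuts)

print([len(p) for p in parts])
```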
hfwmuf9z #3

You can multiply a Dirichlet sample by the array size to get the chunk sizes, then take the cumulative sum to get the split indices:

import numpy as np

a = np.arange(100)
n = 10

r = np.cumsum(np.random.dirichlet(np.ones(n))*a.size)
chunks = np.split(a,r[:-1].astype(int))

print(*chunks,sep="\n")
[0 1 2 3]
[4]
[5 6 7 8]
[ 9 10 11 12 13 14 15 16 17 18 19]
[20 21 22 23 24]
[25 26 27 28 29]
[30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48]
[49 50 51 52 53 54 55 56]
[57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
 81 82 83 84]
[85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]

If you want every chunk to contain at least one element, reduce the size factor by n and then add 1 to every chunk size:

r = np.cumsum(np.random.dirichlet(np.ones(n))*(a.size-n)+1)
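
Put together, the at-least-one variant might look like the sketch below. The seeded default_rng generator (NumPy 1.17 or later) and the printed chunk lengths are additions for illustration; the split itself follows the lines above. Each real-valued size is at least 1, so consecutive truncated split indices differ by at least 1 and no chunk is empty.

```python
import numpy as np

a = np.arange(100)
n = 10

rng = np.random.default_rng(0)  # seeded only for repeatability
# each chunk gets 1 element plus a Dirichlet share of the remaining a.size - n
r = np.cumsum(rng.dirichlet(np.ones(n)) * (a.size - n) + 1)
chunks = np.split(a, r[:-1].astype(int))

print([len(c) for c in chunks])
```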
