Printing a truncated numpy array

Asked by xhv8bpkk 11 months ago

import numpy as np

print({n: np.random.randn(50, 50, 50) for n in 'abcd'})

This prints a huge mess. In MATLAB, the equivalent displays

    "a" ⟼ {50×50×50 double}
    "b" ⟼ {50×50×50 double}
    "c" ⟼ {50×50×50 double}
    "d" ⟼ {50×50×50 double}


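Is there an option to display arrays in a "summary mode"? If not, is there a way to cap the output at a maximum number of characters?
The simplest thing I can think of is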

s = np.array2string(np.random.randn(50, 50, 50))
print(s[:200] + ' ...\n' + s[-200:])


which gives

[[[-2.20155640e+00  9.73402046e-01 -1.57076829e+00 ...  1.71643279e-01
   -6.92943943e-01 -1.8014576
...
141624e+00 -3.51233625e-01  8.39485094e-01 ...  5.81172472e-01
   -1.59288538e+00  6.68331170e-01]]]


But that is like tearing a page in half. Something like this would read better:

[[[-2.20155640e+00  9.73402046e-01 -1.57076829e+00 ...  1.71643279e-01
   -6.92943943e-01 ...
   ...
 [[ ... -3.51233625e-01  8.39485094e-01 ...  5.81172472e-01
   -1.59288538e+00  6.68331170e-01]]]


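but that would require writing a lot of code by hand on top of the array2string output.
Is there built-in support for this kind of truncation? I did not find it in np.printoptions. I will accept "no" as an answer (if that is the answer).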

Answer #1 (gudnpqoy)

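np.printoptions does not support truncating an array differently along different dimensions, so the desired truncation cannot be reproduced by setting print options *alone*.
To get close with np.printoptions, set edgeitems, which controls how many items are shown at the start and end of each dimension.
For example: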

>>> with np.printoptions(edgeitems=1):
...    print(np.random.randn(50, 50, 50))

[[[ 0.16888291 ... -0.35992041]
  ...
  [-0.07732171 ...  0.33903697]]

 ...

 [[-0.53985808 ...  1.55376801]
  ...
  [ 0.06535131 ... -0.00767678]]]

>>> with np.printoptions(edgeitems=2):
...    print(np.random.randn(50, 50, 50))

[[[ 0.99024987  1.33767117 ...  1.1933827   0.64425047]
  [ 0.74387108 -0.24986084 ... -1.12225516 -0.31704093]
  ...
  [ 1.1628716   0.1387328  ... -2.40458115  0.56666761]
  [ 0.52462295  1.11381941 ... -0.72229662  0.27599356]]

 [[ 0.56495444 -0.59955944 ...  0.18377487 -0.71647396]
  [ 1.10136148 -0.16406338 ... -0.18504942 -0.11149548]
  ...
  [ 0.55318837  0.72525215 ...  0.67091088  0.9966966 ]
  [ 1.79004676  0.01862477 ... -1.61361466 -0.95263891]]

 ...

 [[ 0.14693228  0.01678601 ...  1.10225895 -0.37959402]
  [-4.39970535 -0.31171844 ...  1.76869713 -1.64067533]
  ...
  [ 1.45012138 -1.68056634 ... -1.39608527  1.45411732]
  [-0.87913625  0.25986135 ... -1.37739491 -1.54259549]]

 [[ 0.3165832   1.05780335 ...  0.11364661 -1.67169579]
  [ 3.95507723 -1.23626597 ... -0.0894148   0.67999621]
  ...
  [-1.18570837 -1.02207284 ... -0.28707352 -0.52404522]
  [-0.76953579  0.62183723 ... -0.5335504  -1.14544502]]]

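This still gets messy quickly for higher-dimensional arrays, but it is at least an incremental improvement. We can improve on it by using re.sub to strip some of the inner lines after the fact. To remove the inner "blocks" along the first dimension (everything after block 1 and before block N), the following pattern works: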

>>> import re
>>> pattern = re.compile("\n\n.+\n\n", flags=re.S)
>>> with np.printoptions(edgeitems=2):
...    arr_str = str(np.random.randn(50, 50, 50))
...    print(pattern.sub("\n[...]\n", arr_str))

[[[ 0.54732654  2.11736086 ...  0.345935   -0.03855715]
  [ 0.43733732  0.730437   ...  1.69832943  0.69454482]
  ...
  [ 0.14856728 -1.43059916 ... -0.9364902   1.38182198]
  [-0.36910186  1.99054811 ... -1.19901682 -0.10087023]]
[...]
 [[-2.28694583  0.10067145 ...  0.47628708 -0.04448207]
  [-0.81220052 -1.96778607 ...  0.48162966  0.17146554]
  ...
  [-1.44967264 -1.06812799 ...  1.26674746  0.95295582]
  [ 0.71260197 -0.02449307 ...  0.30994267 -1.54361915]]]


Alternatively, to remove everything between the first and the last ... line along the (N-1)th dimension, you can use

>>> pattern = re.compile(r"\n\s+\.{3}.+\.{3}\n", flags=re.S)
>>> with np.printoptions(edgeitems=2):
...    arr_str = str(np.random.randn(50, 50, 50))
...    print(pattern.sub("\n[...]\n", arr_str))

[[[ 1.31481194  0.00205681 ...  1.49923888  0.44396607]
  [-0.65932235 -1.98175913 ...  1.18866652  1.53928303]
[...]
  [-0.25260129 -0.62445413 ... -0.92797904 -0.38465579]
  [ 1.83589279  1.32519424 ...  1.36038647  0.59325272]]]


This generalizes well to higher-dimensional arrays with more edge items:

>>> with np.printoptions(edgeitems=4, linewidth=150):
...     arr_str = str(np.random.randn(10, 10, 10, 10))
...     print(pattern.sub("\n[...]\n", arr_str))

[[[[ 2.84367030e-01 -2.68861749e+00 -1.81613662e-01 -1.58049185e+00 ... -1.66046613e+00  1.76374397e-01  2.09184653e-01  6.55774750e-01]
   [ 1.12658231e+00 -4.39641554e-02 -6.19825170e-02  1.39851803e-01 ... -1.28005014e+00 -9.58511029e-02 -9.07004418e-01  1.50398097e+00]
   [-6.46860690e-01  1.20651042e+00 -8.94439256e-01  9.37924901e-01 ... -4.48419465e-01  8.14743932e-01  9.43309755e-02  8.98070854e-01]
   [-1.80455076e+00  1.35381386e+00  4.60792911e-01 -1.12388585e+00 ...  1.02440416e+00  6.06096973e-02 -1.09055043e+00  2.41618204e+00]
[...]
   [ 1.08294706e+00  3.25499559e-01  1.86171711e-01  1.36320122e+00 ...  6.32899882e-01 -8.67114894e-01 -7.83615859e-01 -1.93786133e+00]
   [ 1.21593920e+00  9.65084801e-01  1.89380927e+00 -8.30346195e-01 ...  8.37714070e-01  2.51154919e+00 -1.90444752e+00  3.53478260e-01]
   [-7.08397850e-02  6.49501524e-01  3.41776783e-01  7.45236588e-01 ... -6.61968529e-01 -3.76863845e-01  1.63689205e-01  3.55405530e-01]
   [-1.71393756e+00 -5.38834862e-01  8.85792032e-01  2.04801373e-01 ...  6.38890159e-02  1.19260207e+00  2.37645246e+00 -1.16429928e+00]]]]


Or, if you only want to show a few items at the very start and very end of the array, you can flatten it before displaying it:

>>> with np.printoptions(edgeitems=3):
...    print(np.random.randn(50, 50, 50).reshape(-1))

[ 1.30726326  0.38948658 -0.69750772 ...  1.26024398 -0.00820665
 -0.07553464]


This gives up the contextual hints about the array's shape, but allows for more concise output.
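One way to get the shape context back is to print the shape and dtype next to the flattened preview, which also comes close to the MATLAB-style summary from the question. The following is only a minimal sketch: `summarize` is a hypothetical helper name, not a NumPy function, and the `<shape dtype>` prefix format is an arbitrary choice.

```python
import numpy as np


def summarize(arr, edgeitems=3):
    """Hypothetical helper: one-line summary with shape, dtype and a flat preview."""
    arr = np.asarray(arr)
    # Render the shape MATLAB-style, e.g. "50x50x50"; 0-d arrays have an empty shape.
    shape = "x".join(str(n) for n in arr.shape) or "scalar"
    # A low threshold forces summarization; a large linewidth keeps it on one line.
    with np.printoptions(edgeitems=edgeitems, threshold=2 * edgeitems, linewidth=10_000):
        preview = str(arr.reshape(-1))
    return f"<{shape} {arr.dtype}> {preview}"


data = {n: np.random.randn(50, 50, 50) for n in "abcd"}
for name, arr in data.items():
    print(f"{name!r}: {summarize(arr, edgeitems=2)}")
```

Each dictionary entry then takes a single line, at the cost of showing only the first and last few values of the flattened array.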

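A mess-proof variant (probably): at the possible expense of aesthetics, we can guarantee that no more than 5 lines are printed, as follows:

import re
import numpy as np

pattern = re.compile(r"\n\s+\.{3}.+\.{3}\n", flags=re.S)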
for shape in [(50, 50, 50), (2, 3, 50), (2, 50, 3), (50, 2, 3)]:
    with np.printoptions(edgeitems=2, threshold=1):
        arr_str = str(np.random.randn(*shape))
        s = pattern.sub("\n[...]\n", arr_str)
        sl = s.splitlines()
        if len(sl) > 5:
            s = '\n'.join(sl[:2]) + '\n[...]\n' + '\n'.join(sl[-2:])
        print(s)
    print()
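The loop above can be folded into a reusable function. This is a sketch rather than a definitive implementation: `short_repr` and its `max_lines` parameter are hypothetical names introduced here, and the line cap relies on the same head/tail fallback as the loop.

```python
import re

import numpy as np

# Same pattern as in the loop above, written as a raw string.
_INNER = re.compile(r"\n\s+\.{3}.+\.{3}\n", flags=re.S)


def short_repr(arr, max_lines=5, edgeitems=2):
    """Hypothetical helper: array string with at most max_lines lines."""
    if max_lines < 3:
        raise ValueError("max_lines must be at least 3 (head, marker, tail)")
    with np.printoptions(edgeitems=edgeitems, threshold=1):
        s = _INNER.sub("\n[...]\n", str(np.asarray(arr)))
    lines = s.splitlines()
    if len(lines) > max_lines:
        # Fallback: keep the same number of lines at each end around a marker.
        keep = (max_lines - 1) // 2
        lines = lines[:keep] + ["[...]"] + lines[-keep:]
    return "\n".join(lines)


for shape in [(50, 50, 50), (2, 3, 50), (2, 50, 3), (50, 2, 3)]:
    print(short_repr(np.random.randn(*shape)))
    print()
```

The fallback cuts by line count only, so for some shapes the kept lines may not line up with array boundaries; that is the aesthetic cost mentioned above.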

Answer #2 (vwoqyblh)

Combine numpy.array2string, called with minimal edgeitems and threshold arguments, with textwrap.wrap to get adjustable wrapping:

import textwrap

import numpy as np

arr = np.random.randn(50, 50, 50)
arr_str = np.array2string(arr, edgeitems=1, threshold=1)
print('\n'.join(textwrap.wrap(arr_str, width=70)))

Sample output:

[[[-1.48182115 ...  1.68781245]   ...   [-0.89012522 ... -0.13382862]]
...   [[ 1.34977298 ...  0.07242849]   ...   [-0.09114062 ...
-0.43768213]]]
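The question also asks about capping the output at a number of characters. One possible sketch builds on the same array2string call and uses the standard library's textwrap.shorten, which collapses whitespace and drops whole words from the end until the text fits. `char_limited` and `max_chars` are hypothetical names, and note that this truncates only at the tail, so closing brackets may be lost.

```python
import textwrap

import numpy as np


def char_limited(arr, max_chars=80):
    """Hypothetical helper: single-line array string of at most max_chars characters."""
    s = np.array2string(np.asarray(arr), edgeitems=1, threshold=1)
    # shorten() collapses runs of whitespace (including newlines) into single
    # spaces, then removes trailing words so that text plus placeholder fits.
    return textwrap.shorten(s, width=max_chars, placeholder=" ...")


print(char_limited(np.random.randn(50, 50, 50), max_chars=60))
```

Because words are dropped whole, no number is cut in the middle, unlike slicing the string by character index.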
