Python numpy append changes int to float and adds zeros

Posted by qv7cva1a on 2023-02-18 in Python

I need to expand ranges into consecutive numbers. The ranges are ints, and the result should be ints as well. This is what I have so far:

import numpy as np

mydata = np.array([
    [49123400, 49123499],
    [33554333, 33554337]])

numbers_list = np.empty((0))
base_dir = "/foo.csv"

for x in mydata:
    numbers = np.arange(x[0], x[1]+1)
    numbers_list = np.append(numbers_list, numbers, axis=0)
np.savetxt(base_dir, numbers_list, delimiter=";")

What I would like to see is a list like this:

49123400,
49123401,
49123402,...
49123499,
33554333,
33554334,...
33554399

But what I get is:

4.912340000000000000e+11 and so on...

Where did I go wrong? Why does the int change to a float when I append?

cdmah0mi #1

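An important lesson here is that you should always pick the right data structure for your problem. If you want to append or concatenate, NumPy is in most cases the wrong choice, unless you can allocate the final array up front (with its final shape) and fill it by assigning to slices.
In this case the obvious choice is a regular Python list together with range: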

mydata = [[49123400, 49123499],
          [33554333, 33554337]]

mynewdata = []
for sublist in mydata:
    mynewdata.extend(range(sublist[0], sublist[1]+1))

>>> mynewdata
  [49123400, 49123401, 49123402, 49123403, 49123404, 49123405,
   49123406, 49123407, 49123408, 49123409, 49123410, 49123411,
   49123412, 49123413, 49123414, 49123415, 49123416, 49123417,
   49123418, 49123419, 49123420, 49123421, 49123422, 49123423,
   49123424, 49123425, 49123426, 49123427, 49123428, 49123429,
   49123430, 49123431, 49123432, 49123433, 49123434, 49123435,
   49123436, 49123437, 49123438, 49123439, 49123440, 49123441,
   49123442, 49123443, 49123444, 49123445, 49123446, 49123447,
   49123448, 49123449, 49123450, 49123451, 49123452, 49123453,
   49123454, 49123455, 49123456, 49123457, 49123458, 49123459,
   49123460, 49123461, 49123462, 49123463, 49123464, 49123465,
   49123466, 49123467, 49123468, 49123469, 49123470, 49123471,
   49123472, 49123473, 49123474, 49123475, 49123476, 49123477,
   49123478, 49123479, 49123480, 49123481, 49123482, 49123483,
   49123484, 49123485, 49123486, 49123487, 49123488, 49123489,
   49123490, 49123491, 49123492, 49123493, 49123494, 49123495,
   49123496, 49123497, 49123498, 49123499, 33554333, 33554334,
   33554335, 33554336, 33554337]
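
The preallocation alternative mentioned at the start of this answer is not shown in the original. A minimal sketch of it, assuming the ranges are inclusive as in the question (the names lengths, result and offset are made up for illustration):

```python
import numpy as np

mydata = [[49123400, 49123499],
          [33554333, 33554337]]

# Each range is inclusive, so it holds stop - start + 1 numbers.
lengths = [stop - start + 1 for start, stop in mydata]

# Allocate the result once, with its final length and an integer dtype.
result = np.empty(sum(lengths), dtype=np.int64)

# Fill one slice per range instead of appending.
offset = 0
for (start, stop), n in zip(mydata, lengths):
    result[offset:offset + n] = np.arange(start, stop + 1)
    offset += n
```

Because the array is created with an explicit integer dtype and never reallocated, there is no float conversion and no repeated copying.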

The mynewdata list can easily be converted to a numpy.array:

>>> np.array(mynewdata)
array([49123400, 49123401, 49123402, 49123403, 49123404, 49123405,
       49123406, 49123407, 49123408, 49123409, 49123410, 49123411,
       49123412, 49123413, 49123414, 49123415, 49123416, 49123417,
       49123418, 49123419, 49123420, 49123421, 49123422, 49123423,
       49123424, 49123425, 49123426, 49123427, 49123428, 49123429,
       49123430, 49123431, 49123432, 49123433, 49123434, 49123435,
       49123436, 49123437, 49123438, 49123439, 49123440, 49123441,
       49123442, 49123443, 49123444, 49123445, 49123446, 49123447,
       49123448, 49123449, 49123450, 49123451, 49123452, 49123453,
       49123454, 49123455, 49123456, 49123457, 49123458, 49123459,
       49123460, 49123461, 49123462, 49123463, 49123464, 49123465,
       49123466, 49123467, 49123468, 49123469, 49123470, 49123471,
       49123472, 49123473, 49123474, 49123475, 49123476, 49123477,
       49123478, 49123479, 49123480, 49123481, 49123482, 49123483,
       49123484, 49123485, 49123486, 49123487, 49123488, 49123489,
       49123490, 49123491, 49123492, 49123493, 49123494, 49123495,
       49123496, 49123497, 49123498, 49123499, 33554333, 33554334,
       33554335, 33554336, 33554337])

Or write it straight to a file without going through an array at all:

with open('yourfile', 'w') as file:
    file.write(str(mynewdata).replace(',', ';'))
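
Note that str() on a list keeps the surrounding brackets and the spaces, so the file written above starts with "[" and holds everything on one line. If one number per line is wanted, as in the question, a possible variant looks like this (the shortened data and the temporary output path are assumptions made so the sketch is safe to run):

```python
import os
import tempfile

# Shortened sample data, standing in for the full mynewdata list.
mynewdata = list(range(49123400, 49123403)) + list(range(33554333, 33554336))

# Hypothetical output location inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "numbers.csv")

with open(out_path, "w") as f:
    # One integer per line; str() on an int never adds a decimal point.
    f.write("\n".join(str(n) for n in mynewdata) + "\n")

# Read the file back to inspect what was written.
with open(out_path) as f:
    lines = f.read().splitlines()
```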

Finally, a note on why the integers were converted to floats:

>>> np.empty((0))
array([], dtype=float64)

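Without an explicit dtype, np.empty creates a float64 array, so appending integers to it with append/concatenate produces a float array. If you need an integer array, use np.empty(0, int):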

>>> np.empty(0, int)
array([], dtype=int64)
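
Applied to the loop from the question, that fix could look like the following sketch. Reusing mydata.dtype for the empty array is a choice made here, not something the question prescribes:

```python
import numpy as np

mydata = np.array([[49123400, 49123499],
                   [33554333, 33554337]])

# Start from an empty *integer* array so no float promotion happens.
numbers_list = np.empty(0, dtype=mydata.dtype)

for start, stop in mydata:
    # Inclusive range, hence stop + 1.
    numbers_list = np.append(numbers_list, np.arange(start, stop + 1))
```

This keeps the integer dtype, although every np.append call still copies the whole array.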

bqf10yzr #2

In a case like this it helps to step through the code in an interactive session and look at the shape and dtype at every step.

In [254]: mydata = np.array( [
     ...: [49123400, 49123499],
     ...: [33554333, 33554337]])
In [255]: mydata
Out[255]: 
array([[49123400, 49123499],
       [33554333, 33554337]])
In [256]: mydata.shape
Out[256]: (2, 2)
In [257]: mydata.dtype
Out[257]: dtype('int32')
In [258]: numbers_list = np.empty((0))
In [259]: numbers_list
Out[259]: array([], dtype=float64)

Note that numbers_list is a float array. Consider passing a dtype to empty.

In [260]: x=mydata[0]
In [261]: numbers = np.arange(x[0],x[1]+1)
In [262]: numbers.dtype
Out[262]: dtype('int32')
In [263]: numbers.shape
Out[263]: (100,)
In [264]: numbers_list = np.append(numbers_list, numbers, axis=0)
In [265]: numbers_list.shape
Out[265]: (100,)
In [266]: numbers_list.dtype
Out[266]: dtype('float64')

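After joining the two arrays, the result has the dtype of numbers_list, float64, because NumPy promotes the int32 and float64 inputs to a common dtype.
So giving empty an integer dtype should preserve the int dtype.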
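
The dtype change can be checked directly with np.result_type, which reports the common dtype NumPy picks for a set of inputs. A small sketch, with variable names chosen only for illustration:

```python
import numpy as np

floats = np.empty(0)                   # float64 by default
ints = np.arange(49123400, 49123403)   # integer dtype

# The common dtype NumPy picks for a float64/integer mix.
promoted = np.result_type(floats, ints)

# np.append joins the two arrays, so the result uses that common dtype.
mixed = np.append(floats, ints)
```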
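I keep arguing against np.append, and this is one more example of its misuse. It is just a form of np.concatenate, and usually a poor substitute for list append.
I suggest building a list and calling concatenate once: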

In [267]: numbers_list = [np.arange(x[0],x[1]+1) for x in mydata]
In [268]: len(numbers_list)
Out[268]: 2
In [269]: np.concatenate(numbers_list)
Out[269]: 
array([49123400, 49123401, 49123402, 49123403, 49123404, 49123405,
       49123406, 49123407, 49123408, 49123409, 49123410, 49123411,
       49123412, 49123413, 49123414, 49123415, 49123416, 49123417,
       49123418, 49123419, 49123420, 49123421, 49123422, 49123423,
       49123424, 49123425, 49123426, 49123427, 49123428, 49123429,
  ...
       49123496, 49123497, 49123498, 49123499, 33554333, 33554334,
       33554335, 33554336, 33554337])
In [270]: _.shape
Out[270]: (105,)

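Since you write the numbers with savetxt, look at its fmt parameter; the default is '%.18e', which is scientific notation.
With the right fmt you get integers: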

In [272]: arr=np.concatenate(numbers_list)
In [273]: np.savetxt('test.txt',arr,fmt='%d',delimiter=',')
In [274]: cat test.txt
49123400
49123401
49123402
49123403
49123404
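
Putting the pieces of this answer together, a complete version of the script from the question might look like this. It is only a sketch: the temporary output path is an assumption, used instead of "/foo.csv" so it can run without special permissions.

```python
import os
import tempfile

import numpy as np

mydata = np.array([[49123400, 49123499],
                   [33554333, 33554337]])

# Build all pieces first, then join them with a single concatenate.
arr = np.concatenate([np.arange(start, stop + 1) for start, stop in mydata])

# Hypothetical output path inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "foo.csv")

# fmt="%d" writes plain integers instead of the default "%.18e".
np.savetxt(out_path, arr, fmt="%d")

# Read the file back to inspect what was written.
with open(out_path) as f:
    lines = f.read().splitlines()
```

For a 1-D array savetxt writes one value per line, so the delimiter argument has no visible effect here.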

yhqotfr8 #3

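I ran into the same problem when appending columns to a NumPy array. I used np.arange() to create a sample array with a single column and then appended columns to it, but the data got garbled, as you can see: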

[[  0.00000000e+00  -1.56000000e+00]
[  1.00000000e+00   2.43000000e+00]
[  2.00000000e+00  -9.40000000e-01]
..., 
[  4.99700000e+03  -1.99000000e+00]
[  4.99800000e+03   4.10000000e-01]
[  4.99900000e+03  -7.00000000e-02]]

Making sure the dtypes were equal did not fix it, but using np.zeros() instead of np.arange() finally did.
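
This answer does not include its code, so the following is only a guess at the situation, with made-up sample values. It assumes an integer index column was combined with a float column. An ndarray has a single dtype, so the combined array becomes float64, and the scientific notation is then purely a display matter that np.printoptions can change:

```python
import numpy as np

index = np.arange(3)                      # integer column
values = np.array([-1.56, 2.43, -0.94])   # float column (sample values)

# An ndarray has a single dtype, so the stacked result is float64.
table = np.column_stack((index, values))

# suppress=True asks NumPy to avoid scientific notation when printing.
with np.printoptions(suppress=True):
    text = str(table)
```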
