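I have two datasets that I want to concatenate. Each contains two arrays: a 2D *intensity* array (dimensions = time × wavelength) and an unrelated 1D array of *channel names*. The channel names are identical in both datasets. When I concatenate along the time dimension, an extra time dimension is added to the channel-name array. That makes some sense and is mentioned in the documentation, but I want the channel names to stay unchanged in the result. How can I avoid this extra dimension?
The example below demonstrates what I want.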
import numpy as np
import xarray as xr

DIM_TIME = "time"
DIM_CHANNEL_NR = 'channel_number'
DIM_WAVELENGTH = "wavelength"


def create_ds(intensity, time_local):
    # Coordinate arrays shared by every dataset
    wavelength = np.linspace(700.0, 800.0, 8)
    da_wav = xr.DataArray(wavelength, dims=[DIM_WAVELENGTH])
    da_time = xr.DataArray(time_local, dims=[DIM_TIME])
    da_chan_nr = xr.DataArray(np.array([1, 2]), dims=[DIM_CHANNEL_NR])

    # 2D intensity array: time x wavelength
    da_intensity = xr.DataArray(
        intensity, name='intensity',
        dims=[DIM_TIME, DIM_WAVELENGTH],
        coords={DIM_TIME: da_time, DIM_WAVELENGTH: da_wav})

    # 1D channel names, unrelated to time
    da_chan_name = xr.DataArray(
        data=np.array(['UV', 'VIS']),
        name='chan_name',
        dims=[DIM_CHANNEL_NR])

    ds = xr.Dataset(
        data_vars={da_intensity.name: da_intensity},
        coords={
            DIM_TIME: da_time,
            DIM_WAVELENGTH: da_wav,
            DIM_CHANNEL_NR: da_chan_nr})
    ds[da_chan_name.name] = da_chan_name
    return ds


def main():
    ds1 = create_ds(
        intensity=np.arange(24).reshape((3, 8)),
        time_local=np.array([1e17, 2e17, 3e17]).astype('datetime64[ns]'))
    ds2 = create_ds(
        intensity=np.arange(24).reshape((3, 8)) + 24,
        time_local=np.array([4e17, 5e17, 6e17]).astype('datetime64[ns]'))
    print("---- concat ----\n{}\n".format(xr.concat([ds1, ds2], dim=DIM_TIME)))
    print("---- merged ----\n{}\n".format(xr.merge([ds1, ds2])))


if __name__ == "__main__":
    main()
When I run this program, the concatenated dataset looks like this:
---- concat ----
<xarray.Dataset>
Dimensions:         (time: 6, wavelength: 8, channel_number: 2)
Coordinates:
  * time            (time) datetime64[ns] 1973-03-03T09:46:40 ... 1989-01-05T...
  * wavelength      (wavelength) float64 700.0 714.3 728.6 ... 771.4 785.7 800.0
  * channel_number  (channel_number) int32 1 2
Data variables:
    intensity       (time, wavelength) int32 0 1 2 3 4 5 6 ... 42 43 44 45 46 47
    chan_name       (time, channel_number) <U3 'UV' 'VIS' 'UV' ... 'UV' 'VIS'
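As you can see, the `chan_name` array is now two-dimensional; the `time` dimension has been prepended.
When I merge the datasets, the result is exactly what I want: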
---- merged ----
<xarray.Dataset>
Dimensions:         (time: 6, wavelength: 8, channel_number: 2)
Coordinates:
  * time            (time) datetime64[ns] 1973-03-03T09:46:40 ... 1989-01-05T...
  * wavelength      (wavelength) float64 700.0 714.3 728.6 ... 771.4 785.7 800.0
  * channel_number  (channel_number) int32 1 2
Data variables:
    intensity       (time, wavelength) float64 0.0 1.0 2.0 ... 45.0 46.0 47.0
    chan_name       (channel_number) <U3 'UV' 'VIS'
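Here the `chan_name` array is the same as in the original datasets: a one-dimensional array.
Unfortunately, `xr.merge` is much slower than `xr.concat`. Is there a way to concatenate while leaving the unrelated arrays untouched?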
1 Answer
Try this:
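One option that fits the question is the `data_vars` argument of `xr.concat`. The sketch below is an illustration and not necessarily the code this answer originally contained. It assumes that `data_vars="minimal"` concatenates only the variables that already have the concatenation dimension, so `chan_name` is carried over unchanged. The helper `make_ds` and the dates are hypothetical stand-ins for the question's `create_ds` and its timestamps.

```python
import numpy as np
import xarray as xr


def make_ds(offset, times):
    """Hypothetical, shortened stand-in for the question's create_ds."""
    return xr.Dataset(
        data_vars={
            # 2D intensity array: time x wavelength
            "intensity": (
                ("time", "wavelength"),
                np.arange(24).reshape(3, 8) + offset,
            ),
            # 1D channel names, unrelated to time
            "chan_name": (("channel_number",), np.array(["UV", "VIS"])),
        },
        coords={
            "time": np.array(times, dtype="datetime64[ns]"),
            "wavelength": np.linspace(700.0, 800.0, 8),
            "channel_number": [1, 2],
        },
    )


ds1 = make_ds(0, ["2000-01-01", "2000-01-02", "2000-01-03"])
ds2 = make_ds(24, ["2000-01-04", "2000-01-05", "2000-01-06"])

# data_vars="minimal": only variables that already contain the "time"
# dimension are concatenated; chan_name is left one-dimensional.
combined = xr.concat([ds1, ds2], dim="time", data_vars="minimal")

print(combined["intensity"].dims)
print(combined["chan_name"].dims)
```

With the datasets from the question, the equivalent call would be `xr.concat([ds1, ds2], dim=DIM_TIME, data_vars="minimal")`. Variables that are not concatenated are expected to match across the inputs, which holds here because both datasets carry the same channel names.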