Adding 0 for missing data rather than excluding the category in MATLAB

Posted by pobjuy32 on 2023-01-31 in Matlab


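I have the two data tables below, one called data1 and the other called data2. The left-hand column is a categorical variable and the right-hand column is a frequency. I want to rewrite these tables so that if a category is missing from the left-hand column, the missing category is inserted in the correct place with a "0" in the right-hand frequency column.

To make this clearer, here is an example. In data1, the categories 8, 12 and 13 are missing from the left-hand column. I want MATLAB to recreate the table with a value of 0 for 8, 12 and 13, so that it looks like the following. I also want it to have extra empty categories after "14", because data2 is longer and has more categories. I have also shown what data2 should look like after the values are filled in.

I have several datasets. They usually all start with 1, 2, 3, 4, 5 and so on, but the categories in the left-hand column differ slightly, because wherever a value is missing the category is simply omitted instead of being given a 0. How can I write code that automatically fills any gaps with 0? Ideally the code would identify the highest category number across all datasets and fill in the gaps based on that.
My goal is to build a grouped bar chart from data series that all have the same length.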

matlab

Source: https://stackoverflow.com/questions/75256960/adding-0-for-missing-data-rather-than-excluding-the-category-in-matlab

2 Answers


qzwqbdag1#

You can convert the datasets to tables and then use outerjoin. After that, you can use fillmissing to replace the NaN values with whatever you want.

table1 = array2table(data1);
table1.Properties.VariableNames = {'A', 'B'};
table2 = array2table(data2);
table2.Properties.VariableNames = {'A', 'B'};

newTable = outerjoin(table1, table2, 'LeftKeys', {'A'}, 'RightKeys', {'A'}, 'MergeKeys', true)

This produces:

A     B_table1    B_table2
__    ________    ________

 1      170         240   
 2      120         200   
 3      100         180   
 4       40          60   
 5       30          50   
 6       20          40   
 7       10          30   
 8      NaN          20   
 9        8           8   
10        2           2   
11        1         NaN   
12      NaN           1   
14        1         NaN   
19      NaN           1

Then use newTable2 = fillmissing(newTable, 'constant', 0) to get zeros, which outputs:

A     B_table1    B_table2
__    ________    ________

 1      170         240   
 2      120         200   
 3      100         180   
 4       40          60   
 5       30          50   
 6       20          40   
 7       10          30   
 8        0          20   
 9        8           8   
10        2           2   
11        1           0   
12        0           1   
14        1           0   
19        0           1
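As a self-contained illustration of the two steps above, here is a minimal sketch. The arrays dataA and dataB are hypothetical sample data, not the data1 and data2 from the question, and the 'Keys' shorthand is used instead of separate 'LeftKeys' and 'RightKeys' because both tables share the key name.

```matlab
% Hypothetical sample data: column 1 = category, column 2 = frequency
dataA = [1 10; 2 5; 4 3];
dataB = [1 7; 3 2; 4 1; 5 6];

% Convert to tables with matching variable names
tA = array2table(dataA, 'VariableNames', {'A', 'B'});
tB = array2table(dataB, 'VariableNames', {'A', 'B'});

% Outer join on the category column; unmatched rows get NaN
joined = outerjoin(tA, tB, 'Keys', 'A', 'MergeKeys', true);

% Replace every NaN with 0
filled = fillmissing(joined, 'constant', 0);
disp(filled)
```

Note that the join only contains categories that occur in at least one of the inputs, so a category missing from every dataset still does not appear.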

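Update

To merge several tables, you can nest outerjoin calls or write a function that loops over them (see this similar Matlab forum question).
Given data1 and data2 from the original post, plus a new data3:

Contents of myscript.m: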

table1 = MakeTable(data1);
table2 = MakeTable(data2);
table3 = MakeTable(data3);

AllJoins = MultiOuterJoin(table1, table2, table3);

% Functions

function Table = MakeTable(Array)
    Table = array2table(Array);
    Table.Properties.VariableNames = {'A', 'B'}; % set your column names, e.g. {'freq', 'count'}
end

function Joined = MultiOuterJoin(varargin)
    Joined = varargin{1};
    Joined.Properties.VariableNames{end} = inputname(1); % set #2 column name to be based on table name
    for k = 2:nargin
      Joined = outerjoin(Joined, varargin{k}, 'LeftKeys', {'A'}, 'RightKeys', {'A'}, 'MergeKeys', true);
      name = inputname(k);
      Joined.Properties.VariableNames{end} = name; % set merged column name to be based on table name
    end
end

This returns AllJoins:

A     table1    table2    table3
__    ______    ______    ______

 1     170       240       2400 
 2     120       200       2000 
 3     100       180        NaN 
 4      40        60        NaN 
 5      30        50        NaN 
 6      20        40        NaN 
 7      10        30        NaN 
 8       0        20        NaN 
 9       8         8        NaN 
10       2         2        NaN 
11       1         0        NaN 
12       0         1        NaN 
13       0         0        NaN 
14       1         0        NaN 
15       0         0        NaN 
16       0         0        NaN 
17       0         0        NaN 
18       0         0        NaN 
19       0         1        NaN 
20     NaN       NaN       1800
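To get from a joined table like this to the grouped bar chart the question asks for, one possible sketch is shown below. The small AllJoins table built here is a hypothetical stand-in with the same layout (key column A, then one column per dataset); its values are made up for illustration.

```matlab
% Hypothetical stand-in for AllJoins: key column A, one column per dataset
AllJoins = table([1; 2; 3], [170; 120; NaN], [240; NaN; 180], ...
    'VariableNames', {'A', 'table1', 'table2'});

% Replace any remaining NaN with 0
AllFilled = fillmissing(AllJoins, 'constant', 0);

% Rows are categories, columns are datasets
counts = AllFilled{:, 2:end};

% One group of bars per category, one bar per dataset
bar(AllFilled.A, counts, 'grouped');
legend(AllFilled.Properties.VariableNames(2:end));
xlabel('Category');
ylabel('Frequency');
```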


8i9zcol22#

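This is a generic answer, and you can change the maximum length of the array as you like. Here the maximum length is max(data1(:,1)), but you can compute it any way you want, for example as the maximum across several arrays.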

% make new data: one row per category, two columns (category, frequency)
new_data = zeros(max(data1(:,1)), 2);
new_data(:,1) = 1:max(data1(:,1));

% Fill data. You can do this in a loop if it's easier for you to understand.
% In essence: at the row indices given by data1(:,1), put data1(:,2) into new_data's second column.
new_data(data1(:,1),2) = data1(:,2);
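The same indexing idea can be extended to several datasets that share one common length. The following is a sketch under stated assumptions: dataA and dataB are hypothetical sample arrays, and the categories are assumed to be positive integers with no duplicates within a dataset.

```matlab
% Hypothetical sample data: column 1 = category, column 2 = frequency
dataA = [1 10; 2 5; 4 3];
dataB = [1 7; 3 2; 6 1];
datasets = {dataA, dataB};

% Highest category number found in any dataset
maxCat = max(cellfun(@(d) max(d(:, 1)), datasets));

% One row per category, one column per dataset, zeros by default
counts = zeros(maxCat, numel(datasets));
for k = 1:numel(datasets)
    d = datasets{k};
    counts(d(:, 1), k) = d(:, 2);   % categories are used as row indices
end

disp([(1:maxCat)' counts])
```

Because every column of counts has maxCat rows, the matrix can be passed to bar(1:maxCat, counts, 'grouped') directly.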
