Loading a MATLAB table in Python with scipy.io.loadmat

ktecyv1j  posted on 2023-10-23 in Matlab

Is it possible to load a MATLAB table in Python using scipy.io.loadmat?
What I'm doing:
In MATLAB:

tab = table((1:500)')
save('tab.mat', 'tab')

In Python:

import scipy.io
mat = scipy.io.loadmat('m:/tab.mat')

But I can't access the table tab in Python using mat['tab'].

fdbelqdn  #1

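The answer to your question is no. Many MATLAB objects can be loaded in Python, but tables are among those that cannot. See "Handle Data Returned from MATLAB to Python" in the MATLAB documentation.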
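
To see what loadmat actually returned, it can help to list the variables in the resulting dictionary. The helper below is a minimal sketch; summarize_mat is a hypothetical name, not part of SciPy, and how a table shows up (missing, or as an opaque object) is an assumption that may vary with the SciPy version.

```python
def summarize_mat(mat_dict):
    """Return {name: (type name, shape or None)} for non-metadata entries."""
    summary = {}
    for name, value in mat_dict.items():
        # loadmat adds metadata keys such as __header__, __version__, __globals__
        if name.startswith("__") and name.endswith("__"):
            continue
        summary[name] = (type(value).__name__, getattr(value, "shape", None))
    return summary


# Usage (assumes SciPy is installed and 'tab.mat' exists):
# import scipy.io
# print(summarize_mat(scipy.io.loadmat('tab.mat')))
```

If 'tab' is absent from the summary, or is listed without a usable shape, the table was not decoded into a regular array.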

w1e3prcc  #2

The loadmat function does not load MATLAB tables. Instead, a small workaround can be used: save the table as a .csv file in MATLAB, then read it in Python with pandas.

In MATLAB:

writetable(table_name, file_name)

In Python:

import pandas as pd
df = pd.read_csv(file_name)

The DataFrame df will then contain the contents of table_name.
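
As a rough illustration of the Python side, the sketch below reads a small in-memory stand-in for the CSV instead of a real file. The header name Var1 is an assumption about the default variable name MATLAB gives to table((1:500)'); check the first line of your own file.

```python
import io

import pandas as pd

# Stand-in for the first lines of a file written by writetable(tab, 'tab.csv').
# The column name Var1 is assumed, not taken from the original post.
csv_text = "Var1\n1\n2\n3\n"

df = pd.read_csv(io.StringIO(csv_text))
print(df.dtypes)  # column types inferred by pandas
print(df.head())  # first rows of the table
```

With a real file, pass its path to pd.read_csv in place of the StringIO object.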

qyuhtwio  #3

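I looked into this for a project I'm working on. As a workaround, you could try the following.
In MATLAB, first convert the @table object into a struct and retrieve the column names using: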

table_struct = struct(table_object);
table_columns = table_struct.varDim.labels;
save table_as_struct table_struct table_columns;

Then you can try the following code in Python:

import numpy
import pandas as pd
import scipy.io

# function to load table variable from MAT-file
def loadtablefrommat(matfilename, tablevarname, columnnamesvarname):
    """
    read a struct-ified table variable (and column names) from a MAT-file
    and return pandas.DataFrame object.
    """

    # load file
    mat = scipy.io.loadmat(matfilename)

    # get table (struct) variable
    tvar = mat.get(tablevarname)
    data_desc = mat.get(columnnamesvarname)
    types = tvar.dtype
    fieldnames = types.names

    # extract data (from table struct)
    data = None
    for idx, fieldname in enumerate(fieldnames):
        if fieldname == 'data':
            data = tvar[0][0][idx]
            break
    if data is None:
        raise KeyError("no 'data' field found in table struct")

    # get number of columns and rows
    numcols = data.shape[1]
    numrows = data[0, 0].shape[0]

    # and get column headers as a list (array)
    data_cols = []
    for idx in range(numcols):
        data_cols.append(data_desc[0, idx][0])

    # create dict out of original table
    table_dict = {}
    for colidx in range(numcols):
        rowvals = []
        for rowidx in range(numrows):
            rowval = data[0,colidx][rowidx][0]
            if isinstance(rowval, numpy.ndarray) and rowval.size > 0:
                rowvals.append(rowval[0])
            else:
                rowvals.append(rowval)
        table_dict[data_cols[colidx]] = rowvals
    return pd.DataFrame(table_dict)
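
The index juggling above comes from how loadmat nests struct and cell contents by default. As a hedged alternative sketch, assume the MAT-file instead holds a scalar struct whose fields are the numeric columns (for example one produced in MATLAB with table2struct(tab, 'ToScalar', true), which is my assumption and not part of this answer). Then loadmat's simplify_cells option (SciPy 1.5 or later) returns plain dicts. The function name and the demo file below are hypothetical; savemat is used only to fake a file that MATLAB would normally write.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import scipy.io


def struct_columns_to_dataframe(matfilename, varname):
    """Load a scalar struct whose fields are equal-length numeric columns."""
    # simplify_cells=True returns MATLAB structs as plain Python dicts
    mat = scipy.io.loadmat(matfilename, simplify_cells=True)
    columns = mat[varname]
    # atleast_1d guards against one-row columns collapsing to scalars
    return pd.DataFrame({name: np.atleast_1d(col) for name, col in columns.items()})


# Stand-in for a file written by MATLAB: savemat stores a dict as a struct
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "tab_struct.mat")
    scipy.io.savemat(path, {"tab_struct": {"Var1": np.arange(1, 6).reshape(-1, 1)}})
    demo_df = struct_columns_to_dataframe(path, "tab_struct")

print(demo_df)
```

This sketch only covers numeric columns; text or cell columns would need extra handling.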

hzbexzde  #4

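Based on qyuhtwio's answer (#3), I propose a different variant that works well for me. I wrote a MATLAB script to prepare the m-file automatically (see my GitLab repository for examples). It does the following:

In MATLAB, for class table

Same as the example in answer #3, but with the data bundled together, which makes loading multiple variables easier. The names "table" and "columns" are mandatory for the next part.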

YourVariableName = struct('table', struct(TableYouWantToLoad), 'columns', {struct(TableYouWantToLoad).varDim.labels})
save('YourFileName', 'YourVariableName')

In MATLAB, for class dataset

Use this option if you have to handle the old dataset type.

YourVariableName = struct('table', struct(DatasetYouWantToLoad), 'columns', {get(DatasetYouWantToLoad,'VarNames')})
save('YourFileName', 'YourVariableName')

In Python

import pandas as pd
import scipy.io as sio

def load_table_from_struct(table_structure) -> pd.DataFrame:

    # get prepared data structure
    data = table_structure[0, 0]['table']['data']
    # get prepared column names
    data_cols = [name[0] for name in table_structure[0, 0]['columns'][0]]

    # create dict out of original table
    table_dict = {}
    for colidx, colname in enumerate(data_cols):
        table_dict[colname] = [val[0] for val in data[0, 0][0, colidx]]

    return pd.DataFrame(table_dict)

# the function must be defined before it is called
mdata = sio.loadmat('YourFileName')
mtable = load_table_from_struct(mdata['YourVariableName'])

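The function is independent of how the file is loaded, but it is basically a minimized version of the code in answer #3, so please give the credit to that post.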

kadbb459  #5

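As others have mentioned, this is currently not possible because MATLAB has not documented this file format. People are trying to reverse-engineer it, but that is a work in progress.
A workaround is to write the table to CSV and load that with Python. Entries in the table can be variable-length arrays, which get split across numbered columns. I wrote a short function that loads both scalars and arrays from such a CSV file.
Write the table to CSV in MATLAB: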

writetable(table_name, filename)

Read the CSV file in Python:

import pandas

def load_matlab_csv(filename):
    """Read CSV written by MATLAB writetable into DataFrames

    Each entry in the table can be a scalar or a variable length array.
    If it is a variable length array, then Matlab generates a set of
    columns, long enough to hold the longest array. These columns have
    the variable name with an index appended.

    This function infers which entries are scalars and which are arrays.
    Arrays are grouped together and sorted by their index.

    Returns: scalar_df, array_df
        scalar_df : DataFrame of scalar values from the table
        array_df : DataFrame with MultiIndex on columns
            The first level is the array name
            The second level is the index within that array
    """
    # Read the CSV file
    tdf = pandas.read_csv(filename)
    cols = list(tdf.columns)

    # Figure out which columns correspond to scalars and which to arrays
    scalar_cols = [] # scalar column names
    arr_cols = [] # array column names, without index
    arrname2idxs = {} # dict of array column name to list of integer indices
    arrname2colnames = {} # dict of array column name to list of full names

    # Iterate over columns
    for col in cols:
        # If the name ends in "_" followed by digits, it's probably
        # from an array
        colsplit = col.split('_')
        if len(colsplit) > 1 and colsplit[-1].isdigit():
            # Array col
            # Infer the array name and index
            arr_idx = int(colsplit[-1])
            arr_name = '_'.join(colsplit[:-1])

            # Store
            if arr_name in arrname2idxs:
                arrname2idxs[arr_name].append(arr_idx)
                arrname2colnames[arr_name].append(col)
            else:
                arrname2idxs[arr_name] = [arr_idx]
                arrname2colnames[arr_name] = [col]
                arr_cols.append(arr_name)

        else:
            # Scalar col
            scalar_cols.append(col)

    # Extract all scalar columns
    scalar_df = tdf[scalar_cols]

    # Extract each set of array columns into its own dataframe
    array_df_d = {}
    for arrname in arr_cols:
        adf = tdf[arrname2colnames[arrname]].copy()
        adf.columns = arrname2idxs[arrname]
        array_df_d[arrname] = adf

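    # Concatenate array dataframes (pandas.concat raises on an empty dict,
    # so return an empty DataFrame when the table has no array columns)
    if array_df_d:
        array_df = pandas.concat(array_df_d, axis=1)
    else:
        array_df = pandas.DataFrame(index=tdf.index)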

    return scalar_df, array_df

scalar_df, array_df = load_matlab_csv(filename)
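
To make the numbered-column layout concrete, here is a small self-contained sketch that rebuilds per-row arrays from such columns. The CSV text is made up for illustration, and it assumes shorter arrays appear as empty cells, as the description above suggests; verify this against a real writetable file.

```python
import io
import re

import pandas as pd

# Made-up example: one scalar column and one array spread over vals_1..vals_3
csv_text = "id,vals_1,vals_2,vals_3\n1,10,11,12\n2,20,21,\n"
df = pd.read_csv(io.StringIO(csv_text))

# Group columns named <name>_<index> by their array name
pattern = re.compile(r"^(?P<name>.+)_(?P<idx>\d+)$")
groups = {}
for col in df.columns:
    match = pattern.match(col)
    if match:
        groups.setdefault(match.group("name"), []).append((int(match.group("idx")), col))

# Rebuild each row's array, ordered by index, dropping the empty padding cells
arrays = {}
for name, members in groups.items():
    ordered = [col for _, col in sorted(members)]
    arrays[name] = [row.dropna().tolist() for _, row in df[ordered].iterrows()]

print(arrays["vals"])
```

The same grouping rule as in load_matlab_csv applies: a column named like an array element (for example a scalar called run_2) would be misread as part of an array.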
