Creating a dictionary from a CSV file that has no header row, with the dictionary keys taken from a list given as input

krcsximq, posted on 2022-12-15 in Other

I have tried several different approaches, but none of them works.
For example,

data = readData('energy_2.csv', ['M', 'V', 'H'])

should return:

{'M': [150.270685, 150.062813, 150.090797, 150.050383, 150.065112, 149.968068, 149.915192, 150.060597, 149.798183, 150.074012, 150.052881, 149.9411, 150.01887, 149.924113, 149.906676], 'V': [12.977528, 12.595397, 13.489379, 13.802984, 12.841754, 12.651333, 13.346861, 11.646957, 11.92044, 12.43258, 12.695264, 12.583452, 12.592251, 12.903853, 12.53648], 'H': [75.638787, 75.329646, 74.502896, 74.24593, 74.056594, 75.484752, 74.883227, 76.901755, 75.238127, 76.996652, 74.006737, 75.1968, 73.863355, 75.000366, 76.025984]}

What I get instead is:

{'M': ['150.270685;12.977528;75.638787', '150.062813;12.595397;75.329646', '150.090797;13.489379;74.502896', '150.050383;13.802984;74.24593', '150.065112;12.841754;74.056594', '149.968068;12.651333;75.484752', '149.915192;13.346861;74.883227', '150.060597;11.646957;76.901755', '149.798183;11.92044;75.238127', '150.074012;12.43258;76.996652', '150.052881;12.695264;74.006737', '149.9411;12.583452;75.1968', '150.01887;12.592251;73.863355', '149.924113;12.903853;75.000366', '149.906676;12.53648;76.025984']}

My code:

def readData(filename, labels):
    import pandas as pd
    df = pd.read_csv(filename, header=None)
    return {k: list(v) for k, v in zip(labels, df.values.T)}

The CSV file:

150.270685;12.977528;75.638787
150.062813;12.595397;75.329646
150.090797;13.489379;74.502896
150.050383;13.802984;74.24593
150.065112;12.841754;74.056594
149.968068;12.651333;75.484752
149.915192;13.346861;74.883227
150.060597;11.646957;76.901755
149.798183;11.92044;75.238127
150.074012;12.43258;76.996652
150.052881;12.695264;74.006737
149.9411;12.583452;75.1968
150.01887;12.592251;73.863355
149.924113;12.903853;75.000366
149.906676;12.53648;76.025984

q8l4jmvw1#

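You don't need pandas for this; csv.reader is enough.
Judging from your question, your CSV file is delimited by semicolons. You have to specify that explicitly, because the default delimiter is a comma.
First, create a dictionary whose keys come from labels and whose values are empty lists. Then append the values from each row to the matching list. Since you want float values, remember to convert each value to float before appending it.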

import csv

def readData(filename, labels):
    data = {lbl: [] for lbl in labels}
    with open(filename, "r") as f:
        reader = csv.reader(f, delimiter=";")
        for row in reader:
            for lbl, value in zip(labels, row):
                data[lbl].append(float(value))
    return data

This gives the desired data:

{'M': [150.270685,
  150.062813,
  150.090797,
  150.050383,
  150.065112,
  149.968068,
  149.915192,
  150.060597,
  149.798183,
  150.074012,
  150.052881,
  149.9411,
  150.01887,
  149.924113,
  149.906676],
 'V': [12.977528,
  12.595397,
  13.489379,
  13.802984,
  12.841754,
  12.651333,
  13.346861,
  11.646957,
  11.92044,
  12.43258,
  12.695264,
  12.583452,
  12.592251,
  12.903853,
  12.53648],
 'H': [75.638787,
  75.329646,
  74.502896,
  74.24593,
  74.056594,
  75.484752,
  74.883227,
  76.901755,
  75.238127,
  76.996652,
  74.006737,
  75.1968,
  73.863355,
  75.000366,
  76.025984]}
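
To illustrate why the delimiter matters, here is a minimal sketch that parses two rows copied from the question, once with the default delimiter and once with a semicolon. The in-memory io.StringIO sample stands in for the real file and is only an assumption made for the demonstration.

```python
import csv
import io

# Two sample rows copied from the CSV in the question.
sample = "150.270685;12.977528;75.638787\n150.062813;12.595397;75.329646\n"

# Default delimiter (comma): each line comes back as a single field.
default_rows = list(csv.reader(io.StringIO(sample)))

# Explicit semicolon delimiter: each line is split into three fields.
semicolon_rows = list(csv.reader(io.StringIO(sample), delimiter=";"))

print(len(default_rows[0]))    # 1
print(len(semicolon_rows[0]))  # 3
```

With only one field per row there is only one column to pair with the labels, and zip stops at the shorter input, which is consistent with the question's result containing just the 'M' key.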

j8yoct9x2#

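The problem is the default separator (sep=','). Try setting sep=';' instead of relying on the default. You can also pass the input list labels as names, so that the columns are named after your labels.

For example: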

import pandas as pd

def readData(filename, labels):
    df = pd.read_csv(filename, header=None, sep=";", names=labels)
    return list(df['M'])

data = readData('energy_2.csv', ['M', 'V', 'H'])
print(data)

Output:

[150.270685, 150.062813, 150.090797, 150.050383, 150.065112, 149.968068, 149.915192, 150.060597, 149.798183, 150.074012, 150.052881, 149.9411, 150.01887, 149.924113, 149.906676]
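
The function above returns only the 'M' column. If the full dictionary from the question is wanted, one possible variant is to convert the whole DataFrame with DataFrame.to_dict(orient="list"). The sketch below is an assumption about how that could look, not part of the original answer; the name read_data_as_dict and the in-memory two-row sample are hypothetical and used only for illustration.

```python
import io

import pandas as pd


def read_data_as_dict(source, labels):
    # header=None: the file has no header row; names=labels names the columns.
    df = pd.read_csv(source, header=None, sep=";", names=labels)
    # orient="list" maps each column name to a plain list of its values.
    return df.to_dict(orient="list")


# Two rows copied from the question's CSV, held in memory instead of a file.
sample = io.StringIO(
    "150.270685;12.977528;75.638787\n"
    "150.062813;12.595397;75.329646\n"
)
data = read_data_as_dict(sample, ["M", "V", "H"])
print(data)
```

Passing a file name such as 'energy_2.csv' instead of the StringIO object should work the same way, since pd.read_csv accepts both paths and file-like objects.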

Source: pandas.read_csv (docs)
Side note: the answer above assumes that energy_2.csv looks like this:

150.270685;12.977528;75.638787
150.062813;12.595397;75.329646
150.090797;13.489379;74.502896
150.050383;13.802984;74.24593
150.065112;12.841754;74.056594
149.968068;12.651333;75.484752
149.915192;13.346861;74.883227
150.060597;11.646957;76.901755
149.798183;11.92044;75.238127
150.074012;12.43258;76.996652
150.052881;12.695264;74.006737
149.9411;12.583452;75.1968
150.01887;12.592251;73.863355
149.924113;12.903853;75.000366
149.906676;12.53648;76.025984

lnlaulya3#

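As already mentioned, the main problem is the ; delimiter in the .csv file (the default delimiter is a comma, as the name "csv" implies).
I first read the .csv as a list of triples (in this case), then turn that into three lists of values, and finally into a dictionary.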

def readData(filename, labels):
    import csv
    with open(filename) as f:
        data = list(csv.reader(f, delimiter = ';'))
        return dict([[ labels[i], [d[i] for d in data if d]] for i in range(len(labels))])
        
headers = ['M', 'V', 'H']

print(readData('test.csv', headers))

# {'M': ['150.270685', '150.062813', '150.090797', '150.050383', '150.065112', '149.968068', '149.915192', '150.060597', '149.798183', '150.074012', '150.052881', '149.9411', '150.01887', '149.924113', '149.906676'], 'V': ['12.977528', '12.595397', '13.489379', '13.802984', '12.841754', '12.651333', '13.346861', '11.646957', '11.92044', '12.43258', '12.695264', '12.583452', '12.592251', '12.903853', '12.53648'], 'H': ['75.638787', '75.329646', '74.502896', '74.24593', '74.056594', '75.484752', '74.883227', '76.901755', '75.238127', '76.996652', '74.006737', '75.1968', '73.863355', '75.000366', '76.025984']}
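
Note that the values in the output above are strings, whereas the question asks for floats. A minimal sketch of the same idea with the conversion added is shown below; it transposes the rows with zip(*rows) rather than indexing, and the name read_data_as_floats and the in-memory sample (including a blank line) are hypothetical, chosen only for this illustration.

```python
import csv
import io


def read_data_as_floats(source, labels):
    # `source` is any open text file or file-like object.
    reader = csv.reader(source, delimiter=";")
    # Skip blank lines, which csv.reader yields as empty lists.
    rows = [row for row in reader if row]
    # zip(*rows) transposes the list of rows into one tuple per column.
    columns = zip(*rows)
    return {
        label: [float(value) for value in column]
        for label, column in zip(labels, columns)
    }


sample = io.StringIO(
    "150.270685;12.977528;75.638787\n"
    "\n"
    "150.062813;12.595397;75.329646\n"
)
print(read_data_as_floats(sample, ["M", "V", "H"]))
# {'M': [150.270685, 150.062813], 'V': [12.977528, 12.595397], 'H': [75.638787, 75.329646]}
```

To use it with a real file, open the file first and pass the file object, for example inside a with open(filename, newline="") as f: block.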
