pandas read_table on a CSV with a stop string that separates the different DataFrames to be assigned

dbf7pr2w  posted on 2023-03-05  in  Other

I have a csv file of the form:

LINE 1 to SKIP
LINE 2 to SKIP
2.13999987 0.139999986 -0.398405492 1
2.61999989 6.0000062E-2 0.450082362 1
2.74000001 5.99999428E-2 1.04403841 1
2.84000015 4.00000811E-2 6.17375337E-2 1
IGN IGN IGN IGN 
21.4200001 0.420000076 1.53572667 1
22.3199997 0.479999542 -0.595370948 1
23.3199997 0.520000458 0.136062101 1
24.3600006 0.519999504 -0.520044923 1
25.3999996 0.520000458 2.45230961 1
26.4399986 0.519999504 -2.08248448 1
27.4799995 0.520000458 -0.263438225 1
IGN IGN IGN IGN 
58.6800003 0.520000458 -0.789233088 1
59.7200012 0.520000458 -1.02961564 1
60.7600021 0.51999855 -0.889572859 1
61.7999992 0.520000458 -1.03346229 1
62.8400002 0.520000458 4.94940579E-2 1

I want to read it into pandas, with something like:

df_first = pd.read_table('file.txt', names=names, delimiter=' ', skiprows=3, nrows=4)

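(where names holds the name of each column in file.txt). I want to assign each run of rows to a DataFrame with a given name (possibly taken from an array of names) until an IGN IGN IGN IGN line is encountered, then assign the rows that follow to the next DataFrame until the next IGN IGN IGN IGN line, and so on until the end of the file.
Is there a good way to do this?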

8ljdwjyq #1

I ran into this problem a few years ago. My solution was:

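import pandas as pd

names = ['1', '2', '3', '4']
# Read the data. The sample above has two header lines, so skiprows=2 (adjust for your file);
# sep=r'\s+' also tolerates the trailing space on the IGN lines.
df = pd.read_table('file.txt', names=names, sep=r'\s+', skiprows=2)
index = list(df.loc[df['1'] == 'IGN'].index)  # Row labels where IGN occurs
df_list = []  # List that stores the individual DataFrames
start = df.index.min()  # Start label of the current block
for end in index:  # Loop through all separator positions
    # The IGN rows force string columns, so convert each block back to numbers
    df_list.append(df.loc[start:end - 1].apply(pd.to_numeric))
    start = end + 1  # The next block starts right after the separator
df_list.append(df.loc[start:].apply(pd.to_numeric))  # Last slice of the main DataFrame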
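
The same split can also be written without an explicit loop over positions. The following is only a sketch under a few assumptions: the columns are whitespace-separated, the file has two header lines, and a shortened in-memory sample stands in for file.txt. The helper name split_on_marker is made up for illustration. It labels every row with the number of IGN rows seen so far and groups on that label.

```python
import io

import pandas as pd

# Shortened stand-in for file.txt (assumption: whitespace-separated columns,
# two header lines, blocks separated by "IGN IGN IGN IGN" rows).
SAMPLE = """LINE 1 to SKIP
LINE 2 to SKIP
2.13999987 0.139999986 -0.398405492 1
2.61999989 6.0000062E-2 0.450082362 1
IGN IGN IGN IGN
21.4200001 0.420000076 1.53572667 1
22.3199997 0.479999542 -0.595370948 1
23.3199997 0.520000458 0.136062101 1
IGN IGN IGN IGN
58.6800003 0.520000458 -0.789233088 1
"""


def split_on_marker(source, names, marker="IGN", skiprows=2):
    """Hypothetical helper: return one DataFrame per block between marker rows."""
    df = pd.read_table(source, names=names, sep=r"\s+", skiprows=skiprows)
    is_marker = df[names[0]] == marker
    # Every marker row bumps the running count, so all rows between
    # two markers end up with the same block id.
    block_id = is_marker.cumsum()
    data = df[~is_marker]
    return [
        # The marker rows force string columns, so convert back to numbers.
        block.apply(pd.to_numeric).reset_index(drop=True)
        for _, block in data.groupby(block_id[~is_marker])
    ]


blocks = split_on_marker(io.StringIO(SAMPLE), names=["1", "2", "3", "4"])
print([len(b) for b in blocks])  # [2, 3, 1]
```

Because empty groups never show up in the groupby, two IGN lines in a row or an IGN line at the very start do not produce empty DataFrames here.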

The individual DataFrames can then be accessed like this:

df_list[0]
df_list[1]
...
df_list[n]
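
Since the question asks for each block under a given name, one option is to pair the list with an array of names in a dict instead of creating separate variables. This is just a sketch: the small df_list below is a stand-in with abbreviated values, and the names in block_names are hypothetical.

```python
import pandas as pd

# Stand-in for the df_list built above (assumption: three blocks, values abbreviated).
df_list = [
    pd.DataFrame({"1": [2.14, 2.62], "2": [0.14, 0.06]}),
    pd.DataFrame({"1": [21.42, 22.32], "2": [0.42, 0.48]}),
    pd.DataFrame({"1": [58.68], "2": [0.52]}),
]

# Hypothetical names, one per block, in file order.
block_names = ["df_first", "df_second", "df_third"]

# zip() would silently drop extra entries, so check the lengths first.
if len(block_names) != len(df_list):
    raise ValueError("need exactly one name per block")

frames = dict(zip(block_names, df_list))
print(frames["df_second"])
```

A dict keeps the lookup explicit (frames["df_first"]) and still works when the number of blocks is only known after reading the file.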

Regards
