pandas: how to get the file name without the path

bvn4nwqk  posted on 2023-04-19  in Other

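I have a list of "pickle" files (see Figure 1). I want to use the file names as the index in pandas, but so far I only get the whole (very long) path plus the file name.
I found this link: How to get the filename without the extension from a path in Python?
The answer there is to use ".stem" somewhere in my code, but I just don't know where. Also, my files have no extension.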

import pandas as pd
import glob
from pathlib import Path

# This is the path to the folder which contains all the "pickle" files
dir_path = Path(r'C:\Users\OneDrive\Projects\II\Coral\Classification\inference_time')
files = dir_path.glob('**/file_inference_time*')  

df_list = list()  #This is an empty list

for file in files:
    df = pd.DataFrame(pd.read_pickle(file)) #storing the "pickle" files in a dataframe

    df['file'] = file  #creating a column 'file' which has the path + file

    df_list.append(df)  #sending all dataframes into a list

df_list_all = pd.concat(df_list).reset_index(drop=True) #merging all dataframes into a single one

df_list_all

This is what I get:

   Inference_Time  file
0            2.86  C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_InceptionV1
1           30.96  C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_mobileNetV2
2           11.04  C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_efficientNet

This is what I want:

              Inference_Time  file
InceptionV1             2.86  C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_InceptionV1
mobileNetV2            30.96  C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_mobileNetV2
efficientNet           11.04  C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_efficientNet

Figure 1


62o28rlo1#

You can transform your output like this:

In [1603]: df
Out[1603]: 
   Inference_Time                                               file
0            2.86  C:\Users\OneDrive\Projects\Classification\infe...
1           30.96  C:\Users\OneDrive\Projects\Classification\infe...
2           11.04  C:\Users\OneDrive\Projects\Classification\infe...

In [1607]: df = df.set_index(df['file'].str.split('inference_time_').str[-1])   

In [1610]: df.index.name = None  # clear the index name; `del df.index.name` fails on recent pandas versions

In [1608]: df
Out[1608]: 
              Inference_Time                                               file
InceptionV1             2.86  C:\Users\OneDrive\Projects\Classification\infe...
mobileNetV2            30.96  C:\Users\OneDrive\Projects\Classification\infe...
efficientNet           11.04  C:\Users\OneDrive\Projects\Classification\infe...
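
The question also asked where in the loop the name should be taken from. A minimal sketch follows, assuming every file name starts with the prefix "inference_time_" as in the output above. The helper name `load_inference_times` and the temporary demo files are illustrative only and not part of the original code. Since the files have no extension, `file.name` and `file.stem` give the same string here.

```python
import tempfile
from pathlib import Path

import pandas as pd

PREFIX = "inference_time_"  # assumption: common prefix of every file name


def load_inference_times(dir_path: Path) -> pd.DataFrame:
    """Concatenate the pickled tables, indexed by the model name in each file name."""
    frames = []
    for file in sorted(dir_path.glob(f"**/{PREFIX}*")):
        df = pd.DataFrame(pd.read_pickle(file))
        df["file"] = str(file)  # keep the full path as a column
        # file.name drops the directories; the split drops the prefix
        df.index = [file.name.split(PREFIX)[-1]] * len(df)
        frames.append(df)
    return pd.concat(frames)


# Demo with throwaway files standing in for the real pickles
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    for model, seconds in [("InceptionV1", 2.86), ("mobileNetV2", 30.96)]:
        frame = pd.DataFrame({"Inference_Time": [seconds]})
        frame.to_pickle(tmp_path / f"{PREFIX}{model}")
    result = load_inference_times(tmp_path)

print(result["Inference_Time"])
```

Setting the index inside the loop means no `reset_index(drop=True)` is needed after `pd.concat`, because each frame already carries the label you want.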

56lgkhnf2#

Check out pandas-path, which gives you a .path accessor on Series that exposes all the ordinary pathlib methods and properties.

import pandas as pd
from pandas_path import path

# these could be Windows paths; POSIX paths are used here only because this was run on a POSIX machine
data = [
    ("folder/inference_time_InceptionV1", 10),
    ("folder2/inference_time_mobileNetV2", 20),
    ("folder4/inference_time_efficientNet", 30),
]

df = pd.DataFrame(data, columns=['file', 'time'])
(
    df.file.path.name  # use path accessor from pandas_path to get just the filename
     .str.split('_')   # split into components based on "_" 
     .str[-1]          # select last component
)
#> 0     InceptionV1
#> 1     mobileNetV2
#> 2    efficientNet
#> Name: file, dtype: object

Created at 2021-03-06 10:57:59 PST by reprexlite v0.4.2
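
If you would rather not install an extra package, roughly the same result can be had with the standard library's pathlib. This is only a sketch: it assumes the paths are stored as strings and uses `PureWindowsPath`, which accepts both backslash and forward-slash separators on any operating system. The sample rows reuse values from the question.

```python
from pathlib import PureWindowsPath

import pandas as pd

df = pd.DataFrame({
    "file": [
        r"C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_InceptionV1",
        "folder2/inference_time_mobileNetV2",
    ],
    "Inference_Time": [2.86, 30.96],
})

# Keep only the last path component of every entry
names = df["file"].map(lambda p: PureWindowsPath(p).name)

# Drop the common prefix and use the remainder as the index
df.index = names.str.split("inference_time_").str[-1].tolist()

print(df.index.tolist())
```

Converting to a plain list before assigning keeps the new index unnamed, so there is no index name to clear afterwards.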
