pandas Python: Removing paired column names

ctehm74n  posted on 2023-03-28 in Python

I have a DataFrame whose columns look like this:

import pandas as pd

df = pd.DataFrame(columns=['(NYSE_close, close)', '(NYSE_close, open)', '(NYSE_close, volume)', '(NASDAQ_close, close)', '(NASDAQ_close, open)', '(NASDAQ_close, volume)'])

df:
(NYSE_close, close) (NYSE_close, open) (NYSE_close, volume) (NASDAQ_close, close) (NASDAQ_close, open) (NASDAQ_close, volume)

I want to drop everything after the underscore and append whatever comes after the comma, to get the following:

df:
NYSE_close  NYSE_open  NYSE_volume  NASDAQ_close  NASDAQ_open  NASDAQ_volume

I tried stripping the column names, but that replaced them with NaN. Any suggestions?
Thanks in advance.

vc9ivgsu  1#

You can use re.sub to capture the relevant parts of each column name and substitute them back in:

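import re
import pandas as pd

df = pd.DataFrame(columns=['(NYSE_close, close)', '(NYSE_close, open)', '(NYSE_close, volume)', '(NASDAQ_close, close)', '(NASDAQ_close, open)', '(NASDAQ_close, volume)'])
# Group 1 keeps the prefix up to and including the underscore; group 2 keeps the word after the comma.
df.columns = [re.sub(r'\(([^_]+_)\w+, (\w+)\)', r'\1\2', c) for c in df.columns]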

Output:

Empty DataFrame
Columns: [NYSE_close, NYSE_open, NYSE_volume, NASDAQ_close, NASDAQ_open, NASDAQ_volume]
Index: []
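
The same pattern should also work through the vectorised `Index.str.replace`, which avoids the explicit list comprehension. This is a minimal sketch rather than part of the original answer: it assumes a pandas version that accepts `regex=True`, and the single data row is made up only to confirm that renaming leaves the values alone.

```python
import pandas as pd

cols = ['(NYSE_close, close)', '(NYSE_close, open)', '(NYSE_close, volume)',
        '(NASDAQ_close, close)', '(NASDAQ_close, open)', '(NASDAQ_close, volume)']
# One made-up row (illustrative data, not from the question).
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=cols)

# Same pattern as above: group 1 is the prefix up to the underscore,
# group 2 is the word after the comma.
pattern = r'\(([^_]+_)\w+, (\w+)\)'
df.columns = df.columns.str.replace(pattern, r'\1\2', regex=True)

new_cols = list(df.columns)
print(new_cols)
```

Names that do not match the pattern are left unchanged, since the substitution only touches matching text.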

3zwtqj6y  2#

You can pass a conversion function to rename:

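import re

def cvt_col(x):
    # '(NYSE_close, close)' becomes ['NYSE', 'close', 'close'].
    s = re.sub('[()_,]', ' ', x).split()
    # Join the prefix before the underscore with the word after the comma.
    return s[0] + '_' + s[2]

# rename returns a new DataFrame, so assign the result back.
df = df.rename(columns=cvt_col)
df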

Empty DataFrame
Columns: [NYSE_close, NYSE_open, NYSE_volume, NASDAQ_close, NASDAQ_open, NASDAQ_volume]
Index: []
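
One caveat: `cvt_col` raises an `IndexError` for any column name that does not split into three parts (a plain `date` column, for instance). A more defensive variant might look like the sketch below. The helper name `cvt_col_safe` and the extra `date` column are hypothetical, added only for illustration.

```python
import re
import pandas as pd

def cvt_col_safe(name):
    # Hypothetical helper: behaves like cvt_col, but returns the name
    # unchanged when it does not split into exactly three parts.
    parts = re.sub(r'[()_,]', ' ', str(name)).split()
    if len(parts) != 3:
        return name
    return parts[0] + '_' + parts[2]

# 'date' is a made-up column that does not follow the paired pattern.
df = pd.DataFrame(columns=['date', '(NYSE_close, close)', '(NASDAQ_close, volume)'])
renamed = df.rename(columns=cvt_col_safe)
print(list(renamed.columns))
```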

lmvvr0a8  3#

Use a list comprehension, twice:

step1 = [ent.strip('()').split(',') for ent in df]

df.columns = ["_".join([left.split('_')[0], right.strip()])
              for left, right in step1]

df

Empty DataFrame
Columns: [NYSE_close, NYSE_open, NYSE_volume, NASDAQ_close, NASDAQ_open, NASDAQ_volume]
Index: []
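
The printed names look like stringified tuples. If the real columns are tuples (a MultiIndex) rather than strings, which the question does not confirm, string methods such as `strip` will not apply and the pairs can be unpacked directly. A minimal sketch under that assumption:

```python
import pandas as pd

# Assumption: the columns are a two-level MultiIndex, not plain strings.
cols = pd.MultiIndex.from_tuples([
    ('NYSE_close', 'close'), ('NYSE_close', 'open'), ('NYSE_close', 'volume'),
    ('NASDAQ_close', 'close'), ('NASDAQ_close', 'open'), ('NASDAQ_close', 'volume'),
])
df = pd.DataFrame(columns=cols)

# Iterating over a MultiIndex yields (left, right) tuples, so no string
# parsing is needed: keep the part before the underscore, then add the field.
df.columns = [f"{left.split('_')[0]}_{right}" for left, right in df.columns]

flat_cols = list(df.columns)
print(flat_cols)
```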
