pandas: Using regex to pull ID-specific columns in a DataFrame

Posted by dzjeubhm on 2023-04-10 in Other


I have a DataFrame from a tissue-section dataset with the following columns:

Image
Name
tumor_stroma_epi_nsclc_v2: Epithelium %
tumor_stroma_epi_nsclc_v2: Epithelium area µm^2
tumor_stroma_epi_nsclc_v2: Necrosis %
tumor_stroma_epi_nsclc_v2: Necrosis area µm^2
tumor_stroma_epi_nsclc_v2: Stroma %
tumor_stroma_epi_nsclc_v2: Stroma area µm^2
tumor_stroma_epi_nsclc_v2: Tumor %
tumor_stroma_epi_nsclc_v2: Tumor area µm^2
Area µm^2

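The nsclc_v2 part of the column names varies across datasets, depending on the tissue type. I want to write a regular expression that drops the % columns and recognizes every column with the same format but a different tissue type. This is all I have come up with so far: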

tumor_temp.drop(columns=['Image','Name',
                         '^tumor_stroma_epi_[a-z0-9_]: Epithelium %$',
                         '^tumor_stroma_epi_[a-z0-9_]: Necrosis %$',
                         '^tumor_stroma_epi_[a-z0-9_]: Stroma %$',
                         '^tumor_stroma_epi_[a-z0-9_]: Tumor %$',
                         'Area Âµ?m^2'], inplace=True)

Apologies if this is basic; I mostly have an R background.

pandas

Source: https://stackoverflow.com/questions/75952400/using-regex-to-pull-id-specific-columns-in-a-dataframe

1 Answer

42fyovps1#

You can use the pandas filter() method to select the columns whose names match a regular expression, then drop them:

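# filter(regex=...) is documented as taking a pattern string, so no re.compile is needed.
# The pattern matches any "tumor_stroma_epi_<suffix>: <tissue> %" column label.
pattern = r"^tumor_stroma_epi_[a-z0-9_]+:.*%$"
cols_to_drop = df.filter(regex=pattern).columns  # labels of the matching columns
df.drop(columns=cols_to_drop, inplace=True)      # drop them from df in place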
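For context, the attempt in the question fails because drop() compares labels literally and does not interpret them as regular expressions; the character class `[a-z0-9_]` is also missing a `+` quantifier. The following is a minimal sketch of the same idea on a toy frame, not a definitive implementation. The sample values are made up, the "area µm^2" label wording is an assumption about the real data, and the boolean-mask approach via `df.columns.str.contains` is an alternative to filter() rather than the method used above.

```python
import pandas as pd

# Toy frame reusing column names from the question; the row values are invented.
columns = [
    "Image",
    "Name",
    "tumor_stroma_epi_nsclc_v2: Epithelium %",
    "tumor_stroma_epi_nsclc_v2: Epithelium area µm^2",  # assumed label wording
    "tumor_stroma_epi_nsclc_v2: Necrosis %",
    "tumor_stroma_epi_nsclc_v2: Necrosis area µm^2",    # assumed label wording
    "Area µm^2",
]
df = pd.DataFrame(
    [["img1.tif", "s1", 40.0, 1200.0, 5.0, 150.0, 3000.0]],
    columns=columns,
)

# One pattern covers any dataset-specific suffix (nsclc_v2, ...) and any tissue type.
percent_pattern = r"^tumor_stroma_epi_[a-z0-9_]+: .* %$"

# Boolean mask over the column labels: True where the label is a "%" column.
is_percent = df.columns.str.contains(percent_pattern, regex=True)

# Fixed labels are dropped by exact name; errors="ignore" skips any that are absent.
literal_labels = ["Image", "Name", "Area µm^2"]
result = df.loc[:, ~is_percent].drop(columns=literal_labels, errors="ignore")

print(list(result.columns))
```

Keeping the regex for the variable labels and a plain list for the fixed ones avoids having to escape characters such as `^` in "Area µm^2".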

2023-04-10
