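I'm trying to clean up some messy code I wrote in the past by creating functions that can run the same code across multiple data frames, each of which has different column names.
The data frames I'm working with have the following structure: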
structure(list( PTM = c( "AAAS_T27_M1", "AAK1_T606_M1", "AAK1_T620_M1", "AASDH_S649_M1", "ABCC3_S911_M1", "ABCC4_S655_M1", "ABCC4_S665_M2", "ABCC4_S668_M1", "ABCC4_S668_M2", "ABCC4_T646_M1", "ABCC5_S505_M1", "ABCC5_S505_M2", "ABCC5_S509_M1", "ABCF1_S105_M1", "ABCF1_S105_M2", "ABCF1_S109_M1", "ABCF1_S166_M1", "ABCF1_T108_M1", "ABI1_S183_M1", "ABI2_S183_M1" ), logFC_A = c( NA, NA, -0.797823, 1.04461, NA, NA, NA, NA, NA, NA, NA, NA, 3.83343, NA, -1.37837, 0.943688, NA, 0.813075, NA, 0.474918 ), logFC_B = c( -0.755209, 0.845812, -0.435721, 1.60958, -0.935074, 0.536129, -1.88669, 1.01129, -1.31134, NA, NA, -0.680194, NA, NA, NA, NA, 0.540836, NA, 0.890831, 0.782319 ), logFC_C = c( NA, NA, -0.681984, 1.5103, NA, 0.595031, -1.62621, NA, -1.07332, 0.669169, 0.427444, NA, NA, 0.957807, NA, NA, NA, NA, 0.812133, 0.794539 )))
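I want to create an additional column that reports the status of each PTM under conditions A, B and C: positive (Up), negative (Down), or unchanged (Unaffected, i.e. NA).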
The desired output looks like this:

| PTM | A-B-C |
|:----|:------:|
| AAAS_T27_M1 | Unaffected-Down-Unaffected |
| AAK1_T606_M1 | Unaffected-Up-Unaffected |
Currently, I do this by creating a Status column for each condition (A, B and C) and then uniting them.
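library(dplyr)  # mutate() and %>%
library(tidyr)  # unite()

data %>%
  mutate(Status_A = ifelse(is.na(logFC_A), "Unaffected", ifelse(logFC_A < 0, "Down", "Up"))) %>%
  mutate(Status_B = ifelse(is.na(logFC_B), "Unaffected", ifelse(logFC_B < 0, "Down", "Up"))) %>%
  mutate(Status_C = ifelse(is.na(logFC_C), "Unaffected", ifelse(logFC_C < 0, "Down", "Up"))) %>%
  # unite() joins with "_" by default; sep = "-" gives the hyphenated form of the desired output
  unite(A_B_C, Status_A, Status_B, Status_C, sep = "-")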
However, I have to do this for more than 20 data frames, each of which has unique column names.
1 Answer
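You can use `lapply()` to apply a function to every column of the input data except the first (hence `data[, -1]`). Then use `do.call()` to paste those columns into a single column. Finally, use `cbind()` to join the first column back on. If you need hyphens in the `combined` column, replace `paste` with a custom function that inserts them.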
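The steps described above might look roughly like the following in base R. This is only a sketch, not the answerer's original code: the helper names `classify_status` and `add_combined_status` are made up for illustration, and it assumes the first column holds the identifier and every other column is a numeric log fold change. Instead of a custom paste function, it passes `sep` through `do.call()`, which should give the same hyphenated result.

```r
# Hypothetical helper: turn one numeric column into Up / Down / Unaffected.
classify_status <- function(x) {
  ifelse(is.na(x), "Unaffected", ifelse(x < 0, "Down", "Up"))
}

# Hypothetical helper: works on any data frame whose first column is an ID
# and whose remaining columns are numeric, whatever they are named.
add_combined_status <- function(df, sep = "-") {
  # Classify every column except the first.
  status <- lapply(df[, -1, drop = FALSE], classify_status)
  # Paste the per-column statuses together row by row.
  # unname() keeps column names from being read as arguments of paste().
  combined <- do.call(paste, c(unname(status), sep = sep))
  # Join the identifier column back on.
  cbind(df[, 1, drop = FALSE], combined = combined)
}

# Small example using the first three rows of the data in the question.
data <- data.frame(
  PTM = c("AAAS_T27_M1", "AAK1_T606_M1", "AAK1_T620_M1"),
  logFC_A = c(NA, NA, -0.797823),
  logFC_B = c(-0.755209, 0.845812, -0.435721),
  logFC_C = c(NA, NA, -0.681984)
)

result <- add_combined_status(data)
print(result)
```

Because the function never refers to a column by name, it could be applied to a list of data frames with something like `lapply(list_of_dfs, add_combined_status)`, where `list_of_dfs` is a hypothetical list holding the 20+ data frames.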