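I have a very large CSV file of presence/absence data and want to find the unique genes in it, along with their counts. My data looks like this: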
df <- data.frame(
A = c("G1", "G2", "G3", "G4", "G5","G6","G7", "G8", "G9","G10"),
B = c(1, 0, 1, 0, 1, 1, 1, 0, 0, 0),
C = c(1, 0, 1, 0, 0, 0, 0, 1, 1, 0),
D = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 1),
E = c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0))
The desired output is as follows. The first is a data frame containing only the unique genes:
df_uniq <- data.frame(
A = c("G4", "G5","G6","G7", "G8", "G9","G10"),
B = c(0, 1, 1, 1, 0, 0, 0),
C = c(0, 0, 0, 0, 1, 1, 0),
D = c(0, 0, 0, 0, 0, 0, 1),
E = c(1, 0, 0, 0, 0, 0, 0))
Thanks for your help!
2 Answers
#1 pn9klfpd
A dplyr solution, or alternatively in base R:
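A minimal sketch of how these two variants could look, assuming a gene counts as "unique" when it is present (value 1) in exactly one of the columns B to E, which matches the expected df_uniq in the question. The names df_uniq_dplyr and df_uniq_base are hypothetical and used only for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  A = c("G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"),
  B = c(1, 0, 1, 0, 1, 1, 1, 0, 0, 0),
  C = c(1, 0, 1, 0, 0, 0, 0, 1, 1, 0),
  D = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 1),
  E = c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0))

# dplyr: keep rows whose presence flags add up to exactly 1
# (assumption: "unique" means present in exactly one column)
df_uniq_dplyr <- df %>%
  filter(B + C + D + E == 1)

# base R: same idea; df[, -1] drops the gene-name column A,
# so this form does not need the sample columns listed by name
df_uniq_base <- df[rowSums(df[, -1]) == 1, ]
```

The base R form may be more convenient for a wide file, since it sums every column except the first instead of naming each one.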
#2 mlmc2os5
A dplyr solution:
The equivalent in base R:
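The question also asks for the counts. The following is only a sketch of one way to get them in both dplyr and base R, assuming "counts" means the number of unique genes found in each of the columns B to E. The names counts_dplyr and counts_base are hypothetical.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  A = c("G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"),
  B = c(1, 0, 1, 0, 1, 1, 1, 0, 0, 0),
  C = c(1, 0, 1, 0, 0, 0, 0, 1, 1, 0),
  D = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 1),
  E = c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0))

# Unique genes: present in exactly one column (assumption, as above)
df_uniq <- df[rowSums(df[, -1]) == 1, ]

# dplyr: one-row data frame with the number of unique genes per column
counts_dplyr <- df_uniq %>%
  summarise(across(B:E, sum))

# base R: the same counts as a named numeric vector
counts_base <- colSums(df_uniq[, -1])
```

With the example data, both give 3 unique genes for B, 2 for C, 1 for D, and 1 for E.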