I have a data frame like this:
dput(trans_eqtl[1:3,1:10])
structure(list(Gene = c("ENSG00000132819", "ENSG00000101162",
"ENSG00000132819"), `Gene-Chr` = c(20, 20, 20), `Gene-Pos` = c(55975426,
57598009, 55975426), RsId = c("rs6084653", "rs156356", "rs1741314"
), `SNP-Chr` = c(20, 20, 20), `SNP-Pos` = c(4157072, 1819280,
4155193), start = c(57391407, 59019254, 57391407), end = c(57409333,
59025466, 57409333), Ds_cismb = c(56391407, 58019254, 56391407
), De_cismb = c(58409333, 60025466, 58409333)), row.names = c(NA,
3L), class = "data.frame")
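I am trying to keep only the rows whose columns satisfy the following condition:
I want to filter the SNPs by position: if the SNP position (`SNP-Pos`) is greater than De_cismb or less than Ds_cismb, keep that row and add it to the table trans_snp.
I tried the code below, but it does not give me the correct subset:
# check trans_snp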
trans_snp <- NULL
for(i in 1:dim(trans_eqtl)[1]){
if((trans_eqtl$`SNP-Pos`[i] > trans_eqtl$De_cismb[i])==TRUE | (trans_eqtl$`SNP-Pos`[i] < trans_eqtl$Ds_cismb[i])==TRUE){
x <- which(trans_eqtl$`SNP-Pos`[i] > trans_eqtl$De_cismb[i])
y <- which(trans_eqtl$`SNP-Pos`[i] < trans_eqtl$Ds_cismb[i])
value <- trans_eqtl[x,]
value <- trans_eqtl[y,]
}
trans_snp <- rbind(trans_snp,value)
}
This is the output data frame I get:
dput(trans_snp[1:4,1:10])
structure(list(Gene = c("ENSG00000132819", "ENSG00000132819",
"ENSG00000132819", "ENSG00000132819"), `Gene-Chr` = c(20, 20,
20, 20), `Gene-Pos` = c(55975426, 55975426, 55975426, 55975426
), RsId = c("rs6084653", "rs6084653", "rs6084653", "rs6084653"
), `SNP-Chr` = c(20, 20, 20, 20), `SNP-Pos` = c(4157072, 4157072,
4157072, 4157072), start = c(57391407, 57391407, 57391407, 57391407
), end = c(57409333, 57409333, 57409333, 57409333), Ds_cismb = c(56391407,
56391407, 56391407, 56391407), De_cismb = c(58409333, 58409333,
58409333, 58409333)), row.names = c(NA, 4L), class = "data.frame")
It contains only the first row of the input data frame, repeated. Does anyone know where I am making a mistake?
2 answers
6kkfgxo0 #1
With dplyr:
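The code for this answer is not preserved here. As a minimal sketch of a dplyr version, assuming the dplyr package is installed and using a trimmed copy of the question's sample data (only the columns the filter needs), it might look like this:

```r
library(dplyr)

# Trimmed copy of the question's sample data; check.names = FALSE keeps
# the hyphenated column name `SNP-Pos` intact.
trans_eqtl <- data.frame(
  RsId      = c("rs6084653", "rs156356", "rs1741314"),
  `SNP-Pos` = c(4157072, 1819280, 4155193),
  Ds_cismb  = c(56391407, 58019254, 56391407),
  De_cismb  = c(58409333, 60025466, 58409333),
  check.names = FALSE
)

# Keep rows whose SNP lies outside the [Ds_cismb, De_cismb] window.
# Non-syntactic column names must be wrapped in backticks.
trans_snp <- trans_eqtl %>%
  filter(`SNP-Pos` > De_cismb | `SNP-Pos` < Ds_cismb)

print(trans_snp)
```

In the three sample rows every SNP position is below Ds_cismb, so all three rows are kept.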
j7dteeu8 #2
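If I understand correctly, no loop is needed. R is vectorized, and a vectorized comparison gives you a logical index vector. Combine those vectors with the required logical condition and extract the matching rows from the original data set.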
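The approach described above can be sketched in base R as follows. This is an illustration rather than the answerer's original code: the data is a trimmed copy of the question's sample, and the fourth row is hypothetical, added only to show a SNP that falls inside the window and is therefore dropped.

```r
# Trimmed copy of the question's sample data, plus one HYPOTHETICAL row
# ("rs_inside_example", not from the question) that lies inside the window.
trans_eqtl <- data.frame(
  RsId      = c("rs6084653", "rs156356", "rs1741314", "rs_inside_example"),
  `SNP-Pos` = c(4157072, 1819280, 4155193, 57000000),
  Ds_cismb  = c(56391407, 58019254, 56391407, 56391407),
  De_cismb  = c(58409333, 60025466, 58409333, 58409333),
  check.names = FALSE
)

# Each comparison is evaluated for all rows at once and yields a logical
# vector; `|` combines them element-wise.
outside <- trans_eqtl$`SNP-Pos` > trans_eqtl$De_cismb |
           trans_eqtl$`SNP-Pos` < trans_eqtl$Ds_cismb

# Use the logical vector as a row index.
trans_snp <- trans_eqtl[outside, ]

print(trans_snp)
```

This also explains the output in the question. Inside the loop, `which()` is applied to a single TRUE/FALSE value, so it returns either 1 or an empty vector. `trans_eqtl[y, ]` therefore selects row 1 whenever the condition holds, regardless of `i`, and that same row is appended on every iteration.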
Or, equivalently,
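The equivalent form the answer refers to is not preserved here. One base R alternative that gives the same rows is `subset()`, shown below as an assumption about what was meant, again on a trimmed copy of the question's sample data:

```r
# Trimmed copy of the question's sample data.
trans_eqtl <- data.frame(
  RsId      = c("rs6084653", "rs156356", "rs1741314"),
  `SNP-Pos` = c(4157072, 1819280, 4155193),
  Ds_cismb  = c(56391407, 58019254, 56391407),
  De_cismb  = c(58409333, 60025466, 58409333),
  check.names = FALSE
)

# subset() evaluates the condition inside the data frame, so column names
# can be used directly (with backticks for the hyphenated one).
trans_snp <- subset(trans_eqtl, `SNP-Pos` > De_cismb | `SNP-Pos` < Ds_cismb)

print(trans_snp)
```

`subset()` is convenient for interactive work; explicit indexing with a logical vector is generally the safer choice inside functions, because `subset()` relies on non-standard evaluation.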