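I have a data frame with multiple records per patient. The first two columns (V1, V2) are the visit times, the next two (R1, R2) are the disease stage at each of those visits, and the last two (T1, T2) are the treatment the patient was receiving. I want to calculate a start and end time for each patient, based on the first treatment given at visit 1. I was able to calculate the start time using the solution from another post, but I am struggling to find the end time.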
Modifying function
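I was thinking of reusing the `ifelse` approach that worked for the start time, but there are a lot of conditions to account for. If a patient has a recorded start time, then to find the end of treatment I need to go to their second record and check whether R2 is "Response" or "Death". If it is, check whether T1 and T2 are equal to each other. If all of these conditions hold, record the end time, which would be V2, and keep repeating for later records as long as the conditions are still met.
Here is a reproducible example: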
df <- data.frame(
Patient = c('Dave', 'Dave', 'Dave', "Angel", "Angel", "Angel", "Joe", "Joe", "Joe", "Cara", "Cara"),
V1 = c(1, 150, 375, 1, 150, 375, 1, 150, 375, 1, 150),
V2 = c(150, 375,568,150, 375, 568, 150, 375, 568, 150,375),
R1 = c("Disease","Response","Response", "Disease","Disease", "Response","Disease", "Response", "Response", "Disease", "Response"),
R2 = c("Response", "Response", "Response", "Disease", "Response", "Death", "Response", "Disease", "Response", "Response", "Death"),
T1 = c("A","A", "A", "A","B","B", "A","A","C", "A", "A"),
T2 = c("A", "A","B", "B","B","B", "A","C","C" , "A", "A"))
df$start <- NULL
df$start <- ifelse(df$V1 == 1 & df$T1 == df$T2 & df$R2 == "Response", df$V2, NA)
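Dave's end time would be 568, because technically his treatment was A up until the visit at 568 and only then changed. Angel would have no start time, because they never saw a response on their first treatment. Joe's end time would be 150, because he stopped seeing a response at visit 375, so his end time is the same as his start time. Finally, Cara's end time is 375, because we assume she kept responding until death.
I realize this is hard to follow, so I am happy to answer questions in the comments. Thanks in advance!
Edit: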
df <- data.frame(
Patient = c('Dave', 'Dave', 'Dave', "Angel", "Angel", "Angel", "Joe", "Joe", "Joe", "Cara", "Cara", "Tanya", "Tanya", "Tanya", "Tanya"),
V1 = c(1, 150, 375, 1, 150, 375, 1, 150, 375, 1, 150, 1,150, 375,568),
V2 = c(150, 375,568,150, 375, 568, 150, 375, 568, 150,375, 150, 375, 568, 600),
R1 = c("Disease","Response","Response", "Disease","Disease", "Response","Disease", "Response", "Response", "Disease", "Response", "Disease", "Response", "Response", "Disease"),
R2 = c("Response", "Response", "Response", "Disease", "Response", "Death", "Response", "Disease", "Response", "Response", "Death", "Response", "Response", "Disease", "Response"),
T1 = c("A","A", "A", "A","B","B", "A","A","C", "A", "A", "B", "B", "A","B" ),
T2 = c("A", "A","B", "B","B","B", "A","C","C" , "A", "A", "B", "A", "B", "B"))
With your code, Tanya gets 600 when it should be 375.
1 Answer
Here is an approach using dplyr:
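The answer's code is not reproduced at this point, so what follows is only a minimal sketch of one dplyr approach that matches the expected results stated in the question, not necessarily the answerer's original code. It assumes a visit counts toward the first treatment whenever T1 equals the patient's first T1 (regardless of T2, which is what Dave's expected 568 implies) and R2 is "Response" or "Death". The name `result` is illustrative.

```r
library(dplyr)

# Original example data from the question (four patients).
df <- data.frame(
  Patient = c("Dave", "Dave", "Dave", "Angel", "Angel", "Angel",
              "Joe", "Joe", "Joe", "Cara", "Cara"),
  V1 = c(1, 150, 375, 1, 150, 375, 1, 150, 375, 1, 150),
  V2 = c(150, 375, 568, 150, 375, 568, 150, 375, 568, 150, 375),
  R1 = c("Disease", "Response", "Response", "Disease", "Disease", "Response",
         "Disease", "Response", "Response", "Disease", "Response"),
  R2 = c("Response", "Response", "Response", "Disease", "Response", "Death",
         "Response", "Disease", "Response", "Response", "Death"),
  T1 = c("A", "A", "A", "A", "B", "B", "A", "A", "C", "A", "A"),
  T2 = c("A", "A", "B", "B", "B", "B", "A", "C", "C", "A", "A")
)

result <- df |>
  group_by(Patient) |>
  # Assumption: keep records that begin on the patient's first treatment
  # and end in a response or death.
  filter(T1 == first(T1), R2 %in% c("Response", "Death")) |>
  # Earliest and latest V2 among the kept records.
  summarize(start = min(V2), end = max(V2))

print(result)
```

Patients with no qualifying record (Angel here) are dropped from the output rather than shown with NA. Because this version takes the latest qualifying V2 across all of a patient's records, it does not notice when a patient leaves the first treatment and later returns to it.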
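Edit: based on the scenario the OP added, where someone switches away from their original treatment and back again, we might modify it as follows: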
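The modified code is likewise not reproduced here. As a hedged sketch of the idea, each patient's records can be split into consecutive runs ("eras") of qualifying visits and summarized per run. The `era` column name comes from the answer text; the helper column `ok`, the name `result`, and the exact way the run is detected are assumptions made for illustration.

```r
library(dplyr)

# Extended example data from the question's edit (adds Tanya).
df <- data.frame(
  Patient = c("Dave", "Dave", "Dave", "Angel", "Angel", "Angel", "Joe", "Joe",
              "Joe", "Cara", "Cara", "Tanya", "Tanya", "Tanya", "Tanya"),
  V1 = c(1, 150, 375, 1, 150, 375, 1, 150, 375, 1, 150, 1, 150, 375, 568),
  V2 = c(150, 375, 568, 150, 375, 568, 150, 375, 568, 150, 375,
         150, 375, 568, 600),
  R1 = c("Disease", "Response", "Response", "Disease", "Disease", "Response",
         "Disease", "Response", "Response", "Disease", "Response",
         "Disease", "Response", "Response", "Disease"),
  R2 = c("Response", "Response", "Response", "Disease", "Response", "Death",
         "Response", "Disease", "Response", "Response", "Death",
         "Response", "Response", "Disease", "Response"),
  T1 = c("A", "A", "A", "A", "B", "B", "A", "A", "C", "A", "A",
         "B", "B", "A", "B"),
  T2 = c("A", "A", "B", "B", "B", "B", "A", "C", "C", "A", "A",
         "B", "A", "B", "B")
)

result <- df |>
  group_by(Patient) |>
  mutate(
    # Assumption: a record qualifies if it starts on the first treatment
    # and ends in a response or death.
    ok  = T1 == first(T1) & R2 %in% c("Response", "Death"),
    # A new era begins at each qualifying record that follows a
    # non-qualifying one (or opens the patient's history).
    era = cumsum(ok & !lag(ok, default = FALSE))
  ) |>
  filter(ok) |>
  group_by(Patient, era) |>
  summarize(start = min(V2), end = max(V2), .groups = "drop")

print(result)
```

In this sketch an era also ends when the response is lost, not only when the treatment changes, which is a slightly stricter reading than splitting on treatment alone.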
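This outputs a summary for each patient for each "era" during which they were on their first treatment. So in this example I get two output rows for Tanya, based on her two separate runs on treatment B. If you only want the result for the first era, we could add
|> filter(era == 1)
at the end, or even add that condition to the first filter so that later eras are dropped from the analysis altogether.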