R: create a field that tracks each item's order within a sequence, based on a condition

Posted by mwngjboj on 2022-12-20 in Other

I want to track the order in which items appear in a sequence. For example, the final product should look like this.

library(lubridate)  # provides mdy()

df <- data.frame(paitent=c('Sally', 'Josh', 'Josh', 'Abram','Sally', 'Josh'),
                 visit=mdy(c('2/10/2022', '2/11/2022', '2/12/2022', '2/13/2022', '2/14/2022', '2/15/2022')),
                 visit_count=c(1,1,2,1,2,3))
  paitent      visit
1   Sally 2022-02-10
2    Josh 2022-02-11
3    Josh 2022-02-12
4   Abram 2022-02-13
5   Sally 2022-02-14
6    Josh 2022-02-15

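The "visit_count" column should be populated automatically for each patient name, numbered in date order.
I'm not sure how to proceed. I've looked at using mutate together with nrow() to count rows, but I'm struggling to find a way to filter on a specific name and then count only the dates earlier than the current record's date.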
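
The idea described in the question (restrict to the same patient, then count visit dates relative to the current row) can be written quite literally in base R. The following is only a minimal sketch: `count_prior_visits` is a hypothetical helper name, the comparison uses `<=` rather than `<` so that the first visit is numbered 1, and it assumes no patient has two visits on the same date.

```r
# Example data from the question (built with as.Date so no extra package is needed)
df <- data.frame(
  paitent = c("Sally", "Josh", "Josh", "Abram", "Sally", "Josh"),
  visit = as.Date(c("2022-02-10", "2022-02-11", "2022-02-12",
                    "2022-02-13", "2022-02-14", "2022-02-15"))
)

# Hypothetical helper: for row i, count the rows belonging to the same
# patient whose visit date is on or before row i's date
count_prior_visits <- function(i, data) {
  sum(data$paitent == data$paitent[i] & data$visit <= data$visit[i])
}

# Apply the helper to every row index
df$visit_count <- vapply(seq_len(nrow(df)), count_prior_visits,
                         integer(1), data = df)
df
```

This compares every row against every other row, so it is mainly useful for seeing the logic; a grouped approach is likely to scale better on larger data.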


ftf50wuq1#

library(dplyr)
df %>%
  group_by(paitent) %>%
  mutate(visit_count2 = rank(visit, ties.method = "first")) %>%
  ungroup()
# # A tibble: 6 x 4
#   paitent visit      visit_count visit_count2
#   <chr>   <date>           <dbl>        <int>
# 1 Sally   2022-02-10           1            1
# 2 Josh    2022-02-11           1            1
# 3 Josh    2022-02-12           2            2
# 4 Abram   2022-02-13           1            1
# 5 Sally   2022-02-14           2            2
# 6 Josh    2022-02-15           3            3

Base R

df$visit_count2 <- ave(as.numeric(df$visit), df$paitent, FUN = function(z) rank(z, ties.method = "first"))
df
#   paitent      visit visit_count visit_count2
# 1   Sally 2022-02-10           1            1
# 2    Josh 2022-02-11           1            1
# 3    Josh 2022-02-12           2            2
# 4   Abram 2022-02-13           1            1
# 5   Sally 2022-02-14           2            2
# 6    Josh 2022-02-15           3            3

data.table

library(data.table)
dt <- as.data.table(df)  # keep the converted table so the new column is retained
dt[, visit_count2 := rank(visit, ties.method = "first"), by = .(paitent)]
dt

Data

df <- structure(list(paitent = c("Sally", "Josh", "Josh", "Abram", "Sally", "Josh"), visit = structure(c(19033, 19034, 19035, 19036, 19037, 19038), class = "Date"), visit_count = c(1, 1, 2, 1, 2, 3), visit_count2 = c(1, 1, 2, 1, 2, 3)), row.names = c(NA, -6L), class = "data.frame")

ncecgwcz2#

We can group by paitent, sort by visit in ascending order, and then create visit_count:

library(dplyr)

df %>%
  group_by(paitent) %>%
  arrange(visit) %>%
  mutate(visit_count = row_number())

# A tibble: 6 x 3
# Groups:   paitent [3]
  paitent visit      visit_count
  <fct>   <date>           <int>
1 Sally   2022-02-10           1
2 Josh    2022-02-11           1
3 Josh    2022-02-12           2
4 Abram   2022-02-13           1
5 Sally   2022-02-14           2
6 Josh    2022-02-15           3

avwztpqn3#

library(dplyr)

df %>% 
  group_by(paitent) %>%          # the column in df is spelled "paitent"
  mutate(visit_count = 1:n())    # assumes the rows are already in date order
  paitent visit      visit_count
  <chr>   <date>           <int>
1 Sally   2022-02-10           1
2 Josh    2022-02-11           1
3 Josh    2022-02-12           2
4 Abram   2022-02-13           1
5 Sally   2022-02-14           2
6 Josh    2022-02-15           3
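
One difference between the answers is worth illustrating: a position-based counter such as 1:n() numbers rows in whatever order they currently sit, while ranking the dates does not depend on row order. The sketch below uses base R only and a shuffled copy of the example data (the shuffled order and the column names `by_position` and `by_date` are made up for illustration).

```r
# The question's visits, deliberately shuffled so rows are NOT in date order
df <- data.frame(
  paitent = c("Josh", "Sally", "Josh", "Abram", "Josh", "Sally"),
  visit = as.Date(c("2022-02-15", "2022-02-14", "2022-02-11",
                    "2022-02-13", "2022-02-12", "2022-02-10"))
)

# Position-based counter: numbers rows in their current order within each patient
df$by_position <- ave(seq_along(df$paitent), df$paitent, FUN = seq_along)

# Date-based counter: ranks visit dates within each patient, regardless of row order
df$by_date <- ave(as.numeric(df$visit), df$paitent,
                  FUN = function(z) rank(z, ties.method = "first"))
df
```

On this shuffled input the two columns disagree, so a position-based counter should be preceded by sorting on the visit date if the data may arrive unsorted.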
