R: Create a new column based on two values in another column and number the groups consecutively

cvxl0en2 posted on 2023-10-13 in Other

I have a data frame that looks like this:

structure(list(Date = structure(c(1630544400, 1630548000, 1630551600, 
1630555200, 1630558800, 1630562400, 1630566000, 1630569600, 1630573200, 
1630576800, 1630580400, 1630630800, 1630634400, 1630638000, 1630641600, 
1630645200, 1630648800, 1630652400, 1630656000, 1630659600), tzone = "America/Chicago", class = c("POSIXct", 
"POSIXt")), daytime = c("Night", "Night", "Night", "Night", "Morning", 
"Morning", "Morning", "Morning", "Morning", "Morning", "Morning", 
"Night", "Night", "Night", "Night", "Morning", "Morning", "Morning", 
"Morning", "Morning")), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"))

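I want to create another column that groups each night together with the morning that follows it and numbers those groups in order, so the output looks like this: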

   Date                daytime nightcount
   <dttm>              <chr>        <dbl>
 1 2021-09-01 20:00:00 Night            1
 2 2021-09-01 21:00:00 Night            1  
 3 2021-09-01 22:00:00 Night            1
 4 2021-09-01 23:00:00 Night            1
 5 2021-09-02 00:00:00 Morning          1
 6 2021-09-02 01:00:00 Morning          1
 7 2021-09-02 02:00:00 Morning          1
 8 2021-09-02 03:00:00 Morning          1
 9 2021-09-02 04:00:00 Morning          1
10 2021-09-02 05:00:00 Morning          1
11 2021-09-02 06:00:00 Morning          1
12 2021-09-02 20:00:00 Night            2
13 2021-09-02 21:00:00 Night            2  
14 2021-09-02 22:00:00 Night            2
15 2021-09-02 23:00:00 Night            2  
16 2021-09-03 00:00:00 Morning          2
17 2021-09-03 01:00:00 Morning          2
18 2021-09-03 02:00:00 Morning          2
19 2021-09-03 03:00:00 Morning          2
20 2021-09-03 04:00:00 Morning          2

Is there a simple solution using dplyr?


nxowjjhe1#

You can create a logical value that is TRUE wherever "Morning" changes to "Night", then use cumsum to add up those logical values down the rows:

library(dplyr)

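df |>
  mutate(
    # TRUE on the first "Night" row of each block; the first row of the
    # data is treated as if it were preceded by "Morning"
    nightcount = cumsum(
      daytime == "Night" & lag(daytime, default = "Morning") == "Morning"
    )
  )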
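To see why the running sum works, here is a minimal base R sketch of the same idea on a small made-up vector. The names `prev` and `starts_night` and the toy data are assumptions for illustration only, not part of the answer above.

```r
# Toy stand-in for the `daytime` column (hypothetical data)
daytime <- c("Night", "Night", "Morning", "Morning", "Night", "Morning")

# Value of the previous row; the first row is treated as if it
# followed a "Morning", mirroring lag(..., default = "Morning")
prev <- c("Morning", head(daytime, -1))

# TRUE exactly on the rows where a new night begins
starts_night <- daytime == "Night" & prev == "Morning"

# The running total of night starts is the block number
nightcount <- cumsum(starts_night)
```

Each TRUE bumps the counter by one, and every following row keeps that value until the next "Morning" to "Night" transition. This relies on the rows being in chronological order.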

uyhoqukh2#

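For this dplyr answer, we create a variable to group by (the calendar date on which each night started), then use it to get a sequential ID for the nightcount variable.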

library(dplyr)
library(lubridate)

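df %>%
  # Assign each row the date on which its night began:
  # "Morning" rows belong to the previous calendar day
  mutate(day = if_else(
    daytime == 'Night', date(Date), date(Date) - days(1)
  )) %>%
  # Number the groups in order; `.by` requires dplyr >= 1.1.0
  mutate(nightcount = cur_group_id(),
         .by = day) %>%
  select(-day)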
#> # A tibble: 20 × 3
#>    Date                daytime nightcount
#>    <dttm>              <chr>        <int>
#>  1 2021-09-01 20:00:00 Night            1
#>  2 2021-09-01 21:00:00 Night            1
#>  3 2021-09-01 22:00:00 Night            1
#>  4 2021-09-01 23:00:00 Night            1
#>  5 2021-09-02 00:00:00 Morning          1
#>  6 2021-09-02 01:00:00 Morning          1
#>  7 2021-09-02 02:00:00 Morning          1
#>  8 2021-09-02 03:00:00 Morning          1
#>  9 2021-09-02 04:00:00 Morning          1
#> 10 2021-09-02 05:00:00 Morning          1
#> 11 2021-09-02 06:00:00 Morning          1
#> 12 2021-09-02 20:00:00 Night            2
#> 13 2021-09-02 21:00:00 Night            2
#> 14 2021-09-02 22:00:00 Night            2
#> 15 2021-09-02 23:00:00 Night            2
#> 16 2021-09-03 00:00:00 Morning          2
#> 17 2021-09-03 01:00:00 Morning          2
#> 18 2021-09-03 02:00:00 Morning          2
#> 19 2021-09-03 03:00:00 Morning          2
#> 20 2021-09-03 04:00:00 Morning          2

Created on 2023-10-10 with reprex v2.0.2
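The date-shifting idea in this answer can also be sketched in base R, which may help show what the grouping variable contains. This is only an illustrative sketch: it uses five of the timestamps from the question, and `day` and `tz` are names chosen here. It assumes the rows are in chronological order and that every "Morning" row falls on the calendar day after its night began.

```r
tz <- "America/Chicago"

# Five of the timestamps from the question's data, with their labels
Date <- as.POSIXct(
  c(1630544400, 1630558800, 1630580400, 1630630800, 1630645200),
  origin = "1970-01-01", tz = tz
)
daytime <- c("Night", "Morning", "Morning", "Night", "Morning")

# Calendar date in local time, moved back one day for "Morning" rows
# so that a night and the morning after it share the same date
day <- as.Date(Date, tz = tz) - as.integer(daytime == "Morning")

# Position of each date among the distinct dates, in order of appearance
nightcount <- match(day, unique(day))
```

Here `match(day, unique(day))` plays the role of `cur_group_id()`: both assign 1 to the first night, 2 to the second, and so on.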
