R: How to merge these two tables?

ercv8c1e  posted on 2023-10-13 in Other

With R and dplyr:

library(dplyr)

tab1 <- tribble(
  ~code, ~periode, ~prix,
  "a", "2023-01-01", 3,
  "b", "2023-01-01", 2,
  "c", "2023-02-01", 2,
  "d", "2023-02-01", 4
)

tab2 <- tribble(
  ~code, ~`2018-01-01`,  ~`2018-02-01`,  ~`2019-01-01`, ~`2019-02-01`, ~`2020-01-01`,  
  ~`2020-02-01`,~`2021-01-01`,  ~`2021-02-01`, ~`2022-01-01`, ~`2022-02-01`,
  "a", 2, 1, 2, 3, 3, 4, 1, 2, 2, 1,
  "b", 1, 2, 4, 2, 1, 2, 1, 5, 1, 2,
  "c", 2, 1, 5, 6, 1, 3, 4, 3, 1, 3,
  "d", 3, 2, 7, 8, 6, 4, 5, 9, 7, 8
)

I want to add three columns to tab1, N_1, N_2 and N_3, such that, for example, when periode in tab1 is 2023-01-01:

  • N_1 contains the value of 2022-01-01 from tab2
  • N_2 contains the value of 2021-01-01 from tab2
  • N_3 contains the value of 2020-01-01 from tab2
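
The rule in the list above can be sketched in base R as a direct lookup: shift the year of periode back by k, then pick the tab2 cell at (code, shifted date). This is only an illustration under assumptions: `lag_col` is a hypothetical helper that does not appear in the original post, the two tables are cut-down copies of tab1 and tab2, and the dates are assumed to be strings in YYYY-MM-DD form.

```r
# Hypothetical helper (not from the original post): name of the tab2 column
# that lies k years before periode. Assumes "YYYY-MM-DD" strings.
lag_col <- function(periode, k) {
  year <- as.integer(substr(periode, 1, 4))
  paste0(year - k, substr(periode, 5, 10))
}

# Cut-down copies of the tables from the question
tab1 <- data.frame(code = c("a", "c"),
                   periode = c("2023-01-01", "2023-02-01"))
tab2 <- data.frame(code = c("a", "c"),
                   `2022-01-01` = c(2, 1),
                   `2022-02-01` = c(1, 3),
                   check.names = FALSE)

# Matrix indexing: one (row, column) pair per row of tab1
vals <- as.matrix(tab2[, -1])
rows <- match(tab1$code, tab2$code)
cols <- match(lag_col(tab1$periode, 1), colnames(vals))
tab1$N_1 <- vals[cbind(rows, cols)]
```

Repeating the last three lines with k = 2 and k = 3 would give N_2 and N_3, provided the corresponding columns exist in tab2.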

Here is a minimal reproducible example, with the expected result:

# A tibble: 4 × 6
  code  periode     prix   N_1   N_2   N_3
  <chr> <chr>      <dbl> <dbl> <dbl> <dbl>
1 a     2023-01-01     3     2     1     3
2 b     2023-01-01     2     1     1     1
3 c     2023-02-01     2     3     3     3
4 d     2023-02-01     4     8     9     4

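In the real table there are of course all the months of every year, which is why finding the right column to add matters.
I tried something like this: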

result <- tab1 |>
 group_by(periode) |> 
 mutate(N_1 = tab2[[""]]...

Thanks in advance!

8fsztsew  #1

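It looks like what you want is to pivot tab2 into long form, keep the years 2020 to 2022, rename them to N_1 through N_3, and then pivot back to wide form keyed by month and code. Note that the years are hard-coded here, so this matches the example where every periode is in 2023.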

library(tidyverse)
library(lubridate)

tab2_revised <- tab2 %>%
  # one row per (code, date) pair
  pivot_longer(-code, names_to = "date") %>%
  mutate(year = year(ymd(date)),
         month = month(ymd(date))) %>%
  # keep the three years before 2023
  filter(year %in% 2020:2022) %>%
  select(-date) %>%
  mutate(year = recode(year, `2022` = "N_1", `2021` = "N_2", `2020` = "N_3")) %>%
  # one column per relative year, one row per (month, code)
  pivot_wider(id_cols = c("month", "code"), names_from = "year")

tab2_revised
## A tibble: 8 × 5
#  month code    N_3   N_2   N_1
#  <dbl> <chr> <dbl> <dbl> <dbl>
#1     1 a         3     1     2
#2     2 a         4     2     1
#3     1 b         1     1     1
#4     2 b         2     5     2
#5     1 c         1     4     1
#6     2 c         3     3     3
#7     1 d         6     5     7
#8     2 d         4     9     8

Once we have that, we can join it to tab1 by code and month:

tab1 %>%
  mutate(month = month(ymd(periode))) %>%
  left_join(tab2_revised, by = c("code","month")) %>%
  select(code,periode,prix,N_1,N_2,N_3)
## A tibble: 4 × 6
#  code  periode     prix   N_1   N_2   N_3
#  <chr> <chr>      <dbl> <dbl> <dbl> <dbl>
#1 a     2023-01-01     3     2     1     3
#2 b     2023-01-01     2     1     1     1
#3 c     2023-02-01     2     3     3     3
#4 d     2023-02-01     4     8     9     4
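
If the periods in tab1 span several years, the hard-coded 2020:2022 filter no longer fits. One possible variation, shown here only as a sketch, computes the lagged dates per row and joins on them exactly. The `lag` and `periode_date` columns are illustrative names that are not in the original answer, and the two tables are cut-down copies of tab1 and tab2.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Cut-down copies of the tables from the question
tab1 <- tribble(
  ~code, ~periode, ~prix,
  "a", "2023-01-01", 3,
  "c", "2023-02-01", 2
)
tab2 <- tribble(
  ~code, ~`2020-01-01`, ~`2020-02-01`, ~`2021-01-01`,
  ~`2021-02-01`, ~`2022-01-01`, ~`2022-02-01`,
  "a", 3, 4, 1, 2, 2, 1,
  "c", 1, 3, 4, 3, 1, 3
)

# Long form of tab2 with real Date values
tab2_long <- tab2 %>%
  pivot_longer(-code, names_to = "date") %>%
  mutate(date = ymd(date))

result <- tab1 %>%
  mutate(periode_date = ymd(periode)) %>%
  expand_grid(lag = 1:3) %>%                      # three rows per tab1 row
  mutate(date = periode_date - years(lag)) %>%    # date 1, 2, 3 years back
  left_join(tab2_long, by = c("code", "date")) %>%
  select(code, periode, prix, lag, value) %>%
  pivot_wider(names_from = lag, values_from = value, names_prefix = "N_")
```

On these sample rows, code a gets N_1 = 2, N_2 = 1, N_3 = 3, which agrees with the expected result in the question. A lagged date that is missing from tab2 yields NA rather than an error, because the join is a left join.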

zzoitvuj  #2

Another solution, this time with dynamic dates. The key is to use seq.Date to generate the dates 1, 2 and 3 years before the current date:

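library(dplyr)   # join_by() and .by need dplyr >= 1.1.0
library(tidyr)

# long form of tab2 with real Date values
df2 <- tab2 %>%
  pivot_longer(-1) %>%
  mutate(name = as.Date(name))

tab1 %>%
  # for each row: periode itself plus the dates 1, 2 and 3 years earlier
  mutate(periode2 = purrr::map(as.Date(periode), seq, by = "-1 year", length.out = 4)) %>%
  unnest(periode2) %>%
  # match each date to the closest tab2 date that is not later
  left_join(df2, join_by(code, closest(periode2 >= name))) %>%
  # number the dates within each tab1 row; id 0 is periode itself
  mutate(id = row_number() - 1, .by = c(code, periode)) %>%
  filter(id != 0) %>%
  select(-name, -periode2) %>%
  pivot_wider(names_from = id, names_prefix = "N_")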

# A tibble: 4 × 6
  code  periode     prix   N_1   N_2   N_3
  <chr> <chr>      <dbl> <dbl> <dbl> <dbl>
1 a     2023-01-01     3     2     1     3
2 b     2023-01-01     2     1     1     1
3 c     2023-02-01     2     3     3     3
4 d     2023-02-01     4     8     9     4
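
To see what the seq call in this answer generates, here is a small base R illustration for a single period. The variable names `periode` and `lags` are chosen for this sketch only.

```r
# seq() on a Date dispatches to seq.Date(); by = "-1 year" steps backwards
periode <- as.Date("2023-02-01")
lags <- seq(periode, by = "-1 year", length.out = 4)

format(lags)
# "2023-02-01" "2022-02-01" "2021-02-01" "2020-02-01"
```

The first element is the period itself, which is why the answer drops the row with id 0; the remaining three are the dates looked up for N_1, N_2 and N_3. Because the join uses closest(periode2 >= name), a date that is absent from tab2 is matched to the nearest earlier date instead of producing NA.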
