R: Creating routes from steps in a data frame

9lowa7mx posted on 2023-02-14 in Other

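Here is a data frame I have.

This is a simplified example, but the logic is as follows: for a given product code, there can be several final destinations and several intermediate steps. A route can go from a plant straight to a final destination, or from a plant to step 1, then to step x, and so on, to a final destination.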

site <- c("DC_Frankfurt","F6_DC_Bordeaux","B3_Paris","BEAG_Toronto","DC_Frankfurt","Final_dest1","Final2","Final3")
product_code <- c("000001","000001","000001","000001","000002","000001","000001","000001")
transfersite <- c("Plant1","DC_Frankfurt","DC_Frankfurt","DC_Frankfurt","Plant2","B3_Paris","BEAG_Toronto","F6_DC_Bordeaux")

df <- data.frame(transfersite, product_code,site)

This is what I expect:

product_code <- c("000001","000001","000001","000002")
step1 <- c("Plant1","Plant1","Plant1","Plant2")
step2 <- c("DC_Frankfurt","DC_Frankfurt","DC_Frankfurt","DC_Frankfurt")
step3 <- c("F6_DC_Bordeaux","B3_Paris","BEAG_Toronto",NA)
step4 <- c("Final3","Final_dest1","Final2",NA)

result_expected <- data.frame(product_code,step1,step2,step3,step4)

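So far I have tried the following. It works reasonably well, but if there are more than 4 steps I am stuck, and if there are fewer, the code loops on the last step. I also don't know how to merge everything into a single row per route, so the result does not match what I expect yet.

library(dplyr)
library(stringr)

my_test <- df %>% 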
  filter(str_detect(transfersite,"Plant" )) %>%
  mutate(step1 = transfersite,
         step2 = site) %>%
  full_join(df)

my_test <- my_test %>%
  semi_join(my_test, by = c("product_code" = "product_code", "transfersite" = "step2")) %>%
  mutate(step3 = site) %>%
  full_join(my_test)

my_test <- my_test %>%
  semi_join(my_test, by = c("product_code" = "product_code", "transfersite" = "step3")) %>%
  mutate(step4 = site) %>%
  full_join(my_test)

Thanks, everyone.

mzmfm0qo #1

Something like this, perhaps?

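library(dplyr)

# Treat every row as a two-step route, then keep joining on the next leg.
i <- 2; var <- paste0("step", i)
dfnew <- rename(df, step1 = transfersite, step2 = site)
# Stop once the newest step column is entirely NA, or at the safety cap of 22.
while (i < 22 && any(!is.na(dfnew[[ var ]]))) {
  prevvar <- var
  i <- i + 1; var <- paste0("step", i)
  dfnew <- left_join(dfnew, rename(df, !!prevvar := transfersite, !!var := site),
                     by = c("product_code", prevvar))
}
dfnew %>%
  # cur_data() is deprecated as of dplyr 1.1.0; pick(everything()) replaces it.
  mutate(NAs = rowSums(is.na(cur_data()))) %>%
  group_by(product_code) %>%
  # Keep only the most complete routes for each product code.
  filter(NAs == min(NAs)) %>%
  ungroup() %>%
  # Drop the helper column and the last step column, which is all NA.
  select(product_code, everything(), -!!var, -NAs)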
# # A tibble: 4 × 5
#   product_code step1  step2        step3          step4      
#   <chr>        <chr>  <chr>        <chr>          <chr>      
# 1 000001       Plant1 DC_Frankfurt F6_DC_Bordeaux Final3     
# 2 000001       Plant1 DC_Frankfurt B3_Paris       Final_dest1
# 3 000001       Plant1 DC_Frankfurt BEAG_Toronto   Final2     
# 4 000002       Plant2 DC_Frankfurt NA             NA

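I added i < 22 only to guard against an infinite loop, which can happen when the graph contains a cyclic path. The "22" is arbitrary; if you expect all real paths to be shorter than (say) 10 steps, that is a fine number too.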
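As a complement to the loop cap, here is a minimal sketch of how one might check for cyclic paths before building routes. The helper name has_cycle and the extra Final3 -> DC_Frankfurt row are assumptions made up for illustration; they are not part of the answer above. The idea is to repeatedly drop edges that lead to a dead end; any edge that survives lies on, or leads into, a cycle.

```r
# Example data from the question.
site <- c("DC_Frankfurt","F6_DC_Bordeaux","B3_Paris","BEAG_Toronto","DC_Frankfurt","Final_dest1","Final2","Final3")
product_code <- c("000001","000001","000001","000001","000002","000001","000001","000001")
transfersite <- c("Plant1","DC_Frankfurt","DC_Frankfurt","DC_Frankfurt","Plant2","B3_Paris","BEAG_Toronto","F6_DC_Bordeaux")
df <- data.frame(transfersite, product_code, site)

# Hypothetical helper: TRUE if the transfersite -> site edges of any single
# product_code contain a cycle.
has_cycle <- function(edges) {
  for (code in unique(edges$product_code)) {
    e <- edges[edges$product_code == code, ]
    repeat {
      # An edge is a dead end if its destination has no outgoing edge left.
      dead_end <- !(e$site %in% e$transfersite)
      if (!any(dead_end)) break
      e <- e[!dead_end, ]
    }
    # Edges that could not be removed sit on, or lead into, a cycle.
    if (nrow(e) > 0) return(TRUE)
  }
  FALSE
}

# Made-up extra row that sends Final3 back to DC_Frankfurt, closing a loop.
df_cyclic <- rbind(df, data.frame(transfersite = "Final3",
                                  product_code = "000001",
                                  site = "DC_Frankfurt"))

acyclic_result <- has_cycle(df)
cyclic_result <- has_cycle(df_cyclic)
```

If the check returns TRUE, the cap in the while loop is what ends the iteration, so the resulting step columns should be treated with caution.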

pprl5pva #2

Here is a recursive function that keeps adding steps until there are no more to add:

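library(dplyr)   # >= v1.1.0
library(stringr)

# `edges` keeps the original transfersite -> site table available at every
# recursion level, so the function does not depend on a global `df`.
route_steps <- function(data, step = 1, max_steps = Inf, edges = data) {
  step_name <- paste0("step", step)
  if (step == 1) {
    # Routes start at rows whose transfersite is a plant.
    out <- data %>% 
      filter(str_detect(transfersite, "Plant")) %>%
      rename(!!step_name := transfersite)
  } else {
    # Match the current last stop (step_name) to the next leg's transfersite.
    keys <- c("product_code", "transfersite")
    names(keys) <- c("product_code", step_name)
    out <- data %>% 
      rename(!!step_name := site) %>%
      left_join(edges, by = keys, multiple = "all")
  }
  # Stop when no route can be extended, or when max_steps is reached.
  if (all(is.na(out$site)) || step == max_steps) {
    mutate(out, site = NULL)
  } else {
    route_steps(out, step = step + 1, max_steps = max_steps, edges = edges)
  }
}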

route_steps(df)

Result:
#    step1 product_code        step2          step3       step4
# 1 Plant1       000001 DC_Frankfurt F6_DC_Bordeaux      Final3
# 2 Plant1       000001 DC_Frankfurt       B3_Paris Final_dest1
# 3 Plant1       000001 DC_Frankfurt   BEAG_Toronto      Final2
# 4 Plant2       000002 DC_Frankfurt           <NA>        <NA>

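Things may break down if, for example, circular routes are possible. As a fallback, you can try setting the max_steps argument, which may or may not help. I have not tested circular routes.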
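For circular routes, one possible approach is to walk the edges depth-first and refuse to revisit a site that is already on the current path. The following is a base-R sketch under that assumption, not a tested drop-in replacement for either answer. The names walk_routes, build_routes and start_pattern are hypothetical, and starting points are assumed to be sites whose name begins with "Plant", as in the question.

```r
# Example data from the question.
site <- c("DC_Frankfurt","F6_DC_Bordeaux","B3_Paris","BEAG_Toronto","DC_Frankfurt","Final_dest1","Final2","Final3")
product_code <- c("000001","000001","000001","000001","000002","000001","000001","000001")
transfersite <- c("Plant1","DC_Frankfurt","DC_Frankfurt","DC_Frankfurt","Plant2","B3_Paris","BEAG_Toronto","F6_DC_Bordeaux")
df <- data.frame(transfersite, product_code, site)

# Hypothetical helper: returns a list of complete paths that extend `path`.
walk_routes <- function(edges, path) {
  here <- path[length(path)]
  # Candidate next stops, excluding sites already visited on this path.
  nxt <- setdiff(edges$site[edges$transfersite == here], path)
  if (length(nxt) == 0) return(list(path))
  do.call(c, lapply(nxt, function(s) walk_routes(edges, c(path, s))))
}

# Hypothetical wrapper: one row per route, padded with NA to equal width.
build_routes <- function(df, start_pattern = "^Plant") {
  rows <- list()
  for (code in unique(df$product_code)) {
    e <- df[df$product_code == code, ]
    starts <- unique(e$transfersite[grepl(start_pattern, e$transfersite)])
    for (s in starts) {
      for (p in walk_routes(e, s)) rows[[length(rows) + 1]] <- c(code, p)
    }
  }
  if (length(rows) == 0) return(data.frame())
  width <- max(lengths(rows))
  # Indexing past the end of a character vector yields NA, which pads
  # the shorter routes.
  mat <- t(vapply(rows, function(r) r[seq_len(width)], character(width)))
  out <- as.data.frame(mat, stringsAsFactors = FALSE)
  names(out) <- c("product_code", paste0("step", seq_len(width - 1)))
  out
}

routes <- build_routes(df)
```

Because each path remembers the sites it has visited, the recursion ends even when an edge points back to an earlier stop, and routes of different lengths within the same product code are all kept rather than filtered out.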
