How to rewrite a weighted dplyr::top_n() call using the dplyr::slice_* functions

k5hmc34c, posted 2023-01-10 in Other

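I would like to replace the superseded top_n() call in the code below with the recommended slice_max() function, but I don't know how to request a weight with slice_max().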

top10 <- 
  structure(
    list(
      Variable = c("tfidf_text_crossing", "tfidf_text_best", 
                   "tfidf_text_amazing", "tfidf_text_fantastic",
                   "tfidf_text_player", "tfidf_text_great",
                   "tfidf_text_10", "tfidf_text_progress", 
                   "tfidf_text_relaxing", "tfidf_text_fix"), 
      Importance = c(0.428820580430941, 0.412741988094224,
                     0.368676982306671, 0.361409225854695, 
                     0.331176924533776, 0.307393456208119,
                     0.293945850296236, 0.286313554816565, 
                     0.283457020779205, 0.27899280757397), 
      Sign = c(tfidf_text_crossing = "POS", tfidf_text_best = "POS", 
               tfidf_text_amazing = "POS", tfidf_text_fantastic = "POS", 
               tfidf_text_player = "NEG", tfidf_text_great = "POS", 
               tfidf_text_10 = "POS", tfidf_text_progress = "NEG", 
               tfidf_text_relaxing = "POS", tfidf_text_fix = "NEG")
    ), 
    row.names = c(NA, -10L), 
    class = c("vi", "tbl_df", "tbl", "data.frame"), 
    type = "|coefficient|"
  )

suppressPackageStartupMessages(library(dplyr))

top10 |> 
  group_by(Sign) |> 
  top_n(2, wt = abs(Importance))
#> # A tibble: 4 × 3
#> # Groups:   Sign [2]
#>   Variable            Importance Sign 
#>   <chr>                    <dbl> <chr>
#> 1 tfidf_text_crossing      0.429 POS  
#> 2 tfidf_text_best          0.413 POS  
#> 3 tfidf_text_player        0.331 NEG  
#> 4 tfidf_text_progress      0.286 NEG

Created on 2023-01-06 with reprex v2.0.2

I think I get the right answer with:

top10 |> 
  group_by(Sign) |> 
  arrange(desc(abs(Importance))) |> 
  slice_head(n = 2)

But this is harder to follow for the beginners I am teaching. Is there an obvious way to do this with the slice_* functions?

mm9b1k5b #1

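You can let the order_by= argument of slice_max() handle the arranging of the data, which makes the code more readable (and it does mimic your top_n code). Note that the result comes back sorted within each group, so the row order differs from the top_n() output even though the same rows are selected.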

top10 |>
  group_by(Sign) |>
  slice_max(n = 2, order_by = abs(Importance))
# # A tibble: 4 × 3
# # Groups:   Sign [2]
#   Variable            Importance Sign 
#   <chr>                    <dbl> <chr>
# 1 tfidf_text_player        0.331 NEG  
# 2 tfidf_text_progress      0.286 NEG  
# 3 tfidf_text_crossing      0.429 POS  
# 4 tfidf_text_best          0.413 POS
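
As a quick check of the claim that slice_max() mimics the top_n() call, here is a minimal sketch on a small made-up data frame. The name `toy` and its values are hypothetical, not the asker's `top10`; the tie in the NEG group is there on purpose to show how ties are handled.

```r
library(dplyr)

# Hypothetical toy data (not the asker's top10): two groups,
# with a deliberate tie in the "NEG" group.
toy <- tibble(
  Variable   = c("a", "b", "c", "d", "e", "f"),
  Importance = c(0.9, 0.5, 0.1, 0.7, 0.3, 0.3),
  Sign       = c("POS", "POS", "POS", "NEG", "NEG", "NEG")
)

# Superseded call from the question: rows stay in their original order.
old <- toy |>
  group_by(Sign) |>
  top_n(2, wt = abs(Importance))

# Replacement from the answer: rows come back sorted within each group.
new <- toy |>
  group_by(Sign) |>
  slice_max(order_by = abs(Importance), n = 2)

# Both calls keep ties by default, so they select the same set of rows.
setequal(old$Variable, new$Variable)

# with_ties = FALSE returns at most n rows per group.
strict <- toy |>
  group_by(Sign) |>
  slice_max(order_by = abs(Importance), n = 2, with_ties = FALSE)
```

If beginners are surprised by getting more than n rows per group, with_ties = FALSE is the argument to point them to; top_n() has no equivalent switch.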
