tryCatch inside dplyr's mutate?

Posted by dohp0rv5 on 2023-09-27 in Other


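Is there an exception-handling mechanism in dplyr's mutate()? By that I mean a way to catch exceptions and handle them.
Suppose I have a function that throws an error in some cases (in the example, if the input is not positive). For simplicity I define the function myself, but in real life it would be a function from some R package. Suppose this function is vectorized: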

# function throwing an error
my_func <- function(x){
  if(x > 0) return(sqrt(x))
  stop('x must be positive')
}

my_func_vect <- Vectorize(my_func)

Now suppose I want to use this function inside mutate().
Used inside mutate(), it stops at the first error and returns no result at all:

library(dplyr)
# dummy data
data <- data.frame(x = c(1, -1, 4, 9))
data %>% mutate(y = my_func_vect(x))
# Error in mutate_impl(.data, dots) : Evaluation error: x must be positive.

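Is there a way to catch the error and do something (for example, return NA) while still getting the results for the other elements?
The result I expect is what a loop with tryCatch() would produce, something along these lines: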

y <- rep(NA_real_, length(data$x))
for(i in seq_along(data$x)) {
  tryCatch({
    y[i] <- my_func_vect(data$x[i])
  }, error = function(err){})
}
y
# Result is: 1 NA 2 3

Source: https://stackoverflow.com/questions/50334972/trycatch-inside-dplyrs-mutate

4 Answers


s8vozzvw1#

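We can also use purrr's safely() or possibly() functions.
From the purrr help:

safely: the wrapped function returns a list with components result and error. One of the two is always NULL.
quietly: the wrapped function returns a list with components result, output, messages and warnings.
possibly: the wrapped function uses a default value (otherwise) whenever an error occurs.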
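
To make the three descriptions concrete, here is a minimal sketch of what each wrapper returns when called on a single value. It reuses my_func from the question; the names safe_f, poss_f and quiet_f are made up for this illustration.

```r
library(purrr)

# Same toy function as in the question: errors unless x > 0.
my_func <- function(x) {
  if (x > 0) return(sqrt(x))
  stop("x must be positive")
}

# safely(): always returns list(result = , error = ); one of them is NULL.
safe_f <- safely(my_func)
ok  <- safe_f(4)    # result holds the value, error is NULL
bad <- safe_f(-1)   # result is NULL, error holds the condition object
conditionMessage(bad$error)

# possibly(): returns `otherwise` instead of raising the error.
poss_f <- possibly(my_func, otherwise = NA_real_)
poss_f(4)
poss_f(-1)

# quietly(): captures output, messages and warnings, but does not catch errors.
quiet_f <- quietly(function(x) sqrt(x))
q <- quiet_f(-1)    # result is NaN; the warning text is stored in q$warnings
```

Of the three, only possibly() hands back a plain value, which is why it is the easiest one to drop into mutate().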

This does not change the fact that you have to apply the function to each row separately.

library(dplyr)
library(purrr)

# function throwing an error
my_func <- function(x){
  if(x > 0) return(sqrt(x))
  stop('x must be positive')
}

my_func_vect <- Vectorize(my_func)

# dummy data
data <- data.frame(x = c(1, -1, 4, 9))

Using map:

data %>% 
  mutate(y = map_dbl(x, ~possibly(my_func_vect, otherwise = NA_real_)(.x)))
#>    x  y
#> 1  1  1
#> 2 -1 NA
#> 3  4  2
#> 4  9  3

Using `rowwise()`:

data %>%
  rowwise() %>% 
  mutate(y = possibly(my_func_vect, otherwise = NA_real_)(x))
#> Source: local data frame [4 x 2]
#> Groups: <by row>
#> 
#> # A tibble: 4 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2    -1    NA
#> 3     4     2
#> 4     9     3

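The other functions are somewhat harder to use and apply in a data frame setting, because they are better suited to working with lists, and they return lists.
Created on 2018-05-15 by the reprex package (v0.2.0).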
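
If you do want to keep the error message as well, safely() can still be used in a data frame by going through a temporary list column. The following is only a sketch under that assumption; the column names out and msg are invented for this example.

```r
library(dplyr)
library(purrr)

my_func <- function(x) {
  if (x > 0) return(sqrt(x))
  stop("x must be positive")
}

data <- data.frame(x = c(1, -1, 4, 9))

res <- data %>%
  mutate(
    # list column: one list(result, error) per row
    out = map(x, safely(my_func)),
    # pull the numeric result, NA where the call failed
    y   = map_dbl(out, ~ if (is.null(.x$result)) NA_real_ else .x$result),
    # pull the error message, NA where the call succeeded
    msg = map_chr(out, ~ if (is.null(.x$error)) NA_character_ else conditionMessage(.x$error))
  ) %>%
  select(-out)

res
```

This costs two extra extraction steps compared with possibly(), so it is probably only worth it when the messages themselves are needed.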


sqyvllje2#

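If you want to evaluate each error individually as it occurs, perhaps you should not use a vectorized function. Instead, use map from the purrr package; here it is effectively the same as lapply.
If you want NA values when an error occurs, write a function that catches the error.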

try_my_func <- function(x) {
  tryCatch(my_func(x), error = function(err){NA})
}

Then use mutate together with map

data %>% mutate(y = purrr::map(x, try_my_func))
   x  y
1  1  1
2 -1 NA
3  4  2
4  9  3

Or similarly, if you don't want to declare a new function.

data %>% mutate(y = purrr::map(x, ~ tryCatch(my_func(.), error = function(err){NA})))

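Finally, if you want to use a vectorized function, you can skip the map function entirely. Personally, though, I have never used Vectorize, so I would go with map.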

data %>% mutate(y = Vectorize(try_my_func)(x))
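
One thing to be aware of: purrr::map() returns a list, so y in the map() versions above is a list column rather than a numeric one. A possible variation, assuming a plain numeric column is wanted, is map_dbl() together with NA_real_ as the fallback. The name try_my_func_dbl is made up for this sketch.

```r
library(dplyr)
library(purrr)

my_func <- function(x) {
  if (x > 0) return(sqrt(x))
  stop("x must be positive")
}

# NA_real_ keeps the fallback the same type as sqrt()'s result.
try_my_func_dbl <- function(x) {
  tryCatch(my_func(x), error = function(err) NA_real_)
}

data <- data.frame(x = c(1, -1, 4, 9))

res <- data %>% mutate(y = map_dbl(x, try_my_func_dbl))
res
is.numeric(res$y)  # y is an ordinary double vector, not a list
```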


gg58donl3#

Using rowwise you can avoid vectorization and call tryCatch directly on the function. I don't know how well this generalizes, though.

library(dplyr) |> suppressMessages()

data <- data.frame(x = c(1, -1, 4, 9))

my_func <- function(x){
  if(x > 0) return(sqrt(x))
  stop('x must be positive')
}

data |> 
  rowwise() |> 
  mutate(y = tryCatch(my_func(x), error = function(e) NA))
#> # A tibble: 4 × 2
#> # Rowwise: 
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2    -1    NA
#> 3     4     2
#> 4     9     3

Created on 2023-09-19 with reprex v2.0.2
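
As one data point on generalization, the same pattern seems to carry over to functions of several columns, since rowwise() evaluates the expression once per row. The function root_of_ratio and the data frame data2 below are hypothetical, invented only for this sketch.

```r
library(dplyr)

# Hypothetical two-argument function that can fail in two different ways.
root_of_ratio <- function(a, b) {
  if (b == 0) stop("b must be non-zero")
  if (a / b < 0) stop("ratio must be non-negative")
  sqrt(a / b)
}

data2 <- data.frame(a = c(4, 1, -9, 16), b = c(1, 0, 1, 4))

res <- data2 %>%
  rowwise() %>%
  # a and b are scalars here, so a failure only affects the current row
  mutate(y = tryCatch(root_of_ratio(a, b), error = function(e) NA_real_)) %>%
  ungroup()

res
```

Using NA_real_ rather than NA keeps every per-row result a double; ungroup() drops the rowwise grouping so that later verbs operate on whole columns again.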


pkbketx94#

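The use of Vectorize() is somewhat awkward, because your function is not "truly" vectorized but only pretends to be. You therefore have to layer the wrappers in a different order to get the desired effect. Here is a slightly drawn-out example.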

library(dplyr)
library(purrr)

# function throwing an error
my_func <- function(x){
  if(x > 0) return(sqrt(x))
  stop('x must be positive')
}

safe_my_func <- safely(my_func, otherwise = NA_real_)
unwrapped_safe_my_func <- function(x)safe_my_func(x)$result
unwrapped_safe_my_func_vect <- Vectorize(unwrapped_safe_my_func)

# dummy data
data <- data.frame(x = c(1, -1, 4, 9))
data %>% mutate(y = unwrapped_safe_my_func_vect(x))

# or fully inlined : 
data %>% mutate(y = Vectorize(\(x)safely(my_func, otherwise = NA_real_)(x)$result)(x))
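
For contrast, this is roughly what a "truly" vectorized version could look like. It only applies if you are able to rewrite the function yourself, which the question says is not the case for a package function, so treat it as a sketch of the distinction rather than a fix. The name my_func_na is made up.

```r
library(dplyr)

# Works on the whole vector at once and returns NA instead of raising an error.
my_func_na <- function(x) {
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x) & x > 0      # positions where sqrt() is allowed
  out[ok] <- sqrt(x[ok])
  out
}

data <- data.frame(x = c(1, -1, 4, 9))
res <- data %>% mutate(y = my_func_na(x))
res
```

Because nothing is evaluated element by element, neither Vectorize() nor map() is needed here.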

