R dplyr: row-wise mean, minimum, or other functions?

Posted by omvjsjqw on 2022-12-20 in Other

How can I use dplyr to get the minimum (or mean) of each row of a data frame? In base R I would write:

apply(mydataframe, 1, mean) 
apply(mydataframe, 1, min)

I tried

mydataframe %>% rowwise() %>% mean

mydataframe %>% rowwise() %>% summarise(mean)

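and other combinations, but I always get errors and I don't know the correct way to do it.
I know I can also use rowMeans, but there is no simple "rowMin" equivalent. There is also the matrixStats package, but most of its functions accept only matrices, not data frames.
If I want to compute the row-wise minimum, I can use

do.call(pmin, mydataframe)

Is there anything this simple for the row-wise mean?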

do.call(mean, mydataframe)

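does not work; I suppose I would need a pmean function or something more complex.
Thanks.
So that results can be compared, we can all use the same example: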

set.seed(124)
df <- data.frame(A=rnorm(10), B=rnorm(10), C=rnorm(10))
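
For reference, the base-R approaches mentioned in the question can be sketched on this example as follows. This is only an illustrative sketch: the names row_min, row_mean and row_mean2 are made up here, and the Reduce() line is one possible stand-in for the "pmean" the question asks about, not an existing function.

```r
set.seed(124)
df <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10))

# Row-wise minimum: pmin() is vectorised across its arguments,
# so passing the columns as arguments gives one minimum per row.
row_min <- do.call(pmin, df)

# Row-wise mean: rowMeans() accepts a numeric data frame directly.
row_mean <- rowMeans(df)

# A "pmean"-style alternative (hypothetical name): add the columns
# element-wise with Reduce(), then divide by the number of columns.
row_mean2 <- Reduce(`+`, df) / ncol(df)
```

Both mean variants assume that every column is numeric and contains no missing values.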

Answer 1 (ej83mcc0)

I think this is what you are trying to achieve:

df <- data.frame(A=rnorm(10), B=rnorm(10), C=rnorm(10))

library(dplyr)
df %>% rowwise() %>% mutate(Min = min(A, B, C), Mean = mean(c(A, B, C)))

#             A          B           C        Min        Mean
# 1   1.3720142  0.2156418  0.61260582  0.2156418  0.73342060
# 2  -1.4265665 -0.2090585 -0.05978302 -1.4265665 -0.56513600
# 3   0.6801410  1.5695065 -2.70446924 -2.7044692 -0.15160724
# 4   0.0335067  0.8367425 -0.83621791 -0.8362179  0.01134377
# 5  -0.2068252 -0.2305140  0.23764322 -0.2305140 -0.06656532
# 6  -0.3571095 -0.8776854 -0.80199141 -0.8776854 -0.67892877
# 7   1.0667424 -0.6376245 -0.41189564 -0.6376245  0.00574078
# 8  -1.0003376 -1.5985281  0.90406055 -1.5985281 -0.56493504
# 9  -0.8218494  1.1100531 -1.12477401 -1.1247740 -0.27885677
# 10  0.7868666  0.6099156 -0.58994138 -0.5899414  0.26894694

Answer 2 (z9gpfhce)

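There appears to be talk that some dplyr functions, such as rowwise, could be deprecated in the long term. Functions from purrr's map family, such as pmap, can be used for this kind of calculation instead: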

library(tidyverse)

# pmap_dbl() returns a numeric vector; plain pmap() would produce a list column
df %>% mutate(Min = pmap_dbl(df, min), Mean = rowMeans(.))

#              A          B           C        Min       Mean
# 1  -1.38507062  0.3183367 -1.10363778  -1.385071 -0.7234572
# 2   0.03832318 -1.4237989  0.44418506  -1.423799 -0.3137635
# 3  -0.76303016 -0.4050909 -0.20495061 -0.7630302 -0.4576905
# 4   0.21230614  0.9953866  1.67563243  0.2123061  0.9611084
# 5   1.42553797  0.9588178 -0.13132225 -0.1313222  0.7510112
# 6   0.74447982  0.9180879 -0.19988298  -0.199883  0.4875616
# 7   0.70022940 -0.1509696  0.05491242 -0.1509696  0.2013907
# 8  -0.22935461 -1.2230688 -0.68216549  -1.223069 -0.7115296
# 9   0.19709386 -0.8688243 -0.72770415 -0.8688243 -0.4664782
# 10  1.20715377 -1.0424854 -0.86190429  -1.042485 -0.2324120

Mean is a special case (hence the base function rowMeans), because calling mean on a data.frame object has not been supported since R 3.0.
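
If you would rather stay within the pmap family for the mean as well, one option is to wrap mean() so that it receives a single vector per row. A minimal sketch, assuming every column is numeric (row_min, row_mean and res are illustrative names):

```r
library(dplyr)
library(purrr)

set.seed(124)
df <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10))

# pmap_dbl() calls the function once per row, passing the row's
# values as separate arguments, and returns a numeric vector.
row_min <- pmap_dbl(df, min)

# mean() only averages its first argument, so collect the row's
# values into a single vector with c(...) before averaging.
row_mean <- pmap_dbl(df, function(...) mean(c(...)))

res <- mutate(df, Min = row_min, Mean = row_mean)
```

The wrapper is needed because mean(a, b, c) does not average its three arguments, whereas min(a, b, c) does take the minimum of all of them.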


Answer 3 (rseugnpd)

How about this?

library(dplyr)
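# Transpose so that each original row becomes a column, then average every column.
# funs() is deprecated in recent dplyr versions; pass the function directly.
as.data.frame(t(mtcars)) %>%
  summarise_all(mean)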

To make the result easier to read, you can add another t() at the end:

as.data.frame(t(mtcars)) %>%
  summarise_all(mean) %>%
  t()

Answer 4 (kiayqfof)

With dplyr 1.0.0, you can use rowwise together with c_across:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(Min = min(c_across(A:C)), 
          Mean = mean(c_across(A:C)))

#       A      B       C    Min   Mean
#     <dbl>  <dbl>   <dbl>  <dbl>  <dbl>
# 1 -1.39    0.318 -1.10   -1.39  -0.723
# 2  0.0383 -1.42   0.444  -1.42  -0.314
# 3 -0.763  -0.405 -0.205  -0.763 -0.458
# 4  0.212   0.995  1.68    0.212  0.961
# 5  1.43    0.959 -0.131  -0.131  0.751
# 6  0.744   0.918 -0.200  -0.200  0.488
# 7  0.700  -0.151  0.0549 -0.151  0.201
# 8 -0.229  -1.22  -0.682  -1.22  -0.712
# 9  0.197  -0.869 -0.728  -0.869 -0.466
#10  1.21   -1.04  -0.862  -1.04  -0.232

Answer 5 (o3imoua4)

A dplyr and purrr option that lets you use select helpers:

df %>%
 mutate(Min = select(., everything()) %>% reduce(pmin),
        Max = select(., everything()) %>% reduce(pmax))

             A          B           C        Min        Max
1  -1.38507062  0.3183367 -1.10363778 -1.3850706  0.3183367
2   0.03832318 -1.4237989  0.44418506 -1.4237989  0.4441851
3  -0.76303016 -0.4050909 -0.20495061 -0.7630302 -0.2049506
4   0.21230614  0.9953866  1.67563243  0.2123061  1.6756324
5   1.42553797  0.9588178 -0.13132225 -0.1313222  1.4255380
6   0.74447982  0.9180879 -0.19988298 -0.1998830  0.9180879
7   0.70022940 -0.1509696  0.05491242 -0.1509696  0.7002294
8  -0.22935461 -1.2230688 -0.68216549 -1.2230688 -0.2293546
9   0.19709386 -0.8688243 -0.72770415 -0.8688243  0.1970939
10  1.20715377 -1.0424854 -0.86190429 -1.0424854  1.2071538

Or, since dplyr 1.0.0:

df %>%
 mutate(Min = reduce(across(everything()), pmin),
        Max = reduce(across(everything()), pmax))

Answer 6 (r7xajy2e)

I think I found a solution: just transpose your data frame:

library(dplyr)

# data_frame() is deprecated; tibble() is its replacement
x <- tibble(x = rnorm(10),
            y = rnorm(10))

# A tibble: 10 × 2
        x             y
    <dbl>         <dbl>
1  -1.1240392  0.9306028477
2  -0.8213379  0.2500495105
3  -0.8289104 -0.3693704483
4  -0.6486601 -1.1421141986
5   0.5098542 -0.3703368343
6  -0.3644690 -0.0003744377
7   0.7404057  0.1166905738
8  -0.2475214 -0.0802864865
9   0.2637841 -0.7717699521
10  1.4092874  0.2998021578

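x %>% 
  t() %>%               # rows become columns
  data.frame() %>% 
  mutate_all(min) %>%   # every column is replaced by its minimum
  unique() %>%          # collapse the now-identical rows into one
  t()                   # back to one value per original row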

         1
X1  -1.1240392
X2  -0.8213379
X3  -0.8289104
X4  -1.1421142
X5  -0.3703368
X6  -0.3644690
X7   0.1166906
X8  -0.2475214
X9  -0.7717700
X10  0.2998022

Answer 7 (q1qsirdb)

How about avoiding having to specify every column name? Like this:

set.seed(124)
df <- data.frame(A=rnorm(10), B=rnorm(10), C=rnorm(10))

library(dplyr)

df %>%
  rowwise() %>%
  mutate(Mean = mean(
    eval(
      # The snippet can also be wrapped within a function
      parse(text = sprintf("c(%s)", paste(names(.), collapse = ",")))
    )
  ),
  ArgMin = which.min(
    eval(
      parse(text = sprintf("c(%s)", paste(names(.), collapse = ",")))
    )
  ))

getColnamesExpr <- function(df_names) parse(text = sprintf("c(%s)", paste(df_names, collapse = ",")))

df %>%
  rowwise() %>%
  mutate(
    Mean = mean(eval(getColnamesExpr(names(.)))),
    argmin = which.min(eval(getColnamesExpr(names(.))))
  )
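
With dplyr 1.0.0 or later, the same goal (not typing out every column name) can probably be reached without eval(parse()) by combining c_across with a tidyselect helper. The following is a sketch under that assumption; cols and res are illustrative names. The column names are captured up front so that the newly created Mean column is not accidentally picked up when ArgMin is computed.

```r
library(dplyr)

set.seed(124)
df <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10))

# Capture the original column names before any new columns are added.
cols <- names(df)

res <- df %>%
  rowwise() %>%
  mutate(
    Mean   = mean(c_across(all_of(cols))),      # mean of the row's values
    ArgMin = which.min(c_across(all_of(cols)))  # position of the row's minimum
  ) %>%
  ungroup()  # drop the row-wise grouping afterwards
```

Here ArgMin is the position of the minimum within cols, so cols[res$ArgMin] gives the name of the column that holds each row's minimum.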
