Converting continuous variables to categorical in R

b4qexyjb  posted on 2023-09-27  in Other
for (i in names(cont)) {
  p <- findInterval(cont[[i]], c(quantile(cont[[i]], probs = 0, names = FALSE),
                                 quantile(cont[[i]], probs = 0.25, names = FALSE),
                                 quantile(cont[[i]], probs = 0.5, names = FALSE),
                                 quantile(cont[[i]], probs = 0.75, names = FALSE)))
  cont[[i]] <- data.frame(p)
}

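cont is a data frame that contains only continuous variables, and I am trying to convert them to categorical variables by binning them. Every column name shows up as p, and I cannot change it with the setNames function. When I export the file as xlsx, the rows are empty. Any suggestions for fixing the code?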

eufgjt7s #1

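It looks like you are asking a few things at once. First, your data frame has duplicated column names. You can fix that with the clean_names() function from the janitor package.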

# build example data frame
cont <-
  data.frame(
    p = c(0.731223661120701,
          1.13614456309113,0.845686635689828,2.24426793066083,
          1.59035100926571,0.157989579685988,0.545634578015312,
          -0.0454752516996225,-2.44600418511105,
          -0.381689489040635),
    p = c(0.429202843909453,
          -1.15598270882576,-2.00044876743628,0.744766670520872,
          0.198281273615171,0.720497399981831,-0.629084645203274,
          0.580137516415831,-0.629454628997864,
          1.20684741795214),
    p = c(-0.191144702428001,
          1.68148114607032,1.00579352512297,1.49795242286717,
          -1.93830111673007,0.798219596507334,-0.264470927894772,
          1.33925831403588,0.998171132878376,-0.685571102346185),
    check.names = FALSE
  )

# fix names
cont <- janitor::clean_names(cont)
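
If you would rather not add a dependency just for this step, base R can also deduplicate names. The following is a minimal sketch, not part of the original answer; the small data frame is made up for illustration, and the resulting names use a `.1`, `.2` suffix rather than janitor's `_2`, `_3` style.

```r
# Hypothetical data frame with duplicated column names (illustration only)
cont <- data.frame(
  p = c(0.7, 1.1, 0.8),
  p = c(0.4, -1.2, -2.0),
  p = c(-0.2, 1.7, 1.0),
  check.names = FALSE  # keep the duplicated names on purpose
)

# make.unique() appends .1, .2, ... to repeated names
names(cont) <- make.unique(names(cont))

names(cont)
#> [1] "p"   "p.1" "p.2"
```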

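Then you can use the cut_interval() function from the ggplot2 package to create groups from the continuous variables. Note that cut_interval() splits each variable into n groups of equal range; if you want groups with roughly equal numbers of observations, which is closer to the quantile-based grouping in your loop, ggplot2::cut_number() takes the same n argument.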

# create groups
cont_grouped <-
  cont |> 
    dplyr::mutate(
      dplyr::across(
        tidyselect::everything(),
        ~ ggplot2::cut_interval(
          ., 
          n = 5 # change this to change how many groups are created
        )
      )
    )

cont_grouped
#>                p              p_2           p_3
#> 1   (0.368,1.31]  (-0.0761,0.565] (-0.49,0.234]
#> 2   (0.368,1.31]   (-1.36,-0.718]  (0.958,1.68]
#> 3   (0.368,1.31]       [-2,-1.36]  (0.958,1.68]
#> 4    (1.31,2.24]     (0.565,1.21]  (0.958,1.68]
#> 5    (1.31,2.24]  (-0.0761,0.565] [-1.94,-1.21]
#> 6  (-0.57,0.368]     (0.565,1.21] (0.234,0.958]
#> 7   (0.368,1.31] (-0.718,-0.0761] (-0.49,0.234]
#> 8  (-0.57,0.368]     (0.565,1.21]  (0.958,1.68]
#> 9  [-2.45,-1.51] (-0.718,-0.0761]  (0.958,1.68]
#> 10 (-0.57,0.368]     (0.565,1.21] (-1.21,-0.49]

Created on 2023-09-16 with reprex v2.0.2
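
For completeness, here is a base R sketch that stays closer to the quartile grouping in the question. It is an assumption about what the asker wants, not something taken from the answer above: the example data, the helper name `to_quartile`, and the labels `Q1` to `Q4` are all made up for illustration. One likely reason every column showed up as p in the original loop is that `cont[[i]] <- data.frame(p)` stores a whole data frame (whose only column is called p) inside each column; assigning a plain vector or factor avoids that.

```r
# Hypothetical example data: three continuous columns (illustration only)
set.seed(1)
cont <- data.frame(a = rnorm(20), b = runif(20), c = rexp(20))

# Helper (hypothetical name): bin one numeric vector into quartile groups.
# cut() stops with an error if the quantile breaks are not unique,
# which can happen when a column has many tied values.
to_quartile <- function(x) {
  breaks <- quantile(x, probs = seq(0, 1, by = 0.25), names = FALSE)
  cut(x, breaks = breaks, include.lowest = TRUE,
      labels = c("Q1", "Q2", "Q3", "Q4"))
}

# Assigning into cont_binned[] keeps the data frame shape and column names
cont_binned <- cont
cont_binned[] <- lapply(cont, to_quartile)

names(cont_binned)
#> [1] "a" "b" "c"
```

Each column of `cont_binned` is an ordinary factor, so the result is a flat data frame with no nested data frames inside it.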
