R: how to collapse a data frame so that it has no holes

Posted by qlvxas9a on 2022-12-25 in Other

This is my data frame.

structure(list(INVOICE_DATE = structure(c(19205, 19205, 19205, 
19206, 19206, 19206, 19207, 19207, 19207), class = "Date"), CATEGORY = c("Accessory", 
"Concentrate", "Edible", "Accessory", "Concentrate", "Edible", 
"Accessory", "Concentrate", "Edible"), Crumble = c(NA, 47, NA, 
NA, 65, NA, NA, 85, NA), Tincture = c(NA, NA, 567, NA, NA, 1028, 
NA, NA, 830), Other = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), Battery = c(1079, 
NA, NA, 1027, NA, NA, 1148, NA, NA)), row.names = c(NA, -9L), class = c("tbl_df", 
"tbl", "data.frame"))

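I can't find the right verb or set of verbs in R to do this.
How can I remove the NA values so that the data frame collapses into a well-formed table? The CATEGORY column can be dropped, and then all the other columns can fit on a single row per invoice date, without any holes.
I can't use df %>% na.omit(), because I end up with a data frame with no observations, and the same happens if I try filter(is.na()).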
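For context, na.omit() drops every row that contains at least one NA, and here every row has several (Other is NA throughout, and each row fills only one of the value columns). The sketch below checks this, assuming the data is rebuilt as a plain data.frame with the same values as the dput() output above; the name df follows the question.

```r
# Rebuild the question's data as a plain data.frame (same values as the dput output)
df <- data.frame(
  INVOICE_DATE = as.Date(rep(19205:19207, each = 3), origin = "1970-01-01"),
  CATEGORY     = rep(c("Accessory", "Concentrate", "Edible"), times = 3),
  Crumble      = c(NA, 47, NA, NA, 65, NA, NA, 85, NA),
  Tincture     = c(NA, NA, 567, NA, NA, 1028, NA, NA, 830),
  Other        = rep(NA_real_, 9),
  Battery      = c(1079, NA, NA, 1027, NA, NA, 1148, NA, NA)
)

# Number of NA cells in each row: none of the rows is complete
na_per_row <- rowSums(is.na(df))

# na.omit() keeps only complete rows, so nothing survives
n_complete <- nrow(na.omit(df))
```

So the NA values have to be removed per column within each date, not per row.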

Answer #1 (guicsvcw)

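library(tidyr)
library(dplyr)

# df is the data frame from the question
select(df, -CATEGORY) %>%
  pivot_longer(-INVOICE_DATE) %>%   # long format: one row per date/column pair
  filter(!is.na(value)) %>%         # drop the NA cells
  pivot_wider()                     # back to wide: one row per INVOICE_DATE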

# A tibble: 3 × 4
  INVOICE_DATE Battery Crumble Tincture
  <date>         <dbl>   <dbl>    <dbl>
1 2022-08-01      1079      47      567
2 2022-08-02      1027      65     1028
3 2022-08-03      1148      85      830
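
As a possible shortening of the same idea, pivot_longer() has a values_drop_na argument, which would make the separate filter() step unnecessary. A minimal sketch, assuming the question's data is rebuilt as a plain data.frame named df; res is only an illustrative name.

```r
library(dplyr)
library(tidyr)

# Rebuild the question's data (same values as the dput output)
df <- data.frame(
  INVOICE_DATE = as.Date(rep(19205:19207, each = 3), origin = "1970-01-01"),
  CATEGORY     = rep(c("Accessory", "Concentrate", "Edible"), times = 3),
  Crumble      = c(NA, 47, NA, NA, 65, NA, NA, 85, NA),
  Tincture     = c(NA, NA, 567, NA, NA, 1028, NA, NA, 830),
  Other        = rep(NA_real_, 9),
  Battery      = c(1079, NA, NA, 1027, NA, NA, 1148, NA, NA)
)

res <- df %>%
  select(-CATEGORY) %>%
  # values_drop_na = TRUE drops the NA cells while reshaping to long format
  pivot_longer(-INVOICE_DATE, values_drop_na = TRUE) %>%
  # spell out the default column names used by pivot_longer()
  pivot_wider(names_from = name, values_from = value)
```

As in the answer above, the all-NA column Other disappears from the result, because no rows for it remain in the long format.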

Answer #2 (55ooxyrt)

Here is a solution using a grouped summarize().

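library(dplyr)

# dat is the data frame from the question (called df in the other answers)
dat %>% 
  group_by(INVOICE_DATE) %>%
  summarize(across(
    Crumble:Battery, 
    # with a length-1 test, ifelse() returns only the first non-NA value;
    # a column that is all NA within a date gives a (logical) NA
    ~ ifelse(sum(!is.na(.x)) > 0, .x[!is.na(.x)], NA)
  ))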
# A tibble: 3 × 5
  INVOICE_DATE Crumble Tincture Other Battery
  <date>         <dbl>    <dbl> <lgl>   <dbl>
1 2022-08-01        47      567 NA       1079
2 2022-08-02        65     1028 NA       1027
3 2022-08-03        85      830 NA       1148
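
The Other column comes back as <lgl> above because the bare NA in ifelse() is logical. One way to keep the column type might be to index the first non-NA value directly, since indexing an empty numeric vector with [1] yields a numeric NA. A sketch under the assumption that the question's data is rebuilt as a plain data.frame named dat; res is an illustrative name.

```r
library(dplyr)

# Rebuild the question's data (same values as the dput output)
dat <- data.frame(
  INVOICE_DATE = as.Date(rep(19205:19207, each = 3), origin = "1970-01-01"),
  CATEGORY     = rep(c("Accessory", "Concentrate", "Edible"), times = 3),
  Crumble      = c(NA, 47, NA, NA, 65, NA, NA, 85, NA),
  Tincture     = c(NA, NA, 567, NA, NA, 1028, NA, NA, 830),
  Other        = rep(NA_real_, 9),
  Battery      = c(1079, NA, NA, 1027, NA, NA, 1148, NA, NA)
)

res <- dat %>%
  group_by(INVOICE_DATE) %>%
  # keep the first non-NA value per date; an all-NA group yields NA_real_
  summarize(across(Crumble:Battery, ~ .x[!is.na(.x)][1]))
```

Like the ifelse() version, this keeps only the first non-NA value if a date has more than one in the same column.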

Answer #3 (0pizxfdo)

Using colSums() with by() on the invoice date.

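# df[-2] drops CATEGORY; colSums(na.rm=TRUE) sums each column within a date,
# so a column that is all NA (Other) becomes 0. Needs R >= 4.1 for \(x) and |>.
by(df[-2], df$INVOICE_DATE, \(x) 
   data.frame(INVOICE_DATE=x[1, 1], t(colSums(x[2:5], na.rm=TRUE)))) |>
  do.call(what=rbind)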
#            INVOICE_DATE Crumble Tincture Other Battery
# 2022-08-01   2022-08-01      47      567     0    1079
# 2022-08-02   2022-08-02      65     1028     0    1027
# 2022-08-03   2022-08-03      85      830     0    1148
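
Note that summing with na.rm=TRUE reports 0 for Other even though no value was ever recorded. If NA should be kept there, a base R variant could use aggregate() with a function that picks the first non-NA value. This is a sketch, not the answerer's method; it assumes the question's data is rebuilt as a plain data.frame named df, and first_non_na is a hypothetical helper.

```r
# Rebuild the question's data (same values as the dput output)
df <- data.frame(
  INVOICE_DATE = as.Date(rep(19205:19207, each = 3), origin = "1970-01-01"),
  CATEGORY     = rep(c("Accessory", "Concentrate", "Edible"), times = 3),
  Crumble      = c(NA, 47, NA, NA, 65, NA, NA, 85, NA),
  Tincture     = c(NA, NA, 567, NA, NA, 1028, NA, NA, 830),
  Other        = rep(NA_real_, 9),
  Battery      = c(1079, NA, NA, 1027, NA, NA, 1148, NA, NA)
)

# Hypothetical helper: first non-NA element, or NA if there is none
first_non_na <- function(x) x[!is.na(x)][1]

res <- aggregate(
  df[c("Crumble", "Tincture", "Other", "Battery")],
  by  = list(INVOICE_DATE = df$INVOICE_DATE),
  FUN = first_non_na
)
```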
