I am working with survey data that has 250 columns. A sample of the data is shown below:
q1 <- factor(c("yes",NA,"no","yes",NA,"yes","no","yes"))
q2 <- factor(c("Albania","USA","Albania","Albania","UK",NA,"UK","Albania"))
q3 <- factor(c(0,1,NA,0,1,1,NA,0))
q4 <- factor(c(0,NA,NA,NA,1,NA,0,0))
q5 <- factor(c("Dont know","Prefer not to answer","Agree","Disagree",NA,"Agree","Agree",NA))
q6 <- factor(c(1,NA,3,5,800,NA,900,2))
sector <- factor(c("Energy","Water","Energy","Other","Other","Water","Transportation","Energy"))
weights <- factor(c(0.13,0.25,0.13,0.22,0.22,0.25,0.4,0.13))
data <- data.frame(q1,q2,q3,q4,q5,q6,sector,weights)
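With help from Stack Overflow, I created the following function to loop over the columns and draw a bar chart in which the x-axis shows the percentage of responses, the y-axis shows the underlying column, and the fill is the sector.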
library(dplyr)
library(ggplot2)

plot_fun <- function(variable) {
  # Number of non-missing responses, shown in the caption
  total <- sum(!is.na(data[[variable]]))
  # Share of each response within each sector
  data <- data |>
    filter(!is.na(.data[[variable]])) |>
    group_by(across(all_of(c("sector", variable)))) |>
    summarise(n = n(), .groups = "drop_last") |>
    mutate(pct = n / sum(n)) |>
    ungroup()
  ggplot(
    data = data,
    mapping = aes(fill = sector, x = pct, y = .data[[variable]])
  ) +
    geom_col(position = "dodge") +
    labs(
      y = variable, x = "Percentage of responses", fill = "Sector legend",
      caption = paste("Total =", total)
    ) +
    geom_text(
      aes(
        label = scales::percent(pct, accuracy = 0.1)
      ),
      position = position_dodge(.9), vjust = 0.5
    ) +
    scale_x_continuous(labels = function(x) paste0(x * 100)) +
    scale_fill_brewer(palette = "Accent") +
    theme_bw() +
    theme(panel.grid.major.y = element_blank())
}
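Now I would like to apply the survey weights so that the bar chart shows weighted response percentages. I tried adding weight = data$weights to mapping, but it did not work. I also tried applying the weights in the percentage calculation via summarise(n = sum(weights)), but that did not work either.
Is there a way to modify my code so that the weights are applied? Thanks in advance.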
1 Answer
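It is not clear how the weights are supposed to be applied. Here I assume you want to multiply the percentages by the weights. Note that you need to modify your data: if you want to use the weights as numeric values in a calculation, they should not be stored as a factor. In any case, include the weights in group_by so that they are carried along, then create the weighted percentage in mutate.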
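The code that accompanied this answer is not preserved here, so the following is only a minimal sketch of the steps just described, not the answerer's original code. It assumes the weights are stored as numeric, uses q1 as the example column, and introduces the names weighted and weighted_pct purely for illustration. The explicit group_by(sector) before mutate is also an assumption, made so that pct is still each response's share within its sector.

```r
library(dplyr)

# Subset of the question's sample data, with weights stored as numeric
data <- data.frame(
  q1 = factor(c("yes", NA, "no", "yes", NA, "yes", "no", "yes")),
  sector = factor(c("Energy", "Water", "Energy", "Other", "Other",
                    "Water", "Transportation", "Energy")),
  weights = c(0.13, 0.25, 0.13, 0.22, 0.22, 0.25, 0.4, 0.13)
)

# If weights is already a factor, recover the numbers via character:
# data$weights <- as.numeric(as.character(data$weights))

variable <- "q1"

weighted <- data |>
  filter(!is.na(.data[[variable]])) |>
  # Keep weights in the grouping so the column survives summarise()
  group_by(across(all_of(c("sector", "weights", variable)))) |>
  summarise(n = n(), .groups = "drop") |>
  # Share of each response within its sector, then scaled by the weight
  group_by(sector) |>
  mutate(pct = n / sum(n), weighted_pct = pct * weights) |>
  ungroup()

print(weighted)
```

Inside plot_fun, the same pipeline would replace the existing summary step, with x = weighted_pct in aes(). Note that percentages scaled this way no longer sum to 100% within a sector.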
If this does not work, please state explicitly how the weights should be used and what the final values should be.
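For comparison, a different and common reading of "weighted percentage" is the sum of weights in each sector/response cell divided by the sum of weights in the sector. This is an alternative to the multiplication described above, not the answerer's method, and the names weighted_share and w are hypothetical. It also shows one likely reason summarise(n = sum(weights)) failed in the question: sum() is not defined for factors, so the weights have to be converted first.

```r
library(dplyr)

# Subset of the question's sample data, with weights as a factor as in the question
data <- data.frame(
  q1 = factor(c("yes", NA, "no", "yes", NA, "yes", "no", "yes")),
  sector = factor(c("Energy", "Water", "Energy", "Other", "Other",
                    "Water", "Transportation", "Energy")),
  weights = factor(c(0.13, 0.25, 0.13, 0.22, 0.22, 0.25, 0.4, 0.13))
)

# Convert via character; as.numeric() on a factor alone returns level codes
data$weights <- as.numeric(as.character(data$weights))

weighted_share <- data |>
  filter(!is.na(q1)) |>
  group_by(sector, q1) |>
  # Weighted count: total weight of the respondents in each cell
  summarise(w = sum(weights), .groups = "drop_last") |>
  # Still grouped by sector here, so shares sum to 1 within each sector
  mutate(pct = w / sum(w)) |>
  ungroup()

print(weighted_share)
```

In this sample the weights are constant within each sector, so the weighted shares equal the unweighted ones; they differ only when respondents in the same sector carry different weights.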