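I am trying to use the survey package to create and view the weight for each observation. I have data of the following form (simplified example):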
# Create data
set.seed(12345)
preYear = c(0:100)
preYear = sample(preYear, 100, replace = TRUE)
income = c(0:100000)
income = sample(income, 100, replace = TRUE)
gender = c("Male", "Female")
gender = sample(gender, 100, replace = TRUE)
gender = as.numeric(factor(gender))
ethnicity = c("White", "African_American", "Mixed_Ethnicity", "Other_Ethnicity")
ethnicity = sample(ethnicity, 100, replace = TRUE)
ethnicity = as.numeric(factor(ethnicity))
postYear = preYear + 10
data = cbind(preYear, income, gender, ethnicity, postYear)
data = as.data.frame(data)
Using the survey package, I weight by gender:
library(survey)
data.svy.unweighted <- svydesign(ids=~1, data=data)
#
gender.dist <- data.frame(gender = c("1", "2"),
                          Freq = nrow(data) * c(0.45, 0.55))
data.svy.rake <- rake(design = data.svy.unweighted,
                      sample.margins = list(~gender),
                      population.margins = list(gender.dist))
data.svy.rake
Independent Sampling design (with replacement)
rake(design = data.svy.unweighted, sample.margins = list(~gender),
population.margins = list(gender.dist))
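However, I don't know how to view the vector of weights. Ideally, I would like to get back a data.table identical to data, but with an extra column named weight that holds the weight assigned to each observation after weighting on gender. Any help would be greatly appreciated.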
1 Answer
The weights() function returns the weights of a survey design object. The model.frame() function returns the variables, so you can combine the two.
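As a minimal sketch of that combination, the example below rebuilds a small stand-in for the question's data, rakes on gender, and binds the weights onto the variables. The names design, raked and weighted are illustrative and not from the question. Passing weights = ~1 to svydesign() is an assumption made here to state the equal starting weights explicitly.

```r
library(survey)

# Small stand-in for the question's data: gender coded 1/2 as in the original
set.seed(12345)
data <- data.frame(
  income = sample(0:100000, 100, replace = TRUE),
  gender = sample(1:2, 100, replace = TRUE)
)

# Equal-probability design; weights = ~1 (an assumption of this sketch)
# gives every row a starting weight of 1
design <- svydesign(ids = ~1, weights = ~1, data = data)

# Target gender margin: 45% / 55% of the sample size, as in the question
gender.dist <- data.frame(gender = c("1", "2"),
                          Freq = nrow(data) * c(0.45, 0.55))

raked <- rake(design = design,
              sample.margins = list(~gender),
              population.margins = list(gender.dist))

# model.frame() returns the design's variables, weights() the raked weights
weighted <- cbind(model.frame(raked), weight = weights(raked))
head(weighted)
```

If a data.table is needed rather than a data frame, the result could then be converted with data.table::as.data.table(weighted), assuming that package is installed.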