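I am trying to apply anesrake to a data set, but I get the error "No variables are off by more than 5 percent using the method you have chosen, either weighting is unnecessary or a smaller pre-raking limit should be chosen."
I have checked that there are no empty levels, that the names match, and that my variables are factors, but it still fails and I don't know what else to try.
Note: the extra code at the start combines Package and Age into a single weighting variable (these will be interlocking weights).
The target values of 5.5555556 are only an example.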
# Load packages (dplyr provides %>%, mutate and case_when)
library(dplyr)
library(anesrake)
# Load the data set
NPSSurvey_df <- read.csv('C:/Users/andavies/Desktop/Maru_NPS_CSAT_RAWDATA_13_12_2022_F1/Test.csv')
NPSSurvey_df <- as.data.frame(NPSSurvey_df)
# Build the interlocked Age x CustomerType cell for each respondent
new_data <- NPSSurvey_df %>%
  mutate(AgePack = case_when(
    Age == '18-24' & CustomerType == 'In-Life' ~ '18-24 & In-Life',
    Age == '25-34' & CustomerType == 'In-Life' ~ '25-34 & In-Life',
    Age == '35-44' & CustomerType == 'In-Life' ~ '35-44 & In-Life',
    Age == '45-54' & CustomerType == 'In-Life' ~ '45-54 & In-Life',
    Age == '55-64' & CustomerType == 'In-Life' ~ '55-64 & In-Life',
    Age == '65 years or more' & CustomerType == 'In-Life' ~ '65+ & In-Life',
    Age == '18-24' & CustomerType == 'Lapsed' ~ '18-24 & Lapsed',
    Age == '25-34' & CustomerType == 'Lapsed' ~ '25-34 & Lapsed',
    Age == '35-44' & CustomerType == 'Lapsed' ~ '35-44 & Lapsed',
    Age == '45-54' & CustomerType == 'Lapsed' ~ '45-54 & Lapsed',
    Age == '55-64' & CustomerType == 'Lapsed' ~ '55-64 & Lapsed',
    Age == '65 years or more' & CustomerType == 'Lapsed' ~ '65+ & Lapsed',
    Age == '18-24' & CustomerType == 'New' ~ '18-24 & New',
    Age == '25-34' & CustomerType == 'New' ~ '25-34 & New',
    Age == '35-44' & CustomerType == 'New' ~ '35-44 & New',
    Age == '45-54' & CustomerType == 'New' ~ '45-54 & New',
    Age == '55-64' & CustomerType == 'New' ~ '55-64 & New',
    Age == '65 years or more' & CustomerType == 'New' ~ '65+ & New'
  ))
# Set the level order explicitly with factor(). Assigning to levels() after
# as.factor() would relabel the alphabetically sorted levels, not reorder them.
new_data$AgePack <- factor(new_data$AgePack, levels = c(
  '18-24 & In-Life', '25-34 & In-Life', '35-44 & In-Life', '45-54 & In-Life', '55-64 & In-Life', '65+ & In-Life',
  '18-24 & Lapsed', '25-34 & Lapsed', '35-44 & Lapsed', '45-54 & Lapsed', '55-64 & Lapsed', '65+ & Lapsed',
  '18-24 & New', '25-34 & New', '35-44 & New', '45-54 & New', '55-64 & New', '65+ & New'))
# Target percentages for the 18 cells (equal shares, for illustration only)
AgePack <- c(5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556,
             5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556,
             5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556, 5.555555555555556)
names(AgePack) <- c('18-24 & In-Life', '25-34 & In-Life', '35-44 & In-Life', '45-54 & In-Life', '55-64 & In-Life', '65+ & In-Life',
                    '18-24 & Lapsed', '25-34 & Lapsed', '35-44 & Lapsed', '45-54 & Lapsed', '55-64 & Lapsed', '65+ & Lapsed',
                    '18-24 & New', '25-34 & New', '35-44 & New', '45-54 & New', '55-64 & New', '65+ & New')
# anesrake expects a named list with one target vector per weighting variable
target <- list(AgePack)
names(target) <- c("AgePack")
# Rake; with type = "pctlim", a variable is only used if it is off by more than pctlim
outsave <- anesrake(target, new_data, caseid = new_data$Response_ID,
                    verbose = TRUE, cap = 5, choosemethod = "total",
                    type = "pctlim", pctlim = .05, nlim = 5,
                    iterate = TRUE, force1 = TRUE)
summary(outsave)
new_data$weightvec <- unlist(outsave[1])
1 Answer
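I believe your survey data and your reference data simply don't differ very much, i.e. the age-group breakdown in your survey is less than 5% away from the proportions you want. You may want to consider whether weighting is really necessary or, if you are set on weighting, try changing your limit. Right now you have "pctlim = 0.05", which corresponds to a minimum deviation of 5%. Depending on your needs, you could try 3% or even 1%, for example.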
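To see how far the sample actually is from the targets before picking a limit, the gap can be computed by hand. The following is a minimal base R sketch, not anesrake's own code: the cell names and counts are made up, and treating the sum of absolute gaps as the analogue of choosemethod = "total" (and the largest gap as the analogue of "max") is an assumption based on how the package documentation describes those options.

```r
# Hypothetical data: 18 weighting cells whose sample shares are close to uniform.
cells  <- sprintf("cell_%02d", 1:18)
counts <- rep(c(11, 12), each = 9)   # made-up cell counts, 207 respondents in total
sample_cells <- factor(rep(cells, times = counts), levels = cells)

# Target shares, rescaled to sum to 1 (the question gives them as percentages).
target <- setNames(rep(100 / 18, 18), cells)
target <- target / sum(target)

# Observed share of each cell, put in the same order as the targets.
observed <- prop.table(table(sample_cells))[names(target)]

# Per-cell absolute gaps and two ways of summarising them.
gap       <- abs(as.numeric(observed) - target)
total_gap <- sum(gap)   # assumed analogue of choosemethod = "total"
max_gap   <- max(gap)   # assumed analogue of choosemethod = "max"

# Check the summary against a few candidate limits.
limits  <- c(0.05, 0.03, 0.01)
exceeds <- setNames(total_gap > limits, limits)
print(round(c(total = total_gap, max = max_gap), 4))
print(exceeds)
```

With these made-up counts the total gap is about 0.043, so it stays under a 0.05 limit but exceeds 0.03 and 0.01. If your own data give a similarly small number, that is consistent with the error message, and either a lower pctlim or no weighting at all would be the reasonable choices.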