Converting Stata syntax to R syntax [closed]

s4n0splo  posted on 2022-12-20  in  Other

Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 14 hours ago.
I'm having some trouble translating some Stata code into R code:
Stata code:

gen joint_gpw = sbud_jpw * q44 if sbud_jpw < 888 & q44 < 888 

gen sbud_gpw_all = sbud_gpw if sbud_gpw < 888 

replace sbud_gpw_all = q31 if sbud_gpw_all ==. & q31 < 888 

replace sbud_gpw_all = joint_gpw if sbud_gpw_all ==. & joint_gpw !=. 

replace sbud_gpw_all = 888 if q16_1 == 0 & sbud_gpw_all ==. 

replace sbud_gpw_all = 888 if (sbud_gpw == 888 & q31 == 888 & sbud_jpw == 888 & q44 == 888) & sbud_gpw_all ==.  

replace sbud_gpw_all = 999 if (sbud_gpw == 999 | q31 == 999 | sbud_jpw == 999 | q44 == 999  | (q44 !=. & sbud_jpw == 888)) & sbud_gpw_all ==.

Here is the R code I tried:

dat%>%
  dplyr::mutate(joint_gpw = ifelse((sbud_jpw<888 & q44<888),sbud_jpw * q44,NA))%>%
  dplyr::mutate(sbud_gpw_all = ifelse(sbud_gpw < 888,sbud_gpw,NA))%>%
  dplyr::mutate(sbud_gpw_all = ifelse((sbud_gpw_all= NA & q31<888),q31,NA))%>%
  dplyr::mutate(sbud_gpw_all = ifelse((sbud_gpw_all = NA & joint_gpw != NA),joint_gpw,NA))%>%
  dplyr::mutate(sbud_gpw_all) = ifelse((q16_1 = 0 & sbud_gpw_all = NA),888,NA)%>%
  dplyr::mutate(sbud_gpw_all) = ifelse((sbud_gpw = 888 & q31 = 888 & sbud_jpw = 888 & q44 = 888) & sbud_gpw_all = NA,888,NA)%>%
  dplyr::mutate(sbud_gpw_all) = ifelse(((sbud_gpw = 999 | q31 = 999 | sbud_jpw = 999 | q44 = 999  | (q44 != NA & sbud_jpw == 888)) & sbud_gpw_all = NA)),999,NA)

The error that showed up:

Error: unexpected '=' in:
"  dplyr::mutate(sbud_gpw_all) = ifelse((q16_1 = 0 & sbud_gpw_all = NA),888,NA)%>%
  dplyr::mutate(sbud_gpw_all) = ifelse((sbud_gpw = 888 & q31 = 888 & sbud_jpw = 888 & q44 = 888) & sbud_gpw_all ="

I would like to know whether these two sets of code are equivalent. I really appreciate any help. Thank you!

yvfmudvl  #1

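The error stems from the last three lines: the closing parenthesis ) comes right after sbud_gpw_all, so the assignment ends up outside mutate(). Those lines also use = where a comparison needs ==, and the error message stops at exactly such an =.
Also, although it throws no error, every mutate() overwrites sbud_gpw_all. I don't know Stata, and you didn't provide a minimal reproducible example, but I have the feeling your code could work like this: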

library(dplyr)

dat %>%
  mutate(
    # product of the two columns, only where both are below 888
    joint_gpw = if_else(sbud_jpw < 888 & q44 < 888, sbud_jpw * q44, NA_real_),
    # conditions are checked top to bottom; the first match supplies the value
    sbud_gpw_all = case_when(
      sbud_gpw < 888 ~ sbud_gpw,
      q31 < 888 ~ q31,
      !is.na(joint_gpw) ~ joint_gpw,
      q16_1 == 0 ~ 888,
      sbud_gpw == 888 & q31 == 888 & sbud_jpw == 888 & q44 == 888 ~ 888,
      sbud_gpw == 999 | q31 == 999 | sbud_jpw == 999 | q44 == 999 | (!is.na(q44) & sbud_jpw == 888) ~ 999
    )
  )

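This first creates the column joint_gpw with dplyr::if_else(), filled in where sbud_jpw < 888 & q44 < 888. Then a set of conditions (the part before each ~) is checked in order. For each row, the first condition that matches supplies the value (the part after the ~ operator).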
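As a small illustration of that first-match rule, here is a sketch on made-up data. The data frame `toy` and its values are hypothetical; only the column names `sbud_gpw` and `q31` and the codes 888 and 999 are borrowed from the question.

```r
library(dplyr)

# Hypothetical toy data; 888 and 999 stand in for the special codes in the question.
toy <- tibble(
  sbud_gpw = c(10, 888, 888, NA),
  q31      = c(20,  30, 999, 888)
)

res <- toy %>%
  mutate(
    sbud_gpw_all = case_when(
      sbud_gpw < 888 ~ sbud_gpw,  # row 1 matches here, so q31 is ignored for it
      q31 < 888      ~ q31,       # used only by rows that did not match above
      q31 == 999     ~ 999        # rows that match no condition stay NA
    )
  )

res$sbud_gpw_all
# 10 30 999 NA
```

Row 1 also satisfies `q31 < 888`, yet it keeps the value from the first condition. Row 4 has a missing `sbud_gpw`, so its first condition evaluates to NA, which `case_when()` treats as no match.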
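Note that, as Sotos pointed out in the comments, NA in R is checked with is.na(x), not with ==/!=, because those always return NA. I omitted the NA check on most lines because it is implied by the sequential nature of case_when(): as soon as one condition matches a row, the later conditions no longer apply to it. NA_real_ is a numeric NA value; if_else() and case_when() are strict about data types, so the type of the NA has to be stated explicitly (dplyr releases before 1.1.0 reject a bare NA here).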
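The NA behaviour can be checked in a few lines of base R. The vector `x` below is a made-up example, and the comparison with Stata's `replace ... if` is only an analogy. The last line shows a conditional replace that keeps existing values, unlike `ifelse(cond, value, NA)`, which overwrites every non-matching row with NA.

```r
# Hypothetical vector with one missing value.
x <- c(5, NA, 888)

eq_na  <- x == NA     # comparing with NA yields NA for every element
is_mis <- is.na(x)    # the correct test for missingness
below  <- x < 888     # NA also propagates through ordinary comparisons

# Roughly what Stata's `replace x = 0 if x == .` does: fill only where x
# is missing and keep the existing value everywhere else.
filled <- ifelse(is.na(x), 0, x)
```

Here `eq_na` is all NA, `is_mis` flags only the second element, and `filled` keeps 5 and 888 while the missing value becomes 0.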
