5.5 Estimation of odds ratios, relative risks and risk differences

Suppose we want to describe the population relationship between two binary variables, say whether a participant experienced dry mouth in the past 12 months (the variable \(\texttt{ORH_EXP_DRM_MCQ}\)) and sex (the variable \(\texttt{SEX_ASK_TRM}\)). The following code can be used to produce estimates of odds ratios, relative risks and risk differences.
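For reference, with \(p_M\) and \(p_F\) denoting the population proportions of males and females who report dry mouth (notation introduced here only for illustration), the three measures for males versus females are conventionally defined as

\[
\mathrm{OR} = \frac{p_M/(1-p_M)}{p_F/(1-p_F)}, \qquad \mathrm{RR} = \frac{p_M}{p_F}, \qquad \mathrm{RD} = p_M - p_F .
\]

An odds ratio or relative risk of 1, or a risk difference of 0, corresponds to no difference between the two groups.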

R

The \(\texttt{R}\) \(\texttt{survey}\) package has no formal support for producing unadjusted odds ratios, relative risks or risk differences, or the related confidence intervals. For the relative risk, there is only one example, on page 103 of the \(\texttt{survey}\) manual (Lumley 2019). The following \(\texttt{R}\) code produces results similar to those of other survey packages.
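The approach rests on a standard property of binomial regression with a single binary covariate: each link function turns the slope into one of the three measures. As a sketch, write \(p(x)\) for the probability of dry mouth, with \(x = 1\) for males and \(x = 0\) for females (this coding is an assumption made for illustration; the actual reference level depends on how \(\texttt{SEX_ASK_TRM}\) is coded):

\[
\begin{aligned}
\text{logit link:} \quad & \log\frac{p(x)}{1-p(x)} = \beta_0 + \beta_1 x & \Rightarrow \quad & e^{\beta_1} = \text{odds ratio},\\
\text{log link:} \quad & \log p(x) = \beta_0 + \beta_1 x & \Rightarrow \quad & e^{\beta_1} = \text{relative risk},\\
\text{identity link:} \quad & p(x) = \beta_0 + \beta_1 x & \Rightarrow \quad & \beta_1 = \text{risk difference}.
\end{aligned}
\]

This is why the coefficient and its confidence limits are exponentiated for the first two models but not for the third.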

LogisticReg4.OR <- svyglm(ORH_EXP_DRM_MCQ ~ SEX_ASK_TRM,
                          family = quasibinomial(link = "logit"),
                          design = CLSA.design)
exp(coef(LogisticReg4.OR)[2])       ## odds ratio
exp(confint(LogisticReg4.OR)[2, ])  ## confidence interval

LogisticReg4.RR <- svyglm(ORH_EXP_DRM_MCQ ~ SEX_ASK_TRM,
                          family = quasibinomial(link = "log"),
                          design = CLSA.design)
exp(coef(LogisticReg4.RR)[2])       ## relative risk
exp(confint(LogisticReg4.RR)[2, ])  ## confidence interval

LogisticReg4.RD <- svyglm(ORH_EXP_DRM_MCQ ~ SEX_ASK_TRM,
                          family = quasibinomial(link = "identity"),
                          design = CLSA.design)
coef(LogisticReg4.RD)[2]            ## risk difference
confint(LogisticReg4.RD)[2, ]       ## confidence interval
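
As an informal check on the point estimates, the three measures can also be computed directly from a weighted two-way table. The sketch below uses base \(\texttt{R}\) only and a small made-up data frame (the names \(\texttt{toy}\), \(\texttt{sex}\), \(\texttt{drymouth}\) and \(\texttt{w}\) are hypothetical and are not CLSA variables). It gives point estimates only, not design-based standard errors or confidence intervals.

```r
# Hypothetical toy data, NOT CLSA data: one row per respondent,
# with an illustrative sampling weight w.
toy <- data.frame(
  sex      = c("M", "M", "M", "M", "F", "F", "F", "F"),
  drymouth = c("Yes", "No", "No", "No", "Yes", "Yes", "No", "No"),
  w        = c(10, 20, 30, 40, 15, 25, 35, 25)
)

# Weighted 2 x 2 table: each cell is the sum of weights in that cell.
tab <- xtabs(w ~ sex + drymouth, data = toy)

# Weighted proportion reporting dry mouth within each sex.
p_M <- as.numeric(tab["M", "Yes"] / sum(tab["M", ]))
p_F <- as.numeric(tab["F", "Yes"] / sum(tab["F", ]))

# Males versus females (females as the reference group).
odds_ratio      <- (p_M / (1 - p_M)) / (p_F / (1 - p_F))
relative_risk   <- p_M / p_F
risk_difference <- p_M - p_F

print(c(OR = odds_ratio, RR = relative_risk, RD = risk_difference))
```

With a single binary covariate, the regression point estimates would be expected to agree with table-based values of this kind, since the model simply reproduces the two weighted group proportions; the regressions are needed mainly for the design-based confidence intervals.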

SAS

We can obtain the unadjusted odds ratios, relative risks and risk differences by specifying the options \(\texttt{OR}\) and \(\texttt{RISK}\) in the \(\texttt{TABLE}\) statement. The order of the categories can be controlled with \(\texttt{PROC FORMAT}\), used together with \(\texttt{ORDER = FORMATTED}\). Usually, \(\texttt{SAS}\) provides only confidence intervals, not standard errors, for these estimates.

PROC Format; 
VALUE   $CBin  'Yes' = '1:Yes'  'No' = '2:No' ;
VALUE   $genB  'F' = '2:Female' 'M' = '1:Male';
RUN;    

PROC SURVEYFREQ data = CLSAData  ORDER = FORMATTED ; 
TABLE  SEX_ASK_TRM * ORH_EXP_DRM_MCQ / OR RISK  ;
STRATA GEOSTRAT_TRM;
WEIGHT WGHTS_INFLATION_TRM;
FORMAT ORH_EXP_DRM_MCQ $CBin. SEX_ASK_TRM $genB.;
RUN;    

SPSS

Analyze \(\rightarrow\) Complex Samples \(\rightarrow\) Crosstabs… \(\rightarrow\) Select the file “\(\texttt{CLSADesign.csaplan}\)” in the Plan panel \(\rightarrow\) click “Continue” \(\rightarrow\) select the corresponding variables to the “Rows”, “Factor” and target variable to the “Column” panels \(\rightarrow\) click “Statistics…” \(\rightarrow\) select “Confidence interval” and “Standard error” \(\rightarrow\) click “Odds Ratios”, “Risk difference” and “Relative risk” \(\rightarrow\) click “Continue” \(\rightarrow\) click “Continue” \(\rightarrow\) click “OK”.

Stata

The \(\texttt{Stata}\) \(\texttt{survey}\) commands have no formal support for producing unadjusted odds ratios, relative risks or risk differences, or their confidence limits. However, the following code produces results similar to those of other survey packages.

Odds ratio:
svy linearized: logistic ORH_EXP_DRM_MCQ SEX_ASK_TRM

Relative risk:
svy linearized: glm ORH_EXP_DRM_MCQ SEX_ASK_TRM, fam(binomial) link(log) eform

Risk difference:
svy linearized: glm ORH_EXP_DRM_MCQ SEX_ASK_TRM, fam(binomial) link(identity)

Result comparison

| | R | SAS | SPSS | Stata |
|---|---|---|---|---|
| Odds ratio (M vs F) | 0.7539 | 0.7539 | 0.7539 | 0.7539 |
| 95% lower confidence limit | 0.4309 | 0.4306 | 0.4306 | 0.4309 |
| 95% upper confidence limit | 1.3188 | 1.3199 | 1.3199 | 1.3188 |
| Relative risk (M vs F) | 0.8023 | 0.8023 | 0.8023 | 0.8023 |
| 95% lower confidence limit | 0.5188 | 0.5184 | 0.5184 | 0.5188 |
| 95% upper confidence limit | 1.2408 | 1.2417 | 1.2417 | 1.2408 |
| Risk difference (M vs F) | -0.0485 | -0.0485 | -0.0485 | -0.0485 |
| 95% lower confidence limit | -0.1447 | -0.1448 | -0.1448 | -0.1447 |
| 95% upper confidence limit | 0.0477 | 0.0478 | 0.0478 | 0.0477 |

Note:

The estimates of odds ratios, relative risks and risk differences obtained by these procedures describe the relationship between the target variable and the exposure groups in the population. The binomial regressions used here do not describe a model for the variables in the population. Thus, the results in this chapter may differ from the logistic regression results in the later section. The difference arises from the use of “analytic weights”, which are often rescaled within each stratum under stratified sampling.

Reference

Lumley, Thomas. 2019. survey: Analysis of Complex Survey Samples. https://cran.r-project.org/package=survey.