6.3 Logistic regression analysis

In the CLSA datasets, the variable “\(\texttt{ORH_EXP_DRM_MCQ}\)” indicates if the respondents experienced dry mouth in the last 12 months. We relate this variable to the age group of the respondents through a logistic regression model. The variables sex, province and education level are also included in the initial model as covariates.

R

LogitReg <- svyglm(ORH_EXP_DRM_MCQ ~ SEX_ASK_TRM + Age_group_5 + 
              Education +  WGHTS_PROV_TRM,  family = quasibinomial, 
              design = CLSA.design.anly)
summary(LogitReg)

SAS

PROC SURVEYLOGISTIC data = CLSAData ;
CLASS  ORH_EXP_DRM_MCQ SEX_ASK_TRM Age_group_5(ref = '45-48') 
       Education (ref = 'Low Education')  WGHTS_PROV_TRM(ref = 'AB')  /param = ref;                               
MODEL  ORH_EXP_DRM_MCQ(event = 'Yes')= SEX_ASK_TRM Age_group_5 Education
       WGHTS_PROV_TRM;
STRATA GEOSTRAT_TRM;
WEIGHT WGHTS_ANALYTIC_TRM;
STORE  out = LogitReg; 
RUN;

SPSS

Analyze \(\rightarrow\) Complex Samples \(\rightarrow\) Logistc Regression… \(\rightarrow\) Select “\(\texttt{CLSADesignAnyl.csaplan}\)” in the Plan panel \(\rightarrow\) click “Continue” \(\rightarrow\) select the corresponding variables to the “Dependent Variable”, “Factor” and “Covariate” panels \(\rightarrow\) click “Reference Category” and select “Lowest value” \(\rightarrow\) click “Statistics…” \(\rightarrow\) select “Estimate” and “Standard error” \(\rightarrow\) click “Continue” \(\rightarrow\) Click “Save…” \(\rightarrow\) Enter “\(\texttt{LogitReg.xml}\)” under “Export Model as XML” \(\rightarrow\) click “Continue” \(\rightarrow\) click “OK”.

Stata

svy linearized: logit ORH_EXP_DRM_MCQ i.SEX_ASK_TRM i.Age_group_5 i.Education
                     i.WGHTS_PROV_TRM  
estimates save "[Path]\LogitReg.ster", replace

Result comparison

R
SAS
SPSS
Stata
Population Est. Coeff. SE Coeff. SE Coeff. SE Coeff. SE
(Intercept) -1.2901 0.5690 -1.2901 0.5751 -1.2901 0.5690 -1.2901 0.5690
SEX_ASK_TRM=“M” -0.1740 0.2567 -0.1740 0.2594 -0.1740 0.2567 -0.1740 0.2567
Age Groups: relative to Age_Gpr0: Age 45-48
Age_Gpr1:Age 49-54 0.1458 0.4669 0.1458 0.4720 0.1458 0.4669 0.1458 0.4669
Age_Gpr2:Age 55-64 0.2541 0.4451 0.2541 0.4499 0.2541 0.4451 0.2541 0.4451
Age_Gpr3:Age 65-74 -0.3279 0.4895 -0.3279 0.4947 -0.3279 0.4895 -0.3279 0.4895
Age_Gpr4:Age 75+ 0.3194 0.4821 0.3194 0.4874 0.3194 0.4821 0.3194 0.4821
Education Levels: relative to Lower Education
Medium Education -0.6017 0.3757 -0.6017 0.3797 -0.6017 0.3757 -0.6017 0.3757
Higher Education lower -0.5422 0.4272 -0.5422 0.4318 -0.5422 0.4272 -0.5422 0.4272
Higher Education upper -0.4210 0.3804 -0.4210 0.3845 -0.4210 0.3804 -0.4210 0.3804
Provinces: relative to Alberta
British Columbia 0.8768 0.4762 0.8768 0.4813 0.8768 0.4762 0.8768 0.4762
Manitoba 0.8336 0.5594 0.8336 0.5654 0.8336 0.5594 0.8336 0.5594
New Brunswick -0.2646 0.6567 -0.2646 0.6638 -0.2646 0.6567 -0.2646 0.6567
Newfoundland & Labrador -0.6137 0.6589 -0.6135 0.6660 -0.6137 0.6589 -0.6137 0.6589
Nova Scotia 1.0726 0.5857 1.0726 0.5920 1.0726 0.5857 1.0726 0.5857
Ontario 0.0504 0.4219 0.0504 0.4265 0.0504 0.4219 0.0504 0.4219
Prince Edward Island 1.2012 0.5425 1.2012 0.5484 1.2012 0.5425 1.2012 0.5425
Quebec 0.6766 0.4635 0.6766 0.4685 0.6766 0.4635 0.6766 0.4635
Saskatchewan 0.6629 0.5085 0.6629 0.5140 0.6629 0.5085 0.6629 0.5085

The fitted model can also be used to predict the probabilities of the response categories at a new data entry. We take the first entry of the dataset as an example for illustration and compare the predicted probability of “YES” and confidence intervals.

R

Pred2<-predict(LogitReg, newdata = CLSAData[1, ], 
               type = "link", se.fit = TRUE)
plogis(Pred2[1]); plogis(confint(Pred2))
predict(LogitReg, newdata = CLSAData[1,], type = "response" ,
        se.fit = TRUE) 

SAS

PROC PLM source = LogitReg ALPHA = 0.05 ;
SCORE data = CLSAData(obs = 1) out = testout predicted STDERR LCLM UCLM /ilink;
RUN;    

SPSS

Open a new dataset \(\rightarrow\) Utilities \(\rightarrow\) Scoring Wizard \(\rightarrow\) Click “Browse” and enter the corresponding path and select “\(\texttt{LogitReg.xml}\)\(\rightarrow\) Click “Next >” \(\rightarrow\) Match the variable with model fields \(\rightarrow\) Click “Next >” \(\rightarrow\) Enter a value for selected probability \(\rightarrow\) Click “Next >” and Click “Finish”.

Stata

estimates use "[Path]\LogitReg.ster"
predict  p2  if entity_id == 724976 ,  xb
predict  se2 if entity_id == 724976 ,  stdp 
display invlogit(p2)
display invlogit(p2 - invnormal(0.975)*se2)
display invlogit(p2 + invnormal(0.975)*se2)

Result comparison

R SAS SPSS Stata
Predicted value 0.2899 0.2899 0.2899 0.2899
Standard error 0.0914 0.0924 Not provided Not provided
95% lower confidence limit 0.1461 0.1447 Not provided 0.1461
95% upper confidence limit 0.4936 0.4963 Not provided 0.4936