6.3 Logistic regression analysis
In the CLSA datasets, the variable “\(\texttt{ORH_EXP_DRM_MCQ}\)” indicates if the respondents experienced dry mouth in the last 12 months. We relate this variable to the age group of the respondents through a logistic regression model. The variables sex, province and education level are also included in the initial model as covariates.
R
SAS
PROC SURVEYLOGISTIC data = CLSAData ;
CLASS ORH_EXP_DRM_MCQ SEX_ASK_TRM Age_group_5(ref = '45-48')
Education (ref = 'Low Education') WGHTS_PROV_TRM(ref = 'AB') /param = ref;
MODEL ORH_EXP_DRM_MCQ(event = 'Yes')= SEX_ASK_TRM Age_group_5 Education
WGHTS_PROV_TRM;
STRATA GEOSTRAT_TRM;
WEIGHT WGHTS_ANALYTIC_TRM;
STORE out = LogitReg;
RUN;
SPSS
Analyze \(\rightarrow\) Complex Samples \(\rightarrow\) Logistc Regression… \(\rightarrow\) Select “\(\texttt{CLSADesignAnyl.csaplan}\)” in the Plan panel \(\rightarrow\) click “Continue” \(\rightarrow\) select the corresponding variables to the “Dependent Variable”, “Factor” and “Covariate” panels \(\rightarrow\) click “Reference Category” and select “Lowest value” \(\rightarrow\) click “Statistics…” \(\rightarrow\) select “Estimate” and “Standard error” \(\rightarrow\) click “Continue” \(\rightarrow\) Click “Save…” \(\rightarrow\) Enter “\(\texttt{LogitReg.xml}\)” under “Export Model as XML” \(\rightarrow\) click “Continue” \(\rightarrow\) click “OK”.
Stata
Result comparison
Population Est. | Coeff. | SE | Coeff. | SE | Coeff. | SE | Coeff. | SE |
---|---|---|---|---|---|---|---|---|
(Intercept) | -1.2901 | 0.5690 | -1.2901 | 0.5751 | -1.2901 | 0.5690 | -1.2901 | 0.5690 |
SEX_ASK_TRM=“M” | -0.1740 | 0.2567 | -0.1740 | 0.2594 | -0.1740 | 0.2567 | -0.1740 | 0.2567 |
Age Groups: relative to Age_Gpr0: Age 45-48 | ||||||||
Age_Gpr1:Age 49-54 | 0.1458 | 0.4669 | 0.1458 | 0.4720 | 0.1458 | 0.4669 | 0.1458 | 0.4669 |
Age_Gpr2:Age 55-64 | 0.2541 | 0.4451 | 0.2541 | 0.4499 | 0.2541 | 0.4451 | 0.2541 | 0.4451 |
Age_Gpr3:Age 65-74 | -0.3279 | 0.4895 | -0.3279 | 0.4947 | -0.3279 | 0.4895 | -0.3279 | 0.4895 |
Age_Gpr4:Age 75+ | 0.3194 | 0.4821 | 0.3194 | 0.4874 | 0.3194 | 0.4821 | 0.3194 | 0.4821 |
Education Levels: relative to Lower Education | ||||||||
Medium Education | -0.6017 | 0.3757 | -0.6017 | 0.3797 | -0.6017 | 0.3757 | -0.6017 | 0.3757 |
Higher Education lower | -0.5422 | 0.4272 | -0.5422 | 0.4318 | -0.5422 | 0.4272 | -0.5422 | 0.4272 |
Higher Education upper | -0.4210 | 0.3804 | -0.4210 | 0.3845 | -0.4210 | 0.3804 | -0.4210 | 0.3804 |
Provinces: relative to Alberta | ||||||||
British Columbia | 0.8768 | 0.4762 | 0.8768 | 0.4813 | 0.8768 | 0.4762 | 0.8768 | 0.4762 |
Manitoba | 0.8336 | 0.5594 | 0.8336 | 0.5654 | 0.8336 | 0.5594 | 0.8336 | 0.5594 |
New Brunswick | -0.2646 | 0.6567 | -0.2646 | 0.6638 | -0.2646 | 0.6567 | -0.2646 | 0.6567 |
Newfoundland & Labrador | -0.6137 | 0.6589 | -0.6135 | 0.6660 | -0.6137 | 0.6589 | -0.6137 | 0.6589 |
Nova Scotia | 1.0726 | 0.5857 | 1.0726 | 0.5920 | 1.0726 | 0.5857 | 1.0726 | 0.5857 |
Ontario | 0.0504 | 0.4219 | 0.0504 | 0.4265 | 0.0504 | 0.4219 | 0.0504 | 0.4219 |
Prince Edward Island | 1.2012 | 0.5425 | 1.2012 | 0.5484 | 1.2012 | 0.5425 | 1.2012 | 0.5425 |
Quebec | 0.6766 | 0.4635 | 0.6766 | 0.4685 | 0.6766 | 0.4635 | 0.6766 | 0.4635 |
Saskatchewan | 0.6629 | 0.5085 | 0.6629 | 0.5140 | 0.6629 | 0.5085 | 0.6629 | 0.5085 |
The fitted model can also be used to predict the probabilities of the response categories at a new data entry. We take the first entry of the dataset as an example for illustration and compare the predicted probability of “YES” and confidence intervals.
R
SAS
PROC PLM source = LogitReg ALPHA = 0.05 ;
SCORE data = CLSAData(obs = 1) out = testout predicted STDERR LCLM UCLM /ilink;
RUN;
SPSS
Open a new dataset \(\rightarrow\) Utilities \(\rightarrow\) Scoring Wizard \(\rightarrow\) Click “Browse” and enter the corresponding path and select “\(\texttt{LogitReg.xml}\)” \(\rightarrow\) Click “Next >” \(\rightarrow\) Match the variable with model fields \(\rightarrow\) Click “Next >” \(\rightarrow\) Enter a value for selected probability \(\rightarrow\) Click “Next >” and Click “Finish”.
Stata
Result comparison
R | SAS | SPSS | Stata | |
---|---|---|---|---|
Predicted value | 0.2899 | 0.2899 | 0.2899 | 0.2899 |
Standard error | 0.0914 | 0.0924 | Not provided | Not provided |
95% lower confidence limit | 0.1461 | 0.1447 | Not provided | 0.1461 |
95% upper confidence limit | 0.4936 | 0.4963 | Not provided | 0.4936 |