6.3 Logistic regression analysis
In the CLSA datasets, the variable “” indicates if the respondents experienced dry mouth in the last 12 months. We relate this variable to the age group of the respondents through a logistic regression model. The variables sex, province and education level are also included in the initial model as covariates.
R
LogitReg <- svyglm(ORH_EXP_DRM_MCQ ~ SEX_ASK_TRM + Age_group_5 +
Education + WGHTS_PROV_TRM, family = quasibinomial,
design = CLSA.design.anly)
summary(LogitReg)
SAS
PROC SURVEYLOGISTIC data = CLSAData ;
CLASS ORH_EXP_DRM_MCQ SEX_ASK_TRM Age_group_5(ref = '45-48')
Education (ref = 'Low Education') WGHTS_PROV_TRM(ref = 'AB') /param = ref;
MODEL ORH_EXP_DRM_MCQ(event = 'Yes')= SEX_ASK_TRM Age_group_5 Education
WGHTS_PROV_TRM;
STRATA GEOSTRAT_TRM;
WEIGHT WGHTS_ANALYTIC_TRM;
STORE out = LogitReg;
RUN;
SPSS
Analyze Complex Samples Logistc Regression… Select “” in the Plan panel click “Continue” select the corresponding variables to the “Dependent Variable”, “Factor” and “Covariate” panels click “Reference Category” and select “Lowest value” click “Statistics…” select “Estimate” and “Standard error” click “Continue” Click “Save…” Enter “” under “Export Model as XML” click “Continue” click “OK”.
Stata
svy linearized: logit ORH_EXP_DRM_MCQ i.SEX_ASK_TRM i.Age_group_5 i.Education
i.WGHTS_PROV_TRM
estimates save "[Path]\LogitReg.ster", replace
Result comparison
Population Est. | Coeff. | SE | Coeff. | SE | Coeff. | SE | Coeff. | SE |
---|---|---|---|---|---|---|---|---|
(Intercept) | -1.2901 | 0.5690 | -1.2901 | 0.5751 | -1.2901 | 0.5690 | -1.2901 | 0.5690 |
SEX_ASK_TRM=“M” | -0.1740 | 0.2567 | -0.1740 | 0.2594 | -0.1740 | 0.2567 | -0.1740 | 0.2567 |
Age Groups: relative to Age_Gpr0: Age 45-48 | ||||||||
Age_Gpr1:Age 49-54 | 0.1458 | 0.4669 | 0.1458 | 0.4720 | 0.1458 | 0.4669 | 0.1458 | 0.4669 |
Age_Gpr2:Age 55-64 | 0.2541 | 0.4451 | 0.2541 | 0.4499 | 0.2541 | 0.4451 | 0.2541 | 0.4451 |
Age_Gpr3:Age 65-74 | -0.3279 | 0.4895 | -0.3279 | 0.4947 | -0.3279 | 0.4895 | -0.3279 | 0.4895 |
Age_Gpr4:Age 75+ | 0.3194 | 0.4821 | 0.3194 | 0.4874 | 0.3194 | 0.4821 | 0.3194 | 0.4821 |
Education Levels: relative to Lower Education | ||||||||
Medium Education | -0.6017 | 0.3757 | -0.6017 | 0.3797 | -0.6017 | 0.3757 | -0.6017 | 0.3757 |
Higher Education lower | -0.5422 | 0.4272 | -0.5422 | 0.4318 | -0.5422 | 0.4272 | -0.5422 | 0.4272 |
Higher Education upper | -0.4210 | 0.3804 | -0.4210 | 0.3845 | -0.4210 | 0.3804 | -0.4210 | 0.3804 |
Provinces: relative to Alberta | ||||||||
British Columbia | 0.8768 | 0.4762 | 0.8768 | 0.4813 | 0.8768 | 0.4762 | 0.8768 | 0.4762 |
Manitoba | 0.8336 | 0.5594 | 0.8336 | 0.5654 | 0.8336 | 0.5594 | 0.8336 | 0.5594 |
New Brunswick | -0.2646 | 0.6567 | -0.2646 | 0.6638 | -0.2646 | 0.6567 | -0.2646 | 0.6567 |
Newfoundland & Labrador | -0.6137 | 0.6589 | -0.6135 | 0.6660 | -0.6137 | 0.6589 | -0.6137 | 0.6589 |
Nova Scotia | 1.0726 | 0.5857 | 1.0726 | 0.5920 | 1.0726 | 0.5857 | 1.0726 | 0.5857 |
Ontario | 0.0504 | 0.4219 | 0.0504 | 0.4265 | 0.0504 | 0.4219 | 0.0504 | 0.4219 |
Prince Edward Island | 1.2012 | 0.5425 | 1.2012 | 0.5484 | 1.2012 | 0.5425 | 1.2012 | 0.5425 |
Quebec | 0.6766 | 0.4635 | 0.6766 | 0.4685 | 0.6766 | 0.4635 | 0.6766 | 0.4635 |
Saskatchewan | 0.6629 | 0.5085 | 0.6629 | 0.5140 | 0.6629 | 0.5085 | 0.6629 | 0.5085 |
The fitted model can also be used to predict the probabilities of the response categories at a new data entry. We take the first entry of the dataset as an example for illustration and compare the predicted probability of “YES” and confidence intervals.
R
Pred2<-predict(LogitReg, newdata = CLSAData[1, ],
type = "link", se.fit = TRUE)
plogis(Pred2[1]); plogis(confint(Pred2))
predict(LogitReg, newdata = CLSAData[1,], type = "response" ,
se.fit = TRUE)
SAS
PROC PLM source = LogitReg ALPHA = 0.05 ;
SCORE data = CLSAData(obs = 1) out = testout predicted STDERR LCLM UCLM /ilink;
RUN;
SPSS
Open a new dataset Utilities Scoring Wizard Click “Browse” and enter the corresponding path and select “” Click “Next >” Match the variable with model fields Click “Next >” Enter a value for selected probability Click “Next >” and Click “Finish”.
Stata
estimates use "[Path]\LogitReg.ster"
predict p2 if entity_id == 724976 , xb
predict se2 if entity_id == 724976 , stdp
display invlogit(p2)
display invlogit(p2 - invnormal(0.975)*se2)
display invlogit(p2 + invnormal(0.975)*se2)
Result comparison
R | SAS | SPSS | Stata | |
---|---|---|---|---|
Predicted value | 0.2899 | 0.2899 | 0.2899 | 0.2899 |
Standard error | 0.0914 | 0.0924 | Not provided | Not provided |
95% lower confidence limit | 0.1461 | 0.1447 | Not provided | 0.1461 |
95% upper confidence limit | 0.4936 | 0.4963 | Not provided | 0.4936 |