6.5 Ordinal logistic regression analysis
Many variables in complex surveys are ordinal: the response categories have a well-defined order. Ordinal logistic regression models the relationship between the cumulative category probabilities and the covariates of interest. In the CLSA Tracking cohort, participants are asked the following question:
How do you feel about your local area, that is, everywhere within a 20 minute walk or about a kilometer from your home? Please tell me how strongly you agree or disagree with the following statements: People would be afraid to walk alone after dark in local area.
There are four response levels: Strongly Agree (\(d=1\)), Agree (\(d=2\)), Disagree (\(d=3\)), and Strongly Disagree (\(d=4\)). The responses are stored in the variable \(\texttt{ENV_AFRDWLK_MCQ}\). Suppose we are interested in how the response category of \(\texttt{ENV_AFRDWLK_MCQ}\) relates to sex and age, adjusting for education and province. The cumulative logistic regression model has the form
\[\begin{align*} g(\Pr(Y<d \mid \pmb x)) = \alpha_d + \pmb{x}' \pmb \beta, & \;\;\; \mbox{(for } \texttt{SAS}\mbox{)}\\ g(\Pr(Y<d \mid \pmb x)) = \alpha_d - \pmb{x}' \pmb \eta, & \;\;\; \mbox{(for }\texttt{R}, \texttt{SPSS} \mbox{ and }\texttt{Stata} \mbox{)} \end{align*}\]
with \(d=2, 3, 4\), where \(g(\cdot)\) is a link function. We usually specify \(g(\cdot)\) as the logit function, which gives the cumulative logit model (also known as the proportional odds model). The two parameterizations describe the same model with \(\pmb \beta = -\pmb \eta\), so for comparison purposes we multiply the \(\texttt{SAS}\) coefficient estimates for \(\pmb \beta\) by \(-1\).
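The sign convention can be checked numerically. The following is a minimal sketch, not output from any of the packages: the cutpoints and the linear predictor are hypothetical values chosen only for illustration, and the helper names `expit` and `category_probs` are introduced here. It converts the cumulative probabilities \(\Pr(Y<d \mid \pmb x)\) into category probabilities under both parameterizations and shows that they agree when \(\pmb \beta = -\pmb \eta\).

```python
import math

def expit(z):
    """Inverse logit: maps a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(cum_probs):
    """Turn Pr(Y < d), d = 2..D, into Pr(Y = d), d = 1..D."""
    bounds = [0.0] + list(cum_probs) + [1.0]
    return [hi - lo for lo, hi in zip(bounds[:-1], bounds[1:])]

# Hypothetical values, chosen only for illustration (not CLSA estimates)
alphas = [-1.0, 0.5, 2.0]   # cutpoints alpha_2, alpha_3, alpha_4
x_eta = 0.8                 # linear predictor x'eta (R/SPSS/Stata convention)
x_beta = -x_eta             # same predictor under the SAS convention, beta = -eta

# R/SPSS/Stata: logit Pr(Y < d) = alpha_d - x'eta
probs_r = category_probs([expit(a - x_eta) for a in alphas])
# SAS: logit Pr(Y < d) = alpha_d + x'beta
probs_sas = category_probs([expit(a + x_beta) for a in alphas])

print([round(p, 4) for p in probs_r])
print([round(p, 4) for p in probs_sas])
```

Both lists are identical, which is why flipping the sign of the \(\texttt{SAS}\) estimates is enough to put all four packages on the same scale.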
R
SAS
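/* Design-based cumulative logit model for ENV_AFRDWLK_MCQ */
PROC SURVEYLOGISTIC data = CLSAData;
  /* Reference-cell coding with the stated reference levels */
  CLASS ENV_AFRDWLK_MCQ WGHTS_PROV_TRM(ref = 'AB') Age_group_5(ref = '45-48')
        SEX_ASK_TRM(ref = 'F') Education(ref = 'Low Education') / param = ref;
  MODEL ENV_AFRDWLK_MCQ = SEX_ASK_TRM Age_group_5 Education WGHTS_PROV_TRM / clodds;
  STRATA GEOSTRAT_TRM;        /* sampling strata */
  WEIGHT WGHTS_ANALYTIC_TRM;  /* analytic survey weights */
RUN;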
SPSS
Transform \(\rightarrow\) Recode into Different Variables \(\rightarrow\) select \(\texttt{ENV_AFRDWLK_MCQ}\) \(\rightarrow\) under “Output Variable”, enter \(\texttt{ENV_AFRDWLK_Num}\) in the “Name” field and click “Change” \(\rightarrow\) click “Old and New Values” \(\rightarrow\) enter “Strongly Agree” under “Old Value” and “1” under “New Value” in the “Value” fields, then click “Add” \(\rightarrow\) repeat the process to convert “Agree” to “2”, “Disagree” to “3” and “Strongly Disagree” to “4” \(\rightarrow\) “Continue” \(\rightarrow\) “OK”.
Analyze \(\rightarrow\) Complex Samples \(\rightarrow\) Ordinal Regression… \(\rightarrow\) in the “Plan” panel, select the file “\(\texttt{CLSADesignAnyl.csaplan}\)” \(\rightarrow\) click “Continue” \(\rightarrow\) move the corresponding variables into the “Dependent Variable”, “Factor” and “Covariate” panels \(\rightarrow\) click “Response Probabilities” and select “Accumulate from lowest value of dependent variable to highest value” \(\rightarrow\) click “Statistics…” \(\rightarrow\) select “Estimate” and “Standard error” \(\rightarrow\) click “Continue” \(\rightarrow\) click “OK”.
Stata
Result comparison
Population Estimates | R Coeff. | R SE | SAS Coeff. | SAS SE | SPSS Coeff. | SPSS SE | Stata Coeff. | Stata SE |
---|---|---|---|---|---|---|---|---|
SEX_ASK_TRM=“M” | -0.2834 | 0.1991 | -0.2834 | 0.1914 | -0.2833 | 0.1991 | -0.2833 | 0.1991 |
Age Groups: relative to Age_Gpr0: Age 45-48 | | | | | | | | |
Age_Gpr1:Age 49-54 | -1.1538 | 0.3958 | -1.1538 | 0.4419 | -1.1538 | 0.3958 | -1.1538 | 0.3958 |
Age_Gpr2:Age 55-64 | -2.3273 | 0.3670 | -2.3273 | 0.4091 | -2.3273 | 0.3670 | -2.3273 | 0.3670 |
Age_Gpr3:Age 65-74 | -3.0533 | 0.4539 | -3.0533 | 0.4509 | -3.0533 | 0.4539 | -3.0533 | 0.4539 |
Age_Gpr4:Age 75+ | -2.6253 | 0.5430 | -2.6249 | 0.4754 | -2.6253 | 0.5430 | -2.6253 | 0.5430 |
Education Levels: relative to Lower Education | | | | | | | | |
Medium Education | -0.0299 | 0.3011 | -0.0299 | 0.2956 | -0.0298 | 0.3011 | -0.0298 | 0.3011 |
Higher Education lower | -0.4574 | 0.3498 | -0.4574 | 0.3484 | -0.4574 | 0.3498 | -0.4574 | 0.3498 |
Higher Education upper | -0.1075 | 0.3200 | -0.1075 | 0.3101 | -0.1075 | 0.3200 | -0.1075 | 0.3200 |
Provinces: relative to Alberta | | | | | | | | |
British Columbia | 0.3047 | 0.4696 | 0.3047 | 0.4341 | 0.3048 | 0.4696 | 0.3048 | 0.4696 |
Manitoba | -1.0332 | 0.3975 | -1.0332 | 0.4040 | -1.0332 | 0.3975 | -1.0332 | 0.3975 |
New Brunswick | -0.1808 | 0.6405 | -0.1807 | 0.5937 | -0.1807 | 0.6405 | -0.1807 | 0.6405 |
Newfoundland & Labrador | -0.9933 | 0.5100 | -0.9932 | 0.5213 | -0.9931 | 0.5100 | -0.9931 | 0.5100 |
Nova Scotia | 0.2867 | 0.5028 | 0.2868 | 0.4749 | 0.2868 | 0.5028 | 0.2868 | 0.5028 |
Ontario | 0.2651 | 0.3650 | 0.2652 | 0.3427 | 0.2652 | 0.3650 | 0.2652 | 0.3650 |
Prince Edward Island | -0.1440 | 0.4912 | -0.1439 | 0.4795 | -0.1439 | 0.4912 | -0.1439 | 0.4912 |
Quebec | -0.4611 | 0.3718 | -0.4611 | 0.3604 | -0.4610 | 0.3718 | -0.4610 | 0.3718 |
Saskatchewan | -0.6360 | 0.4435 | -0.6359 | 0.4366 | -0.6359 | 0.4435 | -0.6359 | 0.4435 |
(Intercepts) | | | | | | | | |
Strongly Agree|Agree | -4.8973 | 0.5194 | -4.8972 | 0.5330 | -4.8971 | 0.5194 | -4.8971 | 0.5194 |
Agree|Disagree | -3.7190 | 0.4981 | -3.7189 | 0.5064 | -3.7188 | 0.4981 | -3.7188 | 0.4981 |
Disagree|Strongly Disagree | -2.1126 | 0.5161 | -2.1125 | 0.5287 | -2.1124 | 0.5161 | -2.1124 | 0.5161 |
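The coefficient for the variable “\(\texttt{SEX_ASK_TRM}\)” is negative, but it is less than 1.5 standard errors from zero (\(-0.2834\) against a standard error of about \(0.19\) to \(0.20\)). This gives little evidence that males and females differ in how they feel about the safety of walking alone after dark in the local area. The coefficients of “\(\texttt{Age_Gpr1:Age 49-54}\),” “\(\texttt{Age_Gpr2:Age 55-64}\),” “\(\texttt{Age_Gpr3:Age 65-74}\)” and “\(\texttt{Age_Gpr4:Age 75+}\)” are all negative and large relative to their standard errors. Because the model subtracts \(\pmb{x}' \pmb \eta\), a negative coefficient raises the probability of the lower categories (Strongly Agree and Agree), so older participants are more likely than the reference age group, “\(\texttt{Age_Gpr0:Age 45-48}\),” to agree that people would be afraid to walk alone after dark. The results are in line with common sense.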
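The interpretation above can be made concrete with a short calculation. This is a rough sketch only: it copies estimates from the first pair of columns of the result table, uses a normal approximation for the interval (which ignores the design-based degrees of freedom that the survey procedures may apply), and takes the reference profile to be the reference level of every factor. The variable names are introduced here for illustration.

```python
import math

def expit(z):
    """Inverse logit: maps a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Estimates copied from the first pair of columns of the result table
eta_male, se_male = -0.2834, 0.1991
eta_age_65_74 = -3.0533
alpha_agree_disagree = -3.7190   # cutpoint "Agree|Disagree", i.e. d = 3

# Ratio of the sex coefficient to its standard error, with an
# approximate 95% interval (normal approximation, an assumption here)
z_male = eta_male / se_male
ci_male = (eta_male - 1.96 * se_male, eta_male + 1.96 * se_male)

# With logit Pr(Y < d) = alpha_d - x'eta, the odds of falling in a lower
# (more agreeing) category are multiplied by exp(-eta)
or_age_65_74 = math.exp(-eta_age_65_74)

# Pr(Strongly Agree or Agree) = Pr(Y < 3): reference profile versus the same
# profile aged 65-74
p_ref = expit(alpha_agree_disagree)
p_65_74 = expit(alpha_agree_disagree - eta_age_65_74)

print(f"z (male) = {z_male:.2f}, approx. 95% CI = ({ci_male[0]:.3f}, {ci_male[1]:.3f})")
print(f"odds ratio, age 65-74 vs 45-48 = {or_age_65_74:.1f}")
print(f"Pr(agree): reference = {p_ref:.3f}, age 65-74 = {p_65_74:.3f}")
```

Under these assumptions, the interval for the sex coefficient covers zero, while the odds of giving a more agreeing response are roughly 21 times higher for the 65 to 74 age group than for the reference group.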