4.1 R preparation
In R, the command \(\texttt{read.csv}\) is used for importing datasets from CSV files:
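The import call itself is not reproduced here. A minimal sketch of the step, assuming a small toy file with made-up values in place of a real CLSA extract (the file name and the numbers are hypothetical; only the column names come from the text):

```r
# Toy stand-in for a CLSA extract; all values are made up for illustration.
toy <- data.frame(
  entity_id           = 1:4,
  GEOSTRAT_TRM        = c("A", "A", "B", "B"),
  WGHTS_INFLATION_TRM = c(120.5, 98.2, 210.0, 175.3)
)

# Hypothetical file location; replace with the path to the real extract.
csv_path <- file.path(tempdir(), "clsa_toy.csv")
write.csv(toy, csv_path, row.names = FALSE)

# Import the CSV file into a data frame.
CLSAData <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)
str(CLSAData)
```

With a real extract, only the `read.csv` line is needed, pointed at the actual file.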
Then, we can specify the age groups and declare the survey design with the package \(\texttt{survey}\):
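library(survey)
# Copy the tracking-cohort strata variable under a generic name
CLSAData$StraVar <- CLSAData$GEOSTRAT_TRM
# Declare the stratified design with the tracking-cohort inflation weights
CLSA.design <- svydesign(ids = ~entity_id, strata = ~StraVar,
                         weights = ~WGHTS_INFLATION_TRM, data = CLSAData, nest = TRUE)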
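The age-group step is not shown in the code above. One way it might look is sketched below with base R's `cut`. The column name `age_years`, the new column `Age_group`, and the cut points are all assumptions made for illustration, not taken from the text. The new column would need to be created before `svydesign` is called so that it becomes part of the design object.

```r
# Toy data; `age_years` is a hypothetical column name for age in years.
CLSAData <- data.frame(age_years = c(47, 55, 64, 65, 80, 74))

# Assumed age bands: 45-54, 55-64, 65-74, 75+ (left-closed intervals).
CLSAData$Age_group <- cut(
  CLSAData$age_years,
  breaks = c(45, 55, 65, 75, Inf),
  labels = c("45-54", "55-64", "65-74", "75+"),
  right  = FALSE
)

table(CLSAData$Age_group)
```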
The option \(\texttt{weights}\) specifies the sampling weights. Here we use the inflation weights for the tracking cohort, stored in the variable “\(\texttt{WGHTS_INFLATION_TRM}\)”. Analyses of other cohorts use different weight and strata variables: for the comprehensive cohort, the inflation weights are “\(\texttt{WGHTS_INFLATION_COM}\)” and the strata variable is “\(\texttt{GEOSTRAT_COM}\)”; for the combined cohort, the inflation weights are “\(\texttt{WGHTS_INFLATION_CLSAM}\)” and the strata variable is “\(\texttt{GEOSTRAT_CLSAM}\)”.
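The cohort-specific variable names above can be kept in one lookup so that the design call does not have to be edited by hand. This is only a sketch: the list `cohort_vars` and the cohort keys are hypothetical helpers, while the variable names are those given in the text.

```r
# Hypothetical lookup from cohort to its weight and strata variable names.
cohort_vars <- list(
  tracking      = list(weights = "WGHTS_INFLATION_TRM",   strata = "GEOSTRAT_TRM"),
  comprehensive = list(weights = "WGHTS_INFLATION_COM",   strata = "GEOSTRAT_COM"),
  combined      = list(weights = "WGHTS_INFLATION_CLSAM", strata = "GEOSTRAT_CLSAM")
)

cohort <- "comprehensive"
vars   <- cohort_vars[[cohort]]

# Build one-sided formulas such as ~WGHTS_INFLATION_COM from the names.
weights_formula <- reformulate(vars$weights)
strata_formula  <- reformulate(vars$strata)

# These could then be passed to the design declaration, for example:
# svydesign(ids = ~entity_id, strata = strata_formula,
#           weights = weights_formula, data = CLSAData, nest = TRUE)
```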
By default, most proprietary statistical packages assume that strata containing a single primary sampling unit (PSU) contribute nothing to the variance (Bruin 2011). To handle such strata in R, we add the following option:
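The option itself is not reproduced in the text. A likely candidate, stated here as an assumption, is the \(\texttt{survey.lonely.psu}\) setting of the \(\texttt{survey}\) package. Its value "certainty" treats a single-PSU stratum as making no contribution to the variance, which matches the behaviour described above.

```r
# Assumed setting: single-PSU strata make no contribution to the variance.
# Other documented values include "fail" (the default), "remove",
# "adjust" and "average".
options(survey.lonely.psu = "certainty")

# Confirm the option is set.
getOption("survey.lonely.psu")
```

The setting is global for the R session, so it would be set once before any variance estimates are computed.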