4.1 R preparation
In R, the read.csv command is used for importing datasets from CSV files:
CLSAData <- read.csv("[Path]/CLSARealExample.csv", header=TRUE, sep = ",")
Then, we can specify the age groups and declare the survey design with the survey package:
library(survey)
CLSAData$StraVar <- CLSAData$GEOSTRAT_TRM
CLSA.design <- svydesign(ids = ~entity_id, strata = ~StraVar,
                         weights = ~WGHTS_INFLATION_TRM, data = CLSAData, nest = TRUE)
The weights option specifies the sampling weights. We use the inflation weights for the tracking cohort, named “WGHTS_INFLATION_TRM”. Analyses of different cohorts use different weight variables: for analyses involving the comprehensive cohort, the label for the inflation weights is “” and the label for the strata variable is “”; for analyses involving the combined cohort, the label for the inflation weights is “” and the label for the strata variable is “”.
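To make the role of the strata and weight arguments concrete, the design declaration can be tried on a small invented data set before running it on the real file. The sketch below is only illustrative: the data frame toy, the outcome y, and every value in it are hypothetical, and only the column names entity_id, GEOSTRAT_TRM and WGHTS_INFLATION_TRM are taken from the code above.

```r
library(survey)

# Hypothetical toy data standing in for the CLSA file; all values are invented.
toy <- data.frame(
  entity_id           = 1:6,
  GEOSTRAT_TRM        = c("A", "A", "A", "B", "B", "B"),
  WGHTS_INFLATION_TRM = c(100, 150, 250, 200, 100, 200),
  y                   = c(1, 0, 1, 1, 0, 0)   # hypothetical binary outcome
)
toy$StraVar <- toy$GEOSTRAT_TRM

# Same declaration as for the real data, applied to the toy data frame
toy.design <- svydesign(ids = ~entity_id, strata = ~StraVar,
                        weights = ~WGHTS_INFLATION_TRM, data = toy, nest = TRUE)

# Weighted estimates: each respondent counts as many times as its inflation weight
est_total <- svytotal(~y, toy.design)   # estimated number of people with y = 1
est_mean  <- svymean(~y, toy.design)    # estimated proportion with y = 1

print(est_total)
print(est_mean)
```

Because the inflation weights are meant to sum to the size of the represented population, sum(weights(toy.design)) is a quick check that the intended weight variable was picked up. For another cohort, the same call would be used with that cohort's weight and strata variables substituted.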
Most proprietary statistical packages assume by default that strata containing a single primary sampling unit (PSU) make no contribution to the variance (Bruin 2011). To obtain the same behavior from the survey package, we add the following option:
options(survey.lonely.psu = "certainty")
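The effect of this option can be seen on a small invented example in which one stratum contains a single PSU. This is a minimal sketch under stated assumptions: the data frame lonely, its column names, and all values are hypothetical and are not part of the CLSA data.

```r
library(survey)

# Hypothetical toy data (all values invented): stratum "C" has a single PSU.
lonely <- data.frame(
  id      = 1:5,
  stratum = c("A", "A", "B", "B", "C"),
  w       = c(10, 10, 20, 20, 30),
  y       = c(1, 2, 3, 4, 5)
)

# Treat single-PSU strata as contributing nothing to the variance,
# keeping the previous setting so it can be restored afterwards.
old <- options(survey.lonely.psu = "certainty")

lonely.design <- svydesign(ids = ~id, strata = ~stratum,
                           weights = ~w, data = lonely)

# Estimated total of y; the observation in stratum "C" still enters the
# point estimate, but only strata "A" and "B" contribute to the variance.
lonely_total <- svytotal(~y, lonely.design)
print(lonely_total)

options(old)  # restore the previous lonely-PSU setting
```

Setting the option once near the top of an analysis script, before any estimation call, keeps the treatment of single-PSU strata consistent across all estimates.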