Derive LOCF (Last Observation Carried Forward) Records

Adds LOCF records as new observations for each 'by group' when the dataset does not contain observations for missed visits/time points.

Usage

derive_locf_records(
  dataset,
  dataset_ref,
  by_vars,
  analysis_var = AVAL,
  order,
  keep_vars = NULL
)

Arguments

dataset

Input dataset

The variables specified by the by_vars, analysis_var, order, and keep_vars arguments are expected to be in the dataset.

dataset_ref

Expected observations dataset

Data frame with all the combinations of PARAMCD, PARAM, AVISIT, AVISITN, ... which are expected in the dataset is expected.

by_vars

Grouping variables

For each group defined by by_vars those observations from dataset_ref are added to the output dataset which do not have a corresponding observation in the input dataset or for which analysis_var is NA for the corresponding observation in the input dataset.

Permitted Values: list of variables created by exprs() e.g. exprs(USUBJID, VISIT)

analysis_var

Analysis variable.

Default: AVAL

Permitted Values: a variable

order

Sort order

The dataset is sorted by order before carrying the last observation forward (e.g. AVAL) within each by_vars.

For handling of NAs in sorting variables see Sort Order.

keep_vars

Variables that need carrying the last observation forward

Keep variables that need carrying the last observation forward other than analysis_var (e.g., PARAMN, VISITNUM). If by default NULL, only variables specified in by_vars and analysis_var will be populated in the newly created records.

Value

The input dataset with the new "LOCF" observations added for each by_vars. Note, a variable will only be populated in the new parameter rows if it is specified in by_vars.

Details

For each group (with respect to the variables specified for the by_vars parameter) those observations from dataset_ref are added to the output dataset

which do not have a corresponding observation in the input dataset or
for which analysis_var is NA for the corresponding observation in the input dataset.

For the new observations, analysis_var is set to the non-missing analysis_var of the previous observation in the input dataset (when sorted by order) and DTYPE is set to "LOCF".

Author

G Gayatri

Examples


library(dplyr)
library(tibble)

advs <- tribble(
  ~STUDYID,  ~USUBJID,      ~PARAMCD, ~PARAMN, ~AVAL, ~AVISITN, ~AVISIT,
  "CDISC01", "01-701-1015", "PULSE",        1,    65,        0, "BASELINE",
  "CDISC01", "01-701-1015", "DIABP",        2,    79,        0, "BASELINE",
  "CDISC01", "01-701-1015", "DIABP",        2,    80,        2, "WEEK 2",
  "CDISC01", "01-701-1015", "DIABP",        2,    NA,        4, "WEEK 4",
  "CDISC01", "01-701-1015", "DIABP",        2,    NA,        6, "WEEK 6",
  "CDISC01", "01-701-1015", "SYSBP",        3,   130,        0, "BASELINE",
  "CDISC01", "01-701-1015", "SYSBP",        3,   132,        2, "WEEK 2",
  "CDISC01", "01-701-1028", "PULSE",        1,    61,        0, "BASELINE",
  "CDISC01", "01-701-1028", "PULSE",        1,    60,        6, "WEEK 6",
  "CDISC01", "01-701-1028", "DIABP",        2,    51,        0, "BASELINE",
  "CDISC01", "01-701-1028", "DIABP",        2,    50,        2, "WEEK 2",
  "CDISC01", "01-701-1028", "DIABP",        2,    51,        4, "WEEK 4",
  "CDISC01", "01-701-1028", "DIABP",        2,    50,        6, "WEEK 6",
  "CDISC01", "01-701-1028", "SYSBP",        3,   121,        0, "BASELINE",
  "CDISC01", "01-701-1028", "SYSBP",        3,   121,        2, "WEEK 2",
  "CDISC01", "01-701-1028", "SYSBP",        3,   121,        4, "WEEK 4",
  "CDISC01", "01-701-1028", "SYSBP",        3,   121,        6, "WEEK 6"
)

# A dataset with all the combinations of PARAMCD, PARAM, AVISIT, AVISITN, ... which are expected.
advs_expected_obsv <- tribble(
  ~PARAMCD, ~AVISITN, ~AVISIT,
  "PULSE",         0, "BASELINE",
  "PULSE",         6, "WEEK 6",
  "DIABP",         0, "BASELINE",
  "DIABP",         2, "WEEK 2",
  "DIABP",         4, "WEEK 4",
  "DIABP",         6, "WEEK 6",
  "SYSBP",         0, "BASELINE",
  "SYSBP",         2, "WEEK 2",
  "SYSBP",         4, "WEEK 4",
  "SYSBP",         6, "WEEK 6"
)

derive_locf_records(
  dataset = advs,
  dataset_ref = advs_expected_obsv,
  by_vars = exprs(STUDYID, USUBJID, PARAMCD),
  order = exprs(AVISITN, AVISIT),
  keep_vars = exprs(PARAMN)
) |>
  arrange(USUBJID, PARAMCD, AVISIT)
#> # A tibble: 22 × 8
#>    STUDYID USUBJID     PARAMCD PARAMN  AVAL AVISITN AVISIT   DTYPE
#>    <chr>   <chr>       <chr>    <dbl> <dbl>   <dbl> <chr>    <chr>
#>  1 CDISC01 01-701-1015 DIABP        2    79       0 BASELINE NA   
#>  2 CDISC01 01-701-1015 DIABP        2    80       2 WEEK 2   NA   
#>  3 CDISC01 01-701-1015 DIABP        2    80       4 WEEK 4   LOCF 
#>  4 CDISC01 01-701-1015 DIABP        2    NA       4 WEEK 4   NA   
#>  5 CDISC01 01-701-1015 DIABP        2    80       6 WEEK 6   LOCF 
#>  6 CDISC01 01-701-1015 DIABP        2    NA       6 WEEK 6   NA   
#>  7 CDISC01 01-701-1015 PULSE        1    65       0 BASELINE NA   
#>  8 CDISC01 01-701-1015 PULSE        1    65       6 WEEK 6   LOCF 
#>  9 CDISC01 01-701-1015 SYSBP        3   130       0 BASELINE NA   
#> 10 CDISC01 01-701-1015 SYSBP        3   132       2 WEEK 2   NA   
#> # ℹ 12 more rows