Adds LOCF records as new observations for each 'by group' when the dataset does not contain observations for missed visits/time points.

Usage

derive_locf_records(dataset, dataset_expected_obs, by_vars, order)

Arguments

dataset

Input dataset

The variables specified by the by_vars and order arguments are expected to be in the dataset.

dataset_expected_obs

Expected observations dataset

Data frame with all the combinations of PARAMCD, PARAM, AVISIT, AVISITN, ... which are expected in the dataset.

by_vars

Grouping variables

For each group defined by by_vars, the observations from dataset_expected_obs that have no corresponding observation in the input dataset, or whose corresponding observation in the input dataset has AVAL equal to NA, are added to the output dataset. Only variables specified in by_vars will be populated in the newly created records.

order

List of variables for sorting a dataset

The dataset is sorted by order before the last observation (e.g., AVAL) is carried forward within each by group.

Value

The input dataset with the new "LOCF" observations added for each by group. Note that a variable is only populated in the new records if it is specified in by_vars.

Details

For each group (with respect to the variables specified for the by_vars parameter), those observations from dataset_expected_obs are added to the output dataset

  • which do not have a corresponding observation in the input dataset or

  • for which AVAL is NA for the corresponding observation in the input dataset.

For the new observations, AVAL is set to the non-missing AVAL of the previous observation in the input dataset (when sorted by order) and DTYPE is set to "LOCF".
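The two rules above can be illustrated with a minimal sketch written only with dplyr and tidyr. It is not the package's implementation. The toy data and the names obs, expected, to_add, new_records and result are hypothetical, and the grouping (USUBJID, PARAMCD) and sort variable (AVISITN) are assumptions chosen to mirror the example further down.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy input: one subject and one parameter.
# The WEEK 4 visit (AVISITN 4) is absent and AVAL is NA at AVISITN 6.
obs <- tibble(
  USUBJID = "01",
  PARAMCD = "DIABP",
  AVISITN = c(0, 2, 6),
  AVAL    = c(79, 80, NA)
)

# Hypothetical expected observations for the parameter.
expected <- tibble(
  PARAMCD = "DIABP",
  AVISITN = c(0, 2, 4, 6)
)

# Expected observations with no non-missing AVAL in the input:
# either the record is absent or its AVAL is NA.
to_add <- crossing(distinct(obs, USUBJID), expected) %>%
  anti_join(
    filter(obs, !is.na(AVAL)),
    by = c("USUBJID", "PARAMCD", "AVISITN")
  ) %>%
  mutate(AVAL = NA_real_, DTYPE = "LOCF")

# Stack the candidates under the non-missing input records, sort,
# and carry the last non-missing AVAL forward within each group.
new_records <- obs %>%
  filter(!is.na(AVAL)) %>%
  mutate(DTYPE = NA_character_) %>%
  bind_rows(to_add) %>%
  arrange(USUBJID, PARAMCD, AVISITN) %>%
  group_by(USUBJID, PARAMCD) %>%
  fill(AVAL, .direction = "down") %>%
  ungroup() %>%
  filter(!is.na(DTYPE))

# The new records are appended to the unchanged input.
result <- obs %>%
  mutate(DTYPE = NA_character_) %>%
  bind_rows(new_records) %>%
  arrange(USUBJID, PARAMCD, AVISITN)
```

In this sketch new_records holds two rows, for AVISITN 4 and 6, both with AVAL 80 carried forward from AVISITN 2 and DTYPE "LOCF". The original record with AVAL NA at AVISITN 6 stays in result alongside its LOCF counterpart.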

Author

G Gayatri

Examples


library(dplyr)
library(tibble)

advs <- tribble(
  ~STUDYID,  ~USUBJID,      ~PARAMCD, ~AVAL, ~AVISITN, ~AVISIT,
  "CDISC01", "01-701-1015", "PULSE",     61,        0, "BASELINE",
  "CDISC01", "01-701-1015", "PULSE",     60,        2, "WEEK 6",
  "CDISC01", "01-701-1015", "DIABP",     51,        0, "BASELINE",
  "CDISC01", "01-701-1015", "DIABP",     50,        2, "WEEK 2",
  "CDISC01", "01-701-1015", "DIABP",     51,        4, "WEEK 4",
  "CDISC01", "01-701-1015", "DIABP",     50,        6, "WEEK 6",
  "CDISC01", "01-701-1015", "SYSBP",    121,        0, "BASELINE",
  "CDISC01", "01-701-1015", "SYSBP",    121,        2, "WEEK 2",
  "CDISC01", "01-701-1015", "SYSBP",    121,        4, "WEEK 4",
  "CDISC01", "01-701-1015", "SYSBP",    121,        6, "WEEK 6",
  "CDISC01", "01-701-1028", "PULSE",     65,        0, "BASELINE",
  "CDISC01", "01-701-1028", "DIABP",     79,        0, "BASELINE",
  "CDISC01", "01-701-1028", "DIABP",     80,        2, "WEEK 2",
  "CDISC01", "01-701-1028", "DIABP",     NA,        4, "WEEK 4",
  "CDISC01", "01-701-1028", "DIABP",     NA,        6, "WEEK 6",
  "CDISC01", "01-701-1028", "SYSBP",    130,        0, "BASELINE",
  "CDISC01", "01-701-1028", "SYSBP",    132,        2, "WEEK 2"
)


# A dataset with all the combinations of PARAMCD, PARAM, AVISIT, AVISITN, ... which are expected.
advs_expected_obsv <- tibble::tribble(
  ~PARAMCD, ~AVISITN, ~AVISIT,
  "PULSE",         0, "BASELINE",
  "PULSE",         6, "WEEK 6",
  "DIABP",         0, "BASELINE",
  "DIABP",         2, "WEEK 2",
  "DIABP",         4, "WEEK 4",
  "DIABP",         6, "WEEK 6",
  "SYSBP",         0, "BASELINE",
  "SYSBP",         2, "WEEK 2",
  "SYSBP",         4, "WEEK 4",
  "SYSBP",         6, "WEEK 6"
)

derive_locf_records(
  dataset = advs,
  dataset_expected_obs = advs_expected_obsv,
  by_vars = vars(STUDYID, USUBJID, PARAMCD),
  order = vars(AVISITN, AVISIT)
)
#> # A tibble: 23 x 7
#>    STUDYID USUBJID     PARAMCD  AVAL AVISITN AVISIT   DTYPE
#>    <chr>   <chr>       <chr>   <dbl>   <dbl> <chr>    <chr>
#>  1 CDISC01 01-701-1015 DIABP      51       0 BASELINE NA   
#>  2 CDISC01 01-701-1015 DIABP      50       2 WEEK 2   NA   
#>  3 CDISC01 01-701-1015 DIABP      51       4 WEEK 4   NA   
#>  4 CDISC01 01-701-1015 DIABP      50       6 WEEK 6   NA   
#>  5 CDISC01 01-701-1015 PULSE      61       0 BASELINE NA   
#>  6 CDISC01 01-701-1015 PULSE      60       2 WEEK 6   NA   
#>  7 CDISC01 01-701-1015 PULSE      60       6 WEEK 6   LOCF 
#>  8 CDISC01 01-701-1015 SYSBP     121       0 BASELINE NA   
#>  9 CDISC01 01-701-1015 SYSBP     121       2 WEEK 2   NA   
#> 10 CDISC01 01-701-1015 SYSBP     121       4 WEEK 4   NA   
#> # … with 13 more rows