
Derive LOCF (Last Observation Carried Forward) Records
Source: R/derive_locf_records.R
Adds LOCF records as new observations for each by group when the dataset does not contain observations for missed visits/time points or when the analysis value is missing.
Usage
derive_locf_records(
  dataset,
  dataset_ref,
  by_vars,
  id_vars_ref = NULL,
  analysis_var = AVAL,
  imputation = "add",
  order,
  keep_vars = NULL
)
Arguments
- dataset
Input dataset
The variables specified by the by_vars, analysis_var, order, and keep_vars arguments are expected to be in the dataset.
- Default value
none
- dataset_ref
Expected observations dataset
A data frame with all the combinations of PARAMCD, PARAM, AVISIT, AVISITN, ... that are expected in the dataset.
- Default value
none
- by_vars
Grouping variables
For each group defined by by_vars, those observations from dataset_ref are added to the output dataset which do not have a corresponding observation in the input dataset or for which analysis_var is NA for the corresponding observation in the input dataset.
- Default value
none
- id_vars_ref
Grouping variables in expected observations dataset
The variables to group by in dataset_ref when determining which observations should be added to the input dataset.
- Default value
NULL (all the variables in dataset_ref are used)
- analysis_var
Analysis variable.
- Permitted values
a variable
- Default value
AVAL
- imputation
Select the mode of imputation:
add: Keep all original records and add imputed records for missing timepoints and missing analysis_var values from dataset_ref.
update: Update records with missing analysis_var and add imputed records for missing timepoints from dataset_ref.
update_add: Keep all original records, update records with missing analysis_var and add imputed records for missing timepoints from dataset_ref.
- Permitted values
One of these 3 values: "add", "update", "update_add"
- Default value
"add"
- order
Sort order
The dataset is sorted by order before carrying the last observation forward (e.g. AVAL) within each by_vars.
For handling of NAs in sorting variables see Sort Order.
- Default value
none
- keep_vars
Variables that need carrying the last observation forward
Variables other than analysis_var whose last observation should also be carried forward (e.g., PARAMN, VISITNUM). If NULL (the default), only the variables specified in by_vars and analysis_var are populated in the newly created records.
- Default value
NULL
Value
The input dataset with the new "LOCF" observations added for each by_vars, based on the value passed to the imputation argument.
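Because new observations are marked through DTYPE, imputed and observed records can be told apart after the call. The following is a minimal sketch, assuming a result shaped like the one described above; the tibble advs_locf and the names imputed and observed are hypothetical and only stand in for real output.

```r
library(dplyr)
library(tibble)

# Hypothetical data shaped like the output of derive_locf_records():
# observed records have DTYPE = NA, carried-forward records have DTYPE = "LOCF".
advs_locf <- tribble(
  ~USUBJID,      ~PARAMCD, ~AVISITN, ~AVAL, ~DTYPE,
  "01-701-1015", "SYSBP",  0,        130,   NA_character_,
  "01-701-1015", "SYSBP",  2,        132,   NA_character_,
  "01-701-1015", "SYSBP",  4,        132,   "LOCF",
  "01-701-1015", "SYSBP",  6,        132,   "LOCF"
)

# Imputed records only. For observed records the comparison yields NA,
# and filter() drops rows whose condition is NA.
imputed <- filter(advs_locf, DTYPE == "LOCF")

# Observed records only.
observed <- filter(advs_locf, is.na(DTYPE))

# Number of observed vs. imputed records per parameter.
count(advs_locf, PARAMCD, DTYPE)
```

Note that a plain DTYPE != "LOCF" filter would return no rows here, since the comparison is NA for observed records; is.na(DTYPE) is the safer test in this sketch.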
Details
For each group (with respect to the variables specified for the by_vars parameter) those observations from dataset_ref are added to the output dataset
- which do not have a corresponding observation in the input dataset or
- for which analysis_var is NA for the corresponding observation in the input dataset.
For the new observations, analysis_var is set to the non-missing analysis_var of the previous observation in the input dataset (when sorted by order) and DTYPE is set to "LOCF".
The imputation argument decides whether to update the existing observation when analysis_var is NA ("update" and "update_add"), or to add a new observation from dataset_ref instead ("add").
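The logic described above can be sketched with plain dplyr and tidyr. This is an illustration under stated assumptions, not the implementation of derive_locf_records(): the toy datasets obs and expected and the intermediate names groups, expected_all, candidates, locf, and result are hypothetical, the by group is fixed to USUBJID and PARAMCD, the sort order to AVISITN, and only the "add" mode is mimicked.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical input: one subject, one parameter, AVAL missing at visit 4
# and no record at all for visit 6.
obs <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISITN, ~AVAL,
  "01",     "DIABP",  0,        79,
  "01",     "DIABP",  2,        80,
  "01",     "DIABP",  4,        NA
)

# Hypothetical expected visits (plays the role of dataset_ref).
expected <- tribble(
  ~PARAMCD, ~AVISITN,
  "DIABP",  0,
  "DIABP",  2,
  "DIABP",  4,
  "DIABP",  6
)

# Step 1: expand the expected visits to every by group found in the data.
groups <- distinct(obs, USUBJID, PARAMCD)
expected_all <- inner_join(groups, expected, by = "PARAMCD")

# Step 2: keep expected records without a non-missing AVAL in the input.
# This covers both missing visits and visits where AVAL is NA.
candidates <- anti_join(
  expected_all,
  filter(obs, !is.na(AVAL)),
  by = c("USUBJID", "PARAMCD", "AVISITN")
)

# Step 3: carry the last non-missing AVAL forward within each by group
# and keep only the newly created records.
locf <- bind_rows(
  mutate(filter(obs, !is.na(AVAL)), DTYPE = NA_character_),
  mutate(candidates, DTYPE = "LOCF")
) |>
  arrange(USUBJID, PARAMCD, AVISITN) |>
  group_by(USUBJID, PARAMCD) |>
  fill(AVAL, .direction = "down") |>
  ungroup() |>
  filter(DTYPE == "LOCF")

# "add" mode: all original records are kept and the new ones are appended.
result <- bind_rows(obs, locf) |>
  arrange(USUBJID, PARAMCD, AVISITN)
```

In this sketch locf holds two records (visits 4 and 6, both with AVAL 80), and result has five rows because the original visit 4 record with missing AVAL is kept alongside its imputed counterpart.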
See also
BDS-Findings Functions for adding Parameters/Records:
default_qtc_paramcd(), derive_expected_records(), derive_extreme_event(), derive_extreme_records(), derive_param_bmi(), derive_param_bsa(), derive_param_computed(), derive_param_doseint(), derive_param_exist_flag(), derive_param_exposure(), derive_param_framingham(), derive_param_map(), derive_param_qtc(), derive_param_rr(), derive_param_wbc_abs(), derive_summary_records()
Examples
library(admiral)
library(dplyr)
library(tibble)
advs <- tribble(
  ~STUDYID,  ~USUBJID,      ~VSSEQ, ~PARAMCD, ~PARAMN, ~AVAL, ~AVISITN, ~AVISIT,
  "CDISC01", "01-701-1015",      1, "PULSE",        1,    65,        0, "BASELINE",
  "CDISC01", "01-701-1015",      2, "DIABP",        2,    79,        0, "BASELINE",
  "CDISC01", "01-701-1015",      3, "DIABP",        2,    80,        2, "WEEK 2",
  "CDISC01", "01-701-1015",      4, "DIABP",        2,    NA,        4, "WEEK 4",
  "CDISC01", "01-701-1015",      5, "DIABP",        2,    NA,        6, "WEEK 6",
  "CDISC01", "01-701-1015",      6, "SYSBP",        3,   130,        0, "BASELINE",
  "CDISC01", "01-701-1015",      7, "SYSBP",        3,   132,        2, "WEEK 2"
)
# A dataset with all the combinations of PARAMCD, PARAM, AVISIT, AVISITN, ...
# which are expected.
advs_expected_obsv <- tribble(
  ~PARAMCD, ~AVISITN, ~AVISIT,
  "PULSE",         0, "BASELINE",
  "PULSE",         6, "WEEK 6",
  "DIABP",         0, "BASELINE",
  "DIABP",         2, "WEEK 2",
  "DIABP",         4, "WEEK 4",
  "DIABP",         6, "WEEK 6",
  "SYSBP",         0, "BASELINE",
  "SYSBP",         2, "WEEK 2",
  "SYSBP",         4, "WEEK 4",
  "SYSBP",         6, "WEEK 6"
)
# Example 1: Add imputed records for missing timepoints and for missing
# `analysis_var` values (from `dataset_ref`), keeping all the original records.
derive_locf_records(
  dataset = advs,
  dataset_ref = advs_expected_obsv,
  by_vars = exprs(STUDYID, USUBJID, PARAMCD),
  imputation = "add",
  order = exprs(AVISITN, AVISIT),
  keep_vars = exprs(PARAMN)
) |>
  arrange(USUBJID, PARAMCD, AVISIT)
#> # A tibble: 12 × 9
#>    STUDYID USUBJID     VSSEQ PARAMCD PARAMN  AVAL AVISITN AVISIT   DTYPE
#>    <chr>   <chr>       <dbl> <chr>    <dbl> <dbl>   <dbl> <chr>    <chr>
#>  1 CDISC01 01-701-1015     2 DIABP        2    79       0 BASELINE NA
#>  2 CDISC01 01-701-1015     3 DIABP        2    80       2 WEEK 2   NA
#>  3 CDISC01 01-701-1015    NA DIABP        2    80       4 WEEK 4   LOCF
#>  4 CDISC01 01-701-1015     4 DIABP        2    NA       4 WEEK 4   NA
#>  5 CDISC01 01-701-1015    NA DIABP        2    80       6 WEEK 6   LOCF
#>  6 CDISC01 01-701-1015     5 DIABP        2    NA       6 WEEK 6   NA
#>  7 CDISC01 01-701-1015     1 PULSE        1    65       0 BASELINE NA
#>  8 CDISC01 01-701-1015    NA PULSE        1    65       6 WEEK 6   LOCF
#>  9 CDISC01 01-701-1015     6 SYSBP        3   130       0 BASELINE NA
#> 10 CDISC01 01-701-1015     7 SYSBP        3   132       2 WEEK 2   NA
#> 11 CDISC01 01-701-1015    NA SYSBP        3   132       4 WEEK 4   LOCF
#> 12 CDISC01 01-701-1015    NA SYSBP        3   132       6 WEEK 6   LOCF
# Example 2: Add imputed records for missing timepoints (from `dataset_ref`)
# and update missing `analysis_var` values.
derive_locf_records(
  dataset = advs,
  dataset_ref = advs_expected_obsv,
  by_vars = exprs(STUDYID, USUBJID, PARAMCD),
  imputation = "update",
  order = exprs(AVISITN, AVISIT)
) |>
  arrange(USUBJID, PARAMCD, AVISIT)
#> # A tibble: 10 × 9
#>    STUDYID USUBJID     VSSEQ PARAMCD PARAMN  AVAL AVISITN AVISIT   DTYPE
#>    <chr>   <chr>       <dbl> <chr>    <dbl> <dbl>   <dbl> <chr>    <chr>
#>  1 CDISC01 01-701-1015     2 DIABP        2    79       0 BASELINE NA
#>  2 CDISC01 01-701-1015     3 DIABP        2    80       2 WEEK 2   NA
#>  3 CDISC01 01-701-1015     4 DIABP        2    80       4 WEEK 4   LOCF
#>  4 CDISC01 01-701-1015     5 DIABP        2    80       6 WEEK 6   LOCF
#>  5 CDISC01 01-701-1015     1 PULSE        1    65       0 BASELINE NA
#>  6 CDISC01 01-701-1015    NA PULSE       NA    65       6 WEEK 6   LOCF
#>  7 CDISC01 01-701-1015     6 SYSBP        3   130       0 BASELINE NA
#>  8 CDISC01 01-701-1015     7 SYSBP        3   132       2 WEEK 2   NA
#>  9 CDISC01 01-701-1015    NA SYSBP       NA   132       4 WEEK 4   LOCF
#> 10 CDISC01 01-701-1015    NA SYSBP       NA   132       6 WEEK 6   LOCF
# Example 3: Add imputed records for missing timepoints (from `dataset_ref`) and
# update missing `analysis_var` values, keeping all the original records.
derive_locf_records(
  dataset = advs,
  dataset_ref = advs_expected_obsv,
  by_vars = exprs(STUDYID, USUBJID, PARAMCD),
  imputation = "update_add",
  order = exprs(AVISITN, AVISIT)
) |>
  arrange(USUBJID, PARAMCD, AVISIT)
#> # A tibble: 12 × 9
#>    STUDYID USUBJID     VSSEQ PARAMCD PARAMN  AVAL AVISITN AVISIT   DTYPE
#>    <chr>   <chr>       <dbl> <chr>    <dbl> <dbl>   <dbl> <chr>    <chr>
#>  1 CDISC01 01-701-1015     2 DIABP        2    79       0 BASELINE NA
#>  2 CDISC01 01-701-1015     3 DIABP        2    80       2 WEEK 2   NA
#>  3 CDISC01 01-701-1015     4 DIABP        2    80       4 WEEK 4   LOCF
#>  4 CDISC01 01-701-1015     4 DIABP        2    NA       4 WEEK 4   NA
#>  5 CDISC01 01-701-1015     5 DIABP        2    80       6 WEEK 6   LOCF
#>  6 CDISC01 01-701-1015     5 DIABP        2    NA       6 WEEK 6   NA
#>  7 CDISC01 01-701-1015     1 PULSE        1    65       0 BASELINE NA
#>  8 CDISC01 01-701-1015    NA PULSE       NA    65       6 WEEK 6   LOCF
#>  9 CDISC01 01-701-1015     6 SYSBP        3   130       0 BASELINE NA
#> 10 CDISC01 01-701-1015     7 SYSBP        3   132       2 WEEK 2   NA
#> 11 CDISC01 01-701-1015    NA SYSBP       NA   132       4 WEEK 4   LOCF
#> 12 CDISC01 01-701-1015    NA SYSBP       NA   132       6 WEEK 6   LOCF