
Add the First or Last Observation for Each By Group as New Records
Source:R/derive_extreme_records.R
derive_extreme_records.Rd
Add the first or last observation for each by group as new observations. The
new observations can be selected from the additional dataset. This function can
be used for adding the maximum or minimum value as a separate visit.
All variables of the selected observation are kept. This distinguishes
derive_extreme_records()
from derive_summary_records()
,
where only the by variables are populated for the new records.
Usage
derive_extreme_records(
dataset = NULL,
dataset_add,
dataset_ref = NULL,
by_vars = NULL,
order = NULL,
mode = NULL,
filter_add = NULL,
check_type = "warning",
exist_flag = NULL,
true_value = "Y",
false_value = NA_character_,
keep_source_vars = exprs(everything()),
set_values_to
)
Arguments
- dataset
Input dataset
If the argument is not specified (or set to
NULL
), a new dataset is created. Otherwise, the new records are appended to the specified dataset.- Permitted values
a dataset, i.e., a
data.frame
or tibble- Default value
NULL
- dataset_add
Additional dataset
The additional dataset, which determines the by groups returned in the input dataset, based on the groups that exist in this dataset after being subset by
filter_add
.The variables specified in the
by_vars
andfilter_add
parameters are expected in this dataset. Ifmode
andorder
are specified, the first or last observation within each by group, defined byby_vars
, is selected.- Permitted values
a dataset, i.e., a
data.frame
or tibble- Default value
none
- dataset_ref
Reference dataset
The variables specified for
by_vars
are expected. For each observation of the specified dataset a new observation is added to the input dataset.For records which are added from
dataset_ref
because there are no records indataset_add
for the by group only those variables are kept which are also indataset_add
(and are included inkeep_source_vars
).- Permitted values
a dataset, i.e., a
data.frame
or tibble- Default value
NULL
- by_vars
Grouping variables
If
dataset_ref
is specified, this argument must be specified.- Permitted values
list of variables created by
exprs()
, e.g.,exprs(USUBJID, VISIT)
- Default value
NULL
- order
Sort order
Within each by group the observations are ordered by the specified order.
- Permitted values
list of expressions created by
exprs()
, e.g.,exprs(ADT, desc(AVAL))
- Default value
none
- mode
Selection mode (first or last)
If
"first"
is specified, the first observation of each by group is added to the input dataset. If"last"
is specified, the last observation of each by group is added to the input dataset.- Permitted values
"first"
,"last"
- Default value
NULL
- filter_add
Filter for additional dataset (
dataset_add
)Only observations in
dataset_add
fulfilling the specified condition are considered.- Permitted values
an unquoted condition, e.g.,
AVISIT == "BASELINE"
- Default value
NULL
- check_type
Check uniqueness?
If
"warning"
or"error"
is specified, the specified message is issued if the observations of the (restricted) additional dataset are not unique with respect to the by variables and the order.- Permitted values
"none"
,"message"
,"warning"
,"error"
- Default value
"warning"
- exist_flag
Existence flag
The specified variable is added to the output dataset.
For by groups with at least one observation in the additional dataset (
dataset_add
)exist_flag
is set to the value specified by thetrue_value
argument.For all other by groups
exist_flag
is set to the value specified by thefalse_value
argument.- Permitted values
Variable name
- Default value
NULL
- true_value
True value
For new observations selected from the additional dataset (
dataset_add
),exist_flag
is set to the specified value.- Default value
"Y"
- false_value
False value
For new observations not selected from the additional dataset (
dataset_add
),exist_flag
is set to the specified value.- Default value
NA_character_
- keep_source_vars
Variables to be kept in the new records
A named list or tidyselect expressions created by
exprs()
defining the variables to be kept for the new records. The variables specified forby_vars
andset_values_to
need not be specified here as they are kept automatically.- Default value
exprs(everything())
- set_values_to
Variables to be set
The specified variables are set to the specified values for the new observations.
Set a list of variables to some specified value for the new records
LHS refer to a variable.
RHS refers to the values to set to the variable. This can be a string, a symbol, a numeric value, an expression or NA. If summary functions are used, the values are summarized by the variables specified for
by_vars
.
For example:
- Default value
none
Value
The input dataset with the first or last observation of each by group added as new observations.
Details
The additional dataset (
dataset_add
) is restricted as specified by thefilter_add
argument.For each group (with respect to the variables specified for the
by_vars
argument) the first or last observation (with respect to the order specified for theorder
argument and the mode specified for themode
argument) is selected.If
dataset_ref
is specified, observations which are indataset_ref
but not in the selected records are added.The variables specified by the
set_values_to
argument are added to the selected observations.The variables specified by the
keep_source_vars
argument are selected along with the variables specified inby_vars
andset_values_to
arguments.The observations are added to input dataset (
dataset
). If no input dataset is provided, a new dataset is created.
See also
BDS-Findings Functions for adding Parameters/Records:
default_qtc_paramcd()
,
derive_expected_records()
,
derive_extreme_event()
,
derive_locf_records()
,
derive_param_bmi()
,
derive_param_bsa()
,
derive_param_computed()
,
derive_param_doseint()
,
derive_param_exist_flag()
,
derive_param_exposure()
,
derive_param_framingham()
,
derive_param_map()
,
derive_param_qtc()
,
derive_param_rr()
,
derive_param_wbc_abs()
,
derive_summary_records()
Examples
Add last/first record as new record
For each subject the last record should be added as a new visit.
The source dataset for the new records is specified by the
dataset_add
argument. Here it is the same as the input dataset.The groups for which new records are added are specified by the
by_vars
argument. Here for each subject a record should be added. Thusby_vars = exprs(USUBJID)
is specified.As there are multiple records per subject, the
mode
andorder
arguments are specified to request that the last record is selected when the records are sorted by visit (AVISITN
). The records are sorted by each by group (by_vars
) separately, i.e., it is not necessary to add the variables fromby_vars
toorder
.To avoid duplicates in the output dataset the
set_values_to
argument is specified to set the visit (AVISIT
) to a special value for the new records.
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
adlb <- tribble(
~USUBJID, ~AVISITN, ~AVAL,
"1", 1, 113,
"1", 2, 111,
"2", 1, 101,
"2", 2, NA,
"3", 1, NA,
)
derive_extreme_records(
adlb,
dataset_add = adlb,
by_vars = exprs(USUBJID),
order = exprs(AVISITN),
mode = "last",
set_values_to = exprs(
AVISITN = 99
)
)
#> # A tibble: 8 × 3
#> USUBJID AVISITN AVAL
#> <chr> <dbl> <dbl>
#> 1 1 1 113
#> 2 1 2 111
#> 3 2 1 101
#> 4 2 2 NA
#> 5 3 1 NA
#> 6 1 99 111
#> 7 2 99 NA
#> 8 3 99 NA
Restricting source records (filter_add
)
The source records can be restricted by the filter_add
argument,
e.g., to exclude visits with missing analysis value from selecting for the
last visit record:
derive_extreme_records(
adlb,
dataset_add = adlb,
filter_add = !is.na(AVAL),
by_vars = exprs(USUBJID),
order = exprs(AVISITN),
mode = "last",
set_values_to = exprs(
AVISITN = 99
)
)
#> # A tibble: 7 × 3
#> USUBJID AVISITN AVAL
#> <chr> <dbl> <dbl>
#> 1 1 1 113
#> 2 1 2 111
#> 3 2 1 101
#> 4 2 2 NA
#> 5 3 1 NA
#> 6 1 99 111
#> 7 2 99 101
Please note that new records are added only for subjects in the
restricted source data. Therefore no new record is added for subject
"3"
.
Adding records for groups not in source (dataset_ref
)
Adding records for groups which are not in the source data can be
achieved by specifying a reference dataset by the dataset_ref
argument.
For example, specifying the input dataset for dataset_ref
below ensures
that new records are added also for subject without a valid analysis value:
derive_extreme_records(
adlb,
dataset_add = adlb,
filter_add = !is.na(AVAL),
dataset_ref = adlb,
by_vars = exprs(USUBJID),
order = exprs(AVISITN),
mode = "last",
set_values_to = exprs(
AVISITN = 99
)
)
#> # A tibble: 8 × 3
#> USUBJID AVISITN AVAL
#> <chr> <dbl> <dbl>
#> 1 1 1 113
#> 2 1 2 111
#> 3 2 1 101
#> 4 2 2 NA
#> 5 3 1 NA
#> 6 1 99 111
#> 7 2 99 101
#> 8 3 99 NA
Selecting variables for new records (keep_source_vars
)
Which variables from the source dataset are kept for the new records
can be specified by the keep_source_vars
argument. Variables specified by
by_vars
or set_values_to
don't need to be added to keep_source_vars
as these are always kept.
adlb <- tribble(
~USUBJID, ~AVISIT, ~AVAL, ~LBSEQ,
"1", "WEEK 1", 123, 1,
"1", "WEEK 2", 101, 2,
"2", "WEEK 1", 99, 1,
"2", "WEEK 2", 110, 2,
"2", "WEEK 3", 93, 3
)
derive_extreme_records(
dataset_add = adlb,
filter_add = !is.na(AVAL),
by_vars = exprs(USUBJID),
order = exprs(AVAL),
mode = "first",
keep_source_vars = exprs(AVAL),
set_values_to = exprs(
AVISIT = "MINIMUM"
)
)
#> # A tibble: 2 × 3
#> USUBJID AVISIT AVAL
#> <chr> <chr> <dbl>
#> 1 1 MINIMUM 101
#> 2 2 MINIMUM 93
Handling duplicates (check_type
)
The source records are checked regarding duplicates with respect to
by_vars
and order
. By default, a warning is issued if any duplicates
are found.
adlb <- tribble(
~USUBJID, ~AVISIT, ~AVAL,
"1", "WEEK 1", 123,
"1", "WEEK 2", 123,
"2", "WEEK 1", 99,
"2", "WEEK 2", 110,
"2", "WEEK 3", 93,
)
derive_extreme_records(
dataset_add = adlb,
filter_add = !is.na(AVAL),
by_vars = exprs(USUBJID),
order = exprs(AVAL),
mode = "first",
set_values_to = exprs(
AVISIT = "MINIMUM"
)
)
#> # A tibble: 2 × 3
#> USUBJID AVISIT AVAL
#> <chr> <chr> <dbl>
#> 1 1 MINIMUM 123
#> 2 2 MINIMUM 93
#> Warning: Dataset contains duplicate records with respect to `USUBJID` and `AVAL`
#> ℹ Run `admiral::get_duplicates_dataset()` to access the duplicate records
For investigating the issue the dataset of the duplicate source records
can be obtained by calling get_duplicates_dataset()
:
get_duplicates_dataset()
#> Duplicate records with respect to `USUBJID` and `AVAL`.
#> # A tibble: 2 × 3
#> USUBJID AVAL AVISIT
#> * <chr> <dbl> <chr>
#> 1 1 123 WEEK 1
#> 2 1 123 WEEK 2
Common options to solve the issue are:
Restricting the source records by specifying/updating the
filter_add
argument.Specifying additional variables for
order
.Setting
check_type = "none"
to ignore any duplicates.
In this example it doesn't matter which of the records with the minimum value is chosen because it doesn't affect the output dataset. Thus the third option is used:
derive_extreme_records(
dataset_add = adlb,
filter_add = !is.na(AVAL),
by_vars = exprs(USUBJID),
order = exprs(AVAL),
mode = "first",
check_type = "none",
set_values_to = exprs(
AVISIT = "MINIMUM"
)
)
#> # A tibble: 2 × 3
#> USUBJID AVISIT AVAL
#> <chr> <chr> <dbl>
#> 1 1 MINIMUM 123
#> 2 2 MINIMUM 93
Flagging existence of source records (exist_flag
, true_value
,
false_value
)
If the existence of a source record should be flagged, the exist_flag
argument can be specified. The specified variable is set to true_value
if
a source record exists. Otherwise, it is set to false_value
.
The dataset_ref
argument should be specified as otherwise all new
records originate from dataset_add
, i.e., exist_flag
would be set to
true_value
for all records.
adsl <- tribble(
~USUBJID, ~DTHDT,
"1", ymd("2022-05-13"),
"2", ymd(""),
"3", ymd("")
)
derive_extreme_records(
dataset_ref = adsl,
dataset_add = adsl,
by_vars = exprs(USUBJID),
filter_add = !is.na(DTHDT),
exist_flag = AVALC,
true_value = "Y",
false_value = "N",
set_values_to = exprs(
PARAMCD = "DEATH",
ADT = DTHDT
)
)
#> # A tibble: 3 × 5
#> USUBJID PARAMCD ADT DTHDT AVALC
#> <chr> <chr> <date> <date> <chr>
#> 1 1 DEATH 2022-05-13 2022-05-13 Y
#> 2 2 DEATH NA NA N
#> 3 3 DEATH NA NA N
Derive DTYPE = "LOV"
For each subject and parameter the last valid assessment (with respect
to AVISITN
and LBSEQ
) should be selected and added as a new record to
the input dataset. For the new records set AVISIT = "PBL LAST"
, AVISITN = 99
, and DTYPE = "LOV"
.
adlb <- tribble(
~USUBJID, ~AVISIT, ~AVISITN, ~PARAMCD, ~AVAL, ~LBSEQ,
"1", "BASELINE", 1, "ABC", 120, 1,
"1", "WEEK 1", 2, "ABC", 113, 2,
"1", "WEEK 1", 2, "ABC", 117, 3,
"2", "BASELINE", 1, "ABC", 101, 1,
"2", "WEEK 1", 2, "ABC", 101, 2,
"2", "WEEK 2", 3, "ABC", 95, 3,
"1", "BASELINE", 1, "DEF", 17, 1,
"1", "WEEK 1", 2, "DEF", NA, 2,
"1", "WEEK 1", 2, "DEF", 13, 3,
"2", "BASELINE", 1, "DEF", 9, 1,
"2", "WEEK 1", 2, "DEF", 10, 2,
"2", "WEEK 2", 3, "DEF", 12, 3
)
mutate(STUDYID = "XYZ", .before = USUBJID)
derive_extreme_records(
adlb,
dataset_add = adlb,
filter_add = !is.na(AVAL) & AVISIT != "BASELINE",
by_vars = exprs(!!!get_admiral_option("subject_keys"), PARAMCD),
order = exprs(AVISITN, LBSEQ),
mode = "last",
set_values_to = exprs(
AVISIT = "PBL LAST",
AVISITN = 99,
DTYPE = "LOV"
)
)
#> # A tibble: 16 × 8
#> STUDYID USUBJID AVISIT AVISITN PARAMCD AVAL LBSEQ DTYPE
#> <chr> <chr> <chr> <dbl> <chr> <dbl> <dbl> <chr>
#> 1 XYZ 1 BASELINE 1 ABC 120 1 <NA>
#> 2 XYZ 1 WEEK 1 2 ABC 113 2 <NA>
#> 3 XYZ 1 WEEK 1 2 ABC 117 3 <NA>
#> 4 XYZ 2 BASELINE 1 ABC 101 1 <NA>
#> 5 XYZ 2 WEEK 1 2 ABC 101 2 <NA>
#> 6 XYZ 2 WEEK 2 3 ABC 95 3 <NA>
#> 7 XYZ 1 BASELINE 1 DEF 17 1 <NA>
#> 8 XYZ 1 WEEK 1 2 DEF NA 2 <NA>
#> 9 XYZ 1 WEEK 1 2 DEF 13 3 <NA>
#> 10 XYZ 2 BASELINE 1 DEF 9 1 <NA>
#> 11 XYZ 2 WEEK 1 2 DEF 10 2 <NA>
#> 12 XYZ 2 WEEK 2 3 DEF 12 3 <NA>
#> 13 XYZ 1 PBL LAST 99 ABC 117 3 LOV
#> 14 XYZ 1 PBL LAST 99 DEF 13 3 LOV
#> 15 XYZ 2 PBL LAST 99 ABC 95 3 LOV
#> 16 XYZ 2 PBL LAST 99 DEF 12 3 LOV
Derive DTYPE = "MINIMUM"
For each subject and parameter the record with the minimum analysis
value should be selected and added as a new record to the input dataset. If
there are multiple records meeting the minimum, the first record with
respect to AVISIT
and LBSEQ
should be selected. For the new records set
AVISIT = "PBL MIN"
, AVISITN = 97
, and DTYPE = "MINIMUM"
.
derive_extreme_records(
adlb,
dataset_add = adlb,
filter_add = !is.na(AVAL) & AVISIT != "BASELINE",
by_vars = exprs(!!!get_admiral_option("subject_keys"), PARAMCD),
order = exprs(AVAL, AVISITN, LBSEQ),
mode = "first",
set_values_to = exprs(
AVISIT = "PBL MIN",
AVISITN = 97,
DTYPE = "MINIMUM"
)
)
#> # A tibble: 16 × 8
#> STUDYID USUBJID AVISIT AVISITN PARAMCD AVAL LBSEQ DTYPE
#> <chr> <chr> <chr> <dbl> <chr> <dbl> <dbl> <chr>
#> 1 XYZ 1 BASELINE 1 ABC 120 1 <NA>
#> 2 XYZ 1 WEEK 1 2 ABC 113 2 <NA>
#> 3 XYZ 1 WEEK 1 2 ABC 117 3 <NA>
#> 4 XYZ 2 BASELINE 1 ABC 101 1 <NA>
#> 5 XYZ 2 WEEK 1 2 ABC 101 2 <NA>
#> 6 XYZ 2 WEEK 2 3 ABC 95 3 <NA>
#> 7 XYZ 1 BASELINE 1 DEF 17 1 <NA>
#> 8 XYZ 1 WEEK 1 2 DEF NA 2 <NA>
#> 9 XYZ 1 WEEK 1 2 DEF 13 3 <NA>
#> 10 XYZ 2 BASELINE 1 DEF 9 1 <NA>
#> 11 XYZ 2 WEEK 1 2 DEF 10 2 <NA>
#> 12 XYZ 2 WEEK 2 3 DEF 12 3 <NA>
#> 13 XYZ 1 PBL MIN 97 ABC 113 2 MINIMUM
#> 14 XYZ 1 PBL MIN 97 DEF 13 3 MINIMUM
#> 15 XYZ 2 PBL MIN 97 ABC 95 3 MINIMUM
#> 16 XYZ 2 PBL MIN 97 DEF 10 2 MINIMUM
Derive DTYPE = "MAXIMUM"
For each subject and parameter the record with the maximum analysis
value should be selected and added as a new record to the input dataset. If
there are multiple records meeting the maximum, the first record with
respect to AVISIT
and LBSEQ
should be selected. For the new records set
AVISIT = "PBL MAX"
, AVISITN = 98
, and DTYPE = "MAXIMUM"
.
derive_extreme_records(
adlb,
dataset_add = adlb,
filter_add = !is.na(AVAL) & AVISIT != "BASELINE",
by_vars = exprs(!!!get_admiral_option("subject_keys"), PARAMCD),
order = exprs(desc(AVAL), AVISITN, LBSEQ),
mode = "first",
set_values_to = exprs(
AVISIT = "PBL MAX",
AVISITN = 99,
DTYPE = "MAXIMUM"
)
)
#> # A tibble: 16 × 8
#> STUDYID USUBJID AVISIT AVISITN PARAMCD AVAL LBSEQ DTYPE
#> <chr> <chr> <chr> <dbl> <chr> <dbl> <dbl> <chr>
#> 1 XYZ 1 BASELINE 1 ABC 120 1 <NA>
#> 2 XYZ 1 WEEK 1 2 ABC 113 2 <NA>
#> 3 XYZ 1 WEEK 1 2 ABC 117 3 <NA>
#> 4 XYZ 2 BASELINE 1 ABC 101 1 <NA>
#> 5 XYZ 2 WEEK 1 2 ABC 101 2 <NA>
#> 6 XYZ 2 WEEK 2 3 ABC 95 3 <NA>
#> 7 XYZ 1 BASELINE 1 DEF 17 1 <NA>
#> 8 XYZ 1 WEEK 1 2 DEF NA 2 <NA>
#> 9 XYZ 1 WEEK 1 2 DEF 13 3 <NA>
#> 10 XYZ 2 BASELINE 1 DEF 9 1 <NA>
#> 11 XYZ 2 WEEK 1 2 DEF 10 2 <NA>
#> 12 XYZ 2 WEEK 2 3 DEF 12 3 <NA>
#> 13 XYZ 1 PBL MAX 99 ABC 117 3 MAXIMUM
#> 14 XYZ 1 PBL MAX 99 DEF 13 3 MAXIMUM
#> 15 XYZ 2 PBL MAX 99 ABC 101 2 MAXIMUM
#> 16 XYZ 2 PBL MAX 99 DEF 12 3 MAXIMUM
Derive DTYPE = "WOC"
or DTYPE = "BOC"
For each subject and parameter the record with the worst analysis
value should be selected and added as a new record to the input dataset.
The worst value is either the minimum or maximum value depending on the
parameter. If there are multiple records meeting the worst value, the first
record with respect to AVISIT
and LBSEQ
should be selected. For the new
records set AVISIT = "PBL WORST"
, AVISITN = 96
, and DTYPE = "WOC"
.
Here the maximum is considered worst for PARAMCD = "ABC"
and the minimum
for PARAMCD = "DEF"
.
derive_extreme_records(
adlb,
dataset_add = adlb,
filter_add = !is.na(AVAL) & AVISIT != "BASELINE",
by_vars = exprs(!!!get_admiral_option("subject_keys"), PARAMCD),
order = exprs(
if_else(PARAMCD == "ABC", desc(AVAL), AVAL),
AVISITN, LBSEQ
),
mode = "first",
set_values_to = exprs(
AVISIT = "PBL WORST",
AVISITN = 96,
DTYPE = "WOC"
)
)
#> # A tibble: 16 × 8
#> STUDYID USUBJID AVISIT AVISITN PARAMCD AVAL LBSEQ DTYPE
#> <chr> <chr> <chr> <dbl> <chr> <dbl> <dbl> <chr>
#> 1 XYZ 1 BASELINE 1 ABC 120 1 <NA>
#> 2 XYZ 1 WEEK 1 2 ABC 113 2 <NA>
#> 3 XYZ 1 WEEK 1 2 ABC 117 3 <NA>
#> 4 XYZ 2 BASELINE 1 ABC 101 1 <NA>
#> 5 XYZ 2 WEEK 1 2 ABC 101 2 <NA>
#> 6 XYZ 2 WEEK 2 3 ABC 95 3 <NA>
#> 7 XYZ 1 BASELINE 1 DEF 17 1 <NA>
#> 8 XYZ 1 WEEK 1 2 DEF NA 2 <NA>
#> 9 XYZ 1 WEEK 1 2 DEF 13 3 <NA>
#> 10 XYZ 2 BASELINE 1 DEF 9 1 <NA>
#> 11 XYZ 2 WEEK 1 2 DEF 10 2 <NA>
#> 12 XYZ 2 WEEK 2 3 DEF 12 3 <NA>
#> 13 XYZ 1 PBL WORST 96 ABC 117 3 WOC
#> 14 XYZ 1 PBL WORST 96 DEF 13 3 WOC
#> 15 XYZ 2 PBL WORST 96 ABC 101 2 WOC
#> 16 XYZ 2 PBL WORST 96 DEF 10 2 WOC
Derive a parameter for the first disease progression (PD)
For each subject in the ADSL
dataset a new parameter should be added
to the input dataset which indicates whether disease progression (PD)
occurred (set AVALC = "Y"
, AVAL = 1
) or not (set AVALC = "N"
, AVAL = 0
). For the new parameter set PARAMCD = "PD"
and PARAM = "Disease Progression"
.
adsl <- tribble(
~USUBJID, ~DTHDT,
"1", ymd("2022-05-13"),
"2", ymd(""),
"3", ymd("")
)
mutate(STUDYID = "XX1234")
adrs <- tribble(
~USUBJID, ~RSDTC, ~AVALC, ~AVAL,
"1", "2020-01-02", "PR", 2,
"1", "2020-02-01", "CR", 1,
"1", "2020-03-01", "CR", 1,
"2", "2021-06-15", "SD", 3,
"2", "2021-07-16", "PD", 4,
"2", "2021-09-14", "PD", 4
)
mutate(
STUDYID = "XX1234", .before = USUBJID
)
mutate(
ADT = ymd(RSDTC),
PARAMCD = "OVR",
PARAM = "Overall Response",
.after = RSDTC
)
derive_extreme_records(
adrs,
dataset_ref = adsl,
dataset_add = adrs,
by_vars = get_admiral_option("subject_keys"),
filter_add = PARAMCD == "OVR" & AVALC == "PD",
order = exprs(ADT),
exist_flag = AVALC,
true_value = "Y",
false_value = "N",
mode = "first",
set_values_to = exprs(
PARAMCD = "PD",
PARAM = "Disease Progression",
AVAL = yn_to_numeric(AVALC),
)
)
#> # A tibble: 9 × 8
#> STUDYID USUBJID RSDTC ADT PARAMCD PARAM AVALC AVAL
#> <chr> <chr> <chr> <date> <chr> <chr> <chr> <dbl>
#> 1 XX1234 1 2020-01-02 2020-01-02 OVR Overall Response PR 2
#> 2 XX1234 1 2020-02-01 2020-02-01 OVR Overall Response CR 1
#> 3 XX1234 1 2020-03-01 2020-03-01 OVR Overall Response CR 1
#> 4 XX1234 2 2021-06-15 2021-06-15 OVR Overall Response SD 3
#> 5 XX1234 2 2021-07-16 2021-07-16 OVR Overall Response PD 4
#> 6 XX1234 2 2021-09-14 2021-09-14 OVR Overall Response PD 4
#> 7 XX1234 2 2021-07-16 2021-07-16 PD Disease Progression Y 1
#> 8 XX1234 1 <NA> NA PD Disease Progression N 0
#> 9 XX1234 3 <NA> NA PD Disease Progression N 0
Derive parameter indicating death
For each subject in the ADSL
dataset a new parameter should be
created which indicates whether the subject died (set AVALC = "Y"
, AVAL = 1
) or not (set AVALC = "N"
, AVAL = 0
). For the new parameter set
PARAMCD = "DEATH"
, PARAM = "Death"
, and ADT
to the date of death
(DTHDT
).
derive_extreme_records(
dataset_ref = adsl,
dataset_add = adsl,
by_vars = exprs(STUDYID, USUBJID),
filter_add = !is.na(DTHDT),
exist_flag = AVALC,
true_value = "Y",
false_value = "N",
mode = "first",
keep_source_vars = exprs(AVALC),
set_values_to = exprs(
PARAMCD = "DEATH",
PARAM = "Death",
ADT = DTHDT
)
)
#> # A tibble: 3 × 6
#> STUDYID USUBJID PARAMCD PARAM ADT AVALC
#> <chr> <chr> <chr> <chr> <date> <chr>
#> 1 XX1234 1 DEATH Death 2022-05-13 Y
#> 2 XX1234 2 DEATH Death NA N
#> 3 XX1234 3 DEATH Death NA N
The keep_source_vars
argument is specified to avoid that all ADLS
variables (like DTHDT
) are copied to the parameter.