Add a time-to-event parameter to the input dataset.
Usage
derive_param_tte(
dataset = NULL,
dataset_adsl,
source_datasets,
by_vars = NULL,
start_date = TRTSDT,
end_dates = NULL,
event_conditions,
censor_conditions = NULL,
event_type = "negative",
create_datetime = FALSE,
set_values_to,
subject_keys = get_admiral_option("subject_keys"),
check_type = "warning"
)
Arguments
- dataset
-
Input dataset
PARAMCD is expected.
- Permitted values
a dataset, i.e., a data.frame or tibble
- Default value
NULL
- dataset_adsl
-
ADSL input dataset
The variables specified for start_date and subject_keys are expected.
- Permitted values
a dataset, i.e., a data.frame or tibble
- Default value
none
- source_datasets
-
Source datasets
A named list of datasets is expected. The dataset_name field of tte_source() refers to the dataset provided in the list.
- Permitted values
named list of datasets, e.g., list(adsl = adsl, ae = ae)
- Default value
none
- by_vars
-
By variables
If the parameter is specified, separate time to event parameters are derived for each by group.
The by variables must be in at least one of the source datasets. Each source dataset must contain either all by variables or none of the by variables.
The by variables are not included in the output dataset.
- Permitted values
list of variables created by exprs(), e.g., exprs(USUBJID, VISIT)
- Default value
NULL
- start_date
-
Time to event origin date
The variable STARTDT is set to the specified date. The value is taken from the ADSL dataset.
If the event or censoring date is before the origin date, ADT is set to the origin date.
- Permitted values
a date or datetime variable
- Default value
TRTSDT
- end_dates
-
Time to event end date(s)
A list of censor_source() objects is expected. Each date restricts the observation period for time-to-event analysis. For each subject the earliest date across all end_dates is used as the end date for that subject. The records defined by event_conditions and censor_conditions are restricted to dates before or equal to the selected end date.
This argument should be specified if there is more than one reason for stopping the observation of a subject, e.g., end of study, death, or intercurrent events like start of new drug. If there is only one reason for stopping the observation, it is sufficient to just include this as a censoring condition in censor_conditions.
See the "Differentiating censoring reasons" and "Differentiating censoring date description" examples to see how the values (set_values_to) set by end dates interact with the values set by the censoring conditions.
- Permitted values
a list of source objects, e.g., list(pd, death)
- Default value
NULL
- event_conditions
-
Sources and conditions defining events
A list of event_source() objects is expected.
- Permitted values
a list of source objects, e.g., list(pd, death)
- Default value
none
- censor_conditions
-
Sources and conditions defining censorings
A list of censor_source() objects is expected. Each record defined by the censor_conditions should be a possible censoring time, i.e., at this time it should be known that the event of interest has not yet occurred. Assessments for which it is not known whether the event of interest has occurred should not be included as censoring times, e.g., when an assessment was not evaluable.
It is acceptable to include records with an event as long as these are also included in event_conditions, because events take precedence over censorings.
- Permitted values
a list of source objects, e.g., list(pd, death)
- Default value
NULL
- event_type
-
Type of event
For events that are considered unfavorable, e.g., adverse events, progression, worsening, etc., the value should be "negative" and for events that are considered favorable, e.g., response to treatment, improvement, etc., the value should be "positive".
If event_type is specified as "positive", the objects specified for end_dates are added to the censoring conditions (censor_conditions). I.e., if a subject is censored, it is censored at the earliest date provided by end_dates.
- Permitted values
"negative", "positive"
- Default value
"negative"
- create_datetime
-
Create datetime variables?
If set to TRUE, variables ADTM and STARTDTM are created. Otherwise, variables ADT and STARTDT are created.
- Permitted values
TRUE, FALSE
- Default value
FALSE
- set_values_to
-
Variables to set
A named list returned by exprs() defining the variables to be set for the new parameter, e.g. exprs(PARAMCD = "OS", PARAM = "Overall Survival") is expected. The values must be symbols, character strings, numeric values, expressions, or NA.
- Permitted values
list of named expressions created by a formula using exprs(), e.g., exprs(AVALC = VSSTRESC, AVAL = yn_to_numeric(AVALC))
- Default value
none
- subject_keys
-
Variables to uniquely identify a subject
A list of symbols created using exprs() is expected.
- Permitted values
list of variables created by exprs(), e.g., exprs(USUBJID, VISIT)
- Default value
get_admiral_option("subject_keys")
- check_type
-
Check uniqueness
If "warning", "message", or "error" is specified, the specified message is issued if the observations of the source datasets are not unique with respect to the by variables and the date and order specified in the event_source() and censor_source() objects.
- Permitted values
"none", "message", "warning", "error"
- Default value
"warning"
Details
The following steps are performed to create the observations of the new parameter:
Deriving the events:
For each event source dataset the observations as specified by the filter element are selected. If the end_dates argument is specified, records after the first of the end dates are excluded. Then for each subject the first observation (with respect to date and order) is selected.
The ADT variable is set to the variable specified by the date element. If the date variable is a datetime variable, only the date part is copied.
The CNSR variable is added and set to the censor element.
The variables specified by the set_values_to element are added.
The selected observations of all event source datasets are combined into a single dataset.
For each subject the first observation (with respect to the ADT/ADTM variable) from the single dataset is selected. If there is more than one event with the same date, the first event with respect to the order of events in event_conditions is selected.
Deriving the censoring observations:
For each censoring source dataset:
The observations as specified by the filter element are selected.
If the end_dates argument is specified, records after the first of the end dates are excluded and the variables defined by the set_values_to element of the first end date are added.
If event_type = "positive" and the end_dates argument is specified, new records with the first end date and the variables defined by the set_values_to element are added. (These will be selected in the next step, i.e., for positive events the first end date is used as censoring date.)
Then for each subject the last observation (with respect to date and order) is selected.
The ADT variable is set to the variable specified by the date element. If the date variable is a datetime variable, only the date part is copied.
The CNSR variable is added. If the end_dates argument is specified and the consider_end_dates element is TRUE, it is set to the censor value of the first of the end dates. Otherwise, it is set to the censor element.
The variables specified by the set_values_to element are added.
The selected observations of all censoring source datasets are combined into a single dataset.
For each subject the last observation (with respect to the ADT/ADTM variable) from the single dataset is selected. If there is more than one censoring with the same date, the last censoring with respect to the order of censorings in censor_conditions is selected.
For each subject (as defined by the subject_keys parameter) an
observation is selected. If an event is available, the event observation is
selected. Otherwise the censoring observation is selected.
Finally:
The variable specified for start_date is joined from the ADSL dataset. Only subjects in both datasets are kept, i.e., subjects with both an event or censoring and an observation in dataset_adsl.
The variables as defined by the set_values_to parameter are added.
The ADT/ADTM variable is set to the maximum of ADT/ADTM and STARTDT/STARTDTM (depending on the create_datetime parameter).
The new observations are added to the output dataset.
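The selection rules above can be sketched in base R. This is only an illustration of the described logic under simplifying assumptions (one event source, one censoring source, no end_dates, no by variables); it is not the code used by derive_param_tte(), and the data frames start, events and censorings are made up for the sketch.

```r
# Illustrative base R sketch of the selection rules; not admiral's implementation.
start <- data.frame(
  USUBJID = c("01", "02"),
  STARTDT = as.Date(c("2020-12-06", "2021-01-16"))
)
events <- data.frame(
  USUBJID = c("01", "01"),
  ADT = as.Date(c("2021-03-04", "2021-01-03")),
  CNSR = 0L
)
censorings <- data.frame(
  USUBJID = c("01", "02"),
  ADT = as.Date(c("2021-03-06", "2021-02-03")),
  CNSR = 1L
)

# First event per subject: sort by date and keep the first row of each subject
events <- events[order(events$USUBJID, events$ADT), ]
first_event <- events[!duplicated(events$USUBJID), ]

# Last censoring per subject: keep the last row of each subject
censorings <- censorings[order(censorings$USUBJID, censorings$ADT), ]
last_censor <- censorings[!duplicated(censorings$USUBJID, fromLast = TRUE), ]

# Events take precedence: censorings are used only for subjects without an event
tte <- rbind(
  first_event,
  last_censor[!last_censor$USUBJID %in% first_event$USUBJID, ]
)

# Join the start date; subjects missing from either dataset are dropped
tte <- merge(tte, start, by = "USUBJID")

# ADT must not be before the start date
tte$ADT <- pmax(tte$ADT, tte$STARTDT)
tte
```

In this sketch subject 01 gets the event on 2021-01-03 (CNSR = 0) and subject 02 is censored on 2021-02-03 (CNSR = 1).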
Examples
Add a basic time to event parameter
For each subject the time to first adverse event should be created as a parameter.
The event source object is created using event_source() and the date is set to adverse event start date.
The censor source object is created using censor_source() and the date is set to end of study date.
The event and censor source objects are then passed to derive_param_tte() to derive the time to event parameter with the provided parameter descriptions (PARAMCD and PARAM).
Note the values of the censor variable (CNSR) that are derived below, where the first subject has an event and the second does not.
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
adsl <- tribble(
~USUBJID, ~TRTSDT, ~EOSDT, ~NEWDRGDT,
"01", ymd("2020-12-06"), ymd("2021-03-06"), NA,
"02", ymd("2021-01-16"), ymd("2021-02-03"), ymd("2021-01-03")
) %>%
mutate(STUDYID = "AB42")
adae <- tribble(
~USUBJID, ~ASTDT, ~AESEQ, ~AEDECOD,
"01", ymd("2021-01-03"), 1, "Flu",
"01", ymd("2021-03-04"), 2, "Cough",
"01", ymd("2021-03-05"), 3, "Cough"
) %>%
mutate(STUDYID = "AB42")
ttae <- event_source(
dataset_name = "adae",
date = ASTDT,
set_values_to = exprs(
EVNTDESC = "AE",
SRCDOM = "ADAE",
SRCVAR = "ASTDT",
SRCSEQ = AESEQ
)
)
eos <- censor_source(
dataset_name = "adsl",
date = EOSDT,
set_values_to = exprs(
EVNTDESC = "END OF STUDY",
SRCDOM = "ADSL",
SRCVAR = "EOSDT"
)
)
derive_param_tte(
dataset_adsl = adsl,
event_conditions = list(ttae),
censor_conditions = list(eos),
source_datasets = list(adsl = adsl, adae = adae),
set_values_to = exprs(
PARAMCD = "TTAE",
PARAM = "Time to First Adverse Event"
)
) %>%
select(USUBJID, STARTDT, PARAMCD, PARAM, ADT, CNSR, SRCSEQ)
#> # A tibble: 2 × 7
#> USUBJID STARTDT PARAMCD PARAM ADT CNSR SRCSEQ
#> <chr> <date> <chr> <chr> <date> <int> <dbl>
#> 1 01 2020-12-06 TTAE Time to First Adverse Event 2021-01-03 0 1
#> 2 02 2021-01-16 TTAE Time to First Adverse Event 2021-02-03 1 NA
Adding a by variable (by_vars)
By variables can be added using the by_vars argument, e.g., now for
each subject the time to first occurrence of each adverse event preferred
term (AEDECOD) should be created as parameters.
Please note that CDISC requires separate parameters (PARAMCD, PARAM) for
the by groups. Therefore the variables specified for the by_vars parameter
are not included in the output dataset. The PARAMCD variable should be
specified for the set_values_to parameter using an expression on the right-hand
side which results in a unique value for each by group. If the values of
the by variables should be included in the output dataset, they can be stored
in PARCATy variables.
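As a small base R illustration of what such a right-hand-side expression evaluates to, the snippet below applies the expressions used in this example to a plain character vector. It assumes the three AEDECOD values of the example data; the numbering comes from the alphabetical order of the factor levels, so it depends on which terms are present in the data.

```r
# AEDECOD values as in the example adae dataset
AEDECOD <- c("Flu", "Cough", "Cough")

# Factor levels are sorted alphabetically: "Cough" = 1, "Flu" = 2
PARAMCD <- paste0("TTAE", as.numeric(as.factor(AEDECOD)))
PARAM <- paste("Time to First", AEDECOD, "Adverse Event")

data.frame(AEDECOD, PARAMCD, PARAM)
```

Here "Cough" maps to TTAE1 and "Flu" to TTAE2. If parameter codes must stay stable when new terms appear, an explicit lookup may be preferable to this numbering.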
derive_param_tte(
dataset_adsl = adsl,
by_vars = exprs(AEDECOD),
event_conditions = list(ttae),
censor_conditions = list(eos),
source_datasets = list(adsl = adsl, adae = adae),
set_values_to = exprs(
PARAMCD = paste0("TTAE", as.numeric(as.factor(AEDECOD))),
PARAM = paste("Time to First", AEDECOD, "Adverse Event")
)
) %>%
select(USUBJID, STARTDT, PARAMCD, PARAM, ADT, CNSR, SRCSEQ)
#> # A tibble: 4 × 7
#> USUBJID STARTDT PARAMCD PARAM ADT CNSR SRCSEQ
#> <chr> <date> <chr> <chr> <date> <int> <dbl>
#> 1 01 2020-12-06 TTAE1 Time to First Cough Advers… 2021-03-04 0 2
#> 2 01 2020-12-06 TTAE2 Time to First Flu Adverse … 2021-01-03 0 1
#> 3 02 2021-01-16 TTAE1 Time to First Cough Advers… 2021-02-03 1 NA
#> 4 02 2021-01-16 TTAE2 Time to First Flu Adverse … 2021-02-03 1 NA
Handling duplicates (check_type)
The source records are checked regarding duplicates with respect to the
by variables and the date and order specified in the source objects.
By default, a warning is issued if any duplicates are found.
Here a new adverse event dataset containing a duplicate date for "Cough" is
created and passed to the function via the source_datasets argument
(see adae = adae_dup below).
adae_dup <- tribble(
~USUBJID, ~ASTDT, ~AESEQ, ~AEDECOD, ~AESER,
"01", ymd("2021-01-03"), 1, "Flu", "Y",
"01", ymd("2021-03-04"), 2, "Cough", "N",
"01", ymd("2021-03-04"), 3, "Cough", "Y"
) %>%
mutate(STUDYID = "AB42")
derive_param_tte(
dataset_adsl = adsl,
by_vars = exprs(AEDECOD),
start_date = TRTSDT,
source_datasets = list(adsl = adsl, adae = adae_dup),
event_conditions = list(ttae),
censor_conditions = list(eos),
set_values_to = exprs(
PARAMCD = paste0("TTAE", as.numeric(as.factor(AEDECOD))),
PARAM = paste("Time to First", AEDECOD, "Adverse Event")
)
)
#> # A tibble: 4 × 11
#> USUBJID STUDYID EVNTDESC SRCDOM SRCVAR SRCSEQ CNSR ADT STARTDT
#> <chr> <chr> <chr> <chr> <chr> <dbl> <int> <date> <date>
#> 1 01 AB42 AE ADAE ASTDT 2 0 2021-03-04 2020-12-06
#> 2 01 AB42 AE ADAE ASTDT 1 0 2021-01-03 2020-12-06
#> 3 02 AB42 END OF STUDY ADSL EOSDT NA 1 2021-02-03 2021-01-16
#> 4 02 AB42 END OF STUDY ADSL EOSDT NA 1 2021-02-03 2021-01-16
#> # i 2 more variables: PARAMCD <chr>, PARAM <chr>
#> Warning: Dataset "adae" contains duplicate records with respect to `STUDYID`, `USUBJID`,
#> `AEDECOD`, and `ASTDT`
#> i Run `admiral::get_duplicates_dataset()` to access the duplicate records
For investigating the issue, the dataset of the duplicate source records
can be obtained by calling get_duplicates_dataset():
get_duplicates_dataset()
#> Duplicate records with respect to `STUDYID`, `USUBJID`, `AEDECOD`, and `ASTDT`.
#> # A tibble: 2 × 6
#> STUDYID USUBJID AEDECOD ASTDT AESEQ AESER
#> * <chr> <chr> <chr> <date> <dbl> <chr>
#> 1 AB42 01 Cough 2021-03-04 2 N
#> 2 AB42 01 Cough 2021-03-04 3 Y
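The idea behind the check can be sketched in base R: records are duplicates if they share the same values for the subject keys, the by variables, and the date. This is a simplified illustration, not the check admiral performs internally; the data frame mirrors adae_dup without STUDYID and AESER.

```r
adae_dup <- data.frame(
  USUBJID = c("01", "01", "01"),
  ASTDT = as.Date(c("2021-01-03", "2021-03-04", "2021-03-04")),
  AESEQ = c(1, 2, 3),
  AEDECOD = c("Flu", "Cough", "Cough")
)

# Flag every record whose key combination occurs more than once
keys <- adae_dup[c("USUBJID", "AEDECOD", "ASTDT")]
is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
adae_dup[is_dup, ]

# With AESEQ added to the key, the records become unique
# (anyDuplicated() returns 0 if there are no duplicates)
keys_ordered <- adae_dup[c("USUBJID", "AEDECOD", "ASTDT", "AESEQ")]
anyDuplicated(keys_ordered)
```

The second part shows why adding a variable such as AESEQ to the ordering removes the ambiguity about which record is selected.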
Common options to solve the issue:
Restricting the source records by specifying/updating the filter argument in the event_source()/censor_source() calls.
Specifying additional variables for order in the event_source()/censor_source() calls.
Setting check_type = "none" in the derive_param_tte() call to ignore any duplicates.
In this example it does not have significant impact which record is chosen as
the dates are the same so the time to event derivation will be the same, but
it does impact SRCSEQ in the output dataset, so here the second option is
used.
Note here how you can also define source objects from within the derive_param_tte()
function call itself.
derive_param_tte(
dataset_adsl = adsl,
by_vars = exprs(AEDECOD),
start_date = TRTSDT,
source_datasets = list(adsl = adsl, adae = adae_dup),
event_conditions = list(event_source(
dataset_name = "adae",
date = ASTDT,
set_values_to = exprs(
EVNTDESC = "AE",
SRCDOM = "ADAE",
SRCVAR = "ASTDT",
SRCSEQ = AESEQ
),
order = exprs(AESEQ)
)),
censor_conditions = list(eos),
set_values_to = exprs(
PARAMCD = paste0("TTAE", as.numeric(as.factor(AEDECOD))),
PARAM = paste("Time to First", AEDECOD, "Adverse Event")
)
) %>%
select(USUBJID, STARTDT, PARAMCD, PARAM, ADT, CNSR, SRCSEQ)
#> # A tibble: 4 × 7
#> USUBJID STARTDT PARAMCD PARAM ADT CNSR SRCSEQ
#> <chr> <date> <chr> <chr> <date> <int> <dbl>
#> 1 01 2020-12-06 TTAE1 Time to First Cough Advers… 2021-03-04 0 2
#> 2 01 2020-12-06 TTAE2 Time to First Flu Adverse … 2021-01-03 0 1
#> 3 02 2021-01-16 TTAE1 Time to First Cough Advers… 2021-02-03 1 NA
#> 4 02 2021-01-16 TTAE2 Time to First Flu Adverse … 2021-02-03 1 NA
Filtering source records (filter)
The first option from above could have been achieved using filter, for
example here only using serious adverse events.
derive_param_tte(
dataset_adsl = adsl,
by_vars = exprs(AEDECOD),
start_date = TRTSDT,
source_datasets = list(adsl = adsl, adae = adae_dup),
event_conditions = list(event_source(
dataset_name = "adae",
filter = AESER == "Y",
date = ASTDT,
set_values_to = exprs(
EVNTDESC = "Serious AE",
SRCDOM = "ADAE",
SRCVAR = "ASTDT",
SRCSEQ = AESEQ
)
)),
censor_conditions = list(eos),
set_values_to = exprs(
PARAMCD = paste0("TTSAE", as.numeric(as.factor(AEDECOD))),
PARAM = paste("Time to First Serious", AEDECOD, "Adverse Event")
)
) %>%
select(USUBJID, STARTDT, PARAMCD, PARAM, ADT, CNSR, SRCSEQ)
#> # A tibble: 4 × 7
#> USUBJID STARTDT PARAMCD PARAM ADT CNSR SRCSEQ
#> <chr> <date> <chr> <chr> <date> <int> <dbl>
#> 1 01 2020-12-06 TTSAE1 Time to First Serious Coug… 2021-03-04 0 3
#> 2 01 2020-12-06 TTSAE2 Time to First Serious Flu … 2021-01-03 0 1
#> 3 02 2021-01-16 TTSAE1 Time to First Serious Coug… 2021-02-03 1 NA
#> 4 02 2021-01-16 TTSAE2 Time to First Serious Flu … 2021-02-03 1 NA
Using multiple event/censor conditions (event_conditions/censor_conditions)
In the above examples, we only have a single event and single censor
condition. Here, we now consider multiple conditions for each passed using
event_conditions and censor_conditions.
For the event we are going to use first AE and additionally check a lab condition, and for the censor we'll add in treatment start date in case end of study date was ever missing.
adlb <- tribble(
~USUBJID, ~ADT, ~PARAMCD, ~ANRIND,
"01", ymd("2020-12-22"), "HGB", "LOW"
) %>%
mutate(STUDYID = "AB42")
low_hgb <- event_source(
dataset_name = "adlb",
filter = PARAMCD == "HGB" & ANRIND == "LOW",
date = ADT,
set_values_to = exprs(
EVNTDESC = "POSSIBLE ANEMIA",
SRCDOM = "ADLB",
SRCVAR = "ADT"
)
)
trt_start <- censor_source(
dataset_name = "adsl",
date = TRTSDT,
set_values_to = exprs(
EVNTDESC = "TREATMENT START",
SRCDOM = "ADSL",
SRCVAR = "TRTSDT"
)
)
derive_param_tte(
dataset_adsl = adsl,
event_conditions = list(ttae, low_hgb),
censor_conditions = list(eos, trt_start),
source_datasets = list(adsl = adsl, adae = adae, adlb = adlb),
set_values_to = exprs(
PARAMCD = "TTAELB",
PARAM = "Time to First Adverse Event or Possible Anemia (Labs)"
)
) %>%
select(USUBJID, STARTDT, PARAMCD, PARAM, ADT, CNSR, SRCSEQ)
#> # A tibble: 2 × 7
#> USUBJID STARTDT PARAMCD PARAM ADT CNSR SRCSEQ
#> <chr> <date> <chr> <chr> <date> <int> <dbl>
#> 1 01 2020-12-06 TTAELB Time to First Adverse Even… 2020-12-22 0 NA
#> 2 02 2021-01-16 TTAELB Time to First Adverse Even… 2021-02-03 1 NA
Note above that the earliest event date and the latest censoring date are always taken.
End of the observation period (end_dates)
The end_dates argument can be used to specify the end of the
observation period if there is more than one date which restricts the
observation period, e.g., end of study date and intercurrent events like
new drug date. The earliest date is used as the end of the observation
period and events/censorings occurring after this date are not considered.
In the example two censor_source() objects are defined, eos and newdrg,
for end of study date and new drug date, respectively, and then passed to the
end_dates argument.
adsl <- tribble(
~USUBJID, ~TRTSDT, ~EOSDT, ~NEWDRGDT,
"01", ymd("2020-12-06"), ymd("2021-03-06"), NA,
"02", ymd("2021-01-16"), ymd("2021-04-03"), ymd("2021-03-21"),
"03", ymd("2021-02-01"), NA, NA,
"04", ymd("2021-03-10"), NA, NA
) %>%
mutate(STUDYID = "AB42")
adqs <- tribble(
~USUBJID, ~ADT, ~CHG,
"01", ymd("2021-01-03"), 5,
"01", ymd("2021-02-03"), -2,
"01", ymd("2021-03-01"), NA,
"01", ymd("2021-03-07"), 10,
"02", ymd("2021-01-03"), 4,
"02", ymd("2021-02-03"), -1,
"02", ymd("2021-04-01"), -12,
"03", ymd("2021-02-15"), 3,
"03", ymd("2021-03-15"), -15
) %>%
mutate(STUDYID = "AB42")
eos <- censor_source(
dataset_name = "adsl",
date = EOSDT
)
newdrg <- censor_source(
dataset_name = "adsl",
date = NEWDRGDT
)
worsening <- event_source(
dataset_name = "adqs",
date = ADT,
filter = CHG <= -10
)
valid_assessment <- censor_source(
dataset_name = "adqs",
date = ADT,
filter = !is.na(CHG)
)
no_assessment <- censor_source(
dataset_name = "adsl",
date = TRTSDT
)
derive_param_tte(
dataset_adsl = adsl,
source_datasets = list(adsl = adsl, adqs = adqs),
start_date = TRTSDT,
end_dates = list(eos, newdrg),
event_conditions = list(worsening),
censor_conditions = list(valid_assessment, no_assessment),
set_values_to = exprs(PARAMCD = "TTWORSE")
) %>%
select(-STUDYID, -PARAMCD) %>%
derive_vars_merged(
dataset_add = adsl,
by_vars = exprs(USUBJID),
new_vars = exprs(EOSDT, NEWDRGDT)
)
#> # A tibble: 4 × 6
#> USUBJID ADT CNSR STARTDT EOSDT NEWDRGDT
#> <chr> <date> <int> <date> <date> <date>
#> 1 01 2021-02-03 1 2020-12-06 2021-03-06 NA
#> 2 02 2021-02-03 1 2021-01-16 2021-04-03 2021-03-21
#> 3 03 2021-03-15 0 2021-02-01 NA NA
#> 4 04 2021-03-10 1 2021-03-10 NA NA
Please note that:
subject 02 has no event because the assessment with CHG = -12 was excluded as it is after the start of a new drug.
subjects 01 and 02 are censored at the last valid assessment before the end of the observation period.
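The restriction for subject 02 can be reproduced with a short base R sketch. It only illustrates the described logic (earliest end date, records after it excluded) and is not admiral's implementation; ENDDT is a helper column made up for the sketch.

```r
adsl <- data.frame(
  USUBJID = "02",
  EOSDT = as.Date("2021-04-03"),
  NEWDRGDT = as.Date("2021-03-21")
)
adqs <- data.frame(
  USUBJID = "02",
  ADT = as.Date(c("2021-01-03", "2021-02-03", "2021-04-01")),
  CHG = c(4, -1, -12)
)

# Earliest non-missing end date per subject (ENDDT is a hypothetical helper)
adsl$ENDDT <- pmin(adsl$EOSDT, adsl$NEWDRGDT, na.rm = TRUE)

# Keep only assessments on or before the end of the observation period
in_period <- merge(adqs, adsl[c("USUBJID", "ENDDT")], by = "USUBJID")
in_period <- in_period[in_period$ADT <= in_period$ENDDT, ]

# No worsening (CHG <= -10) remains, so the subject has no event
has_event <- any(in_period$CHG <= -10)

# Censoring date: last valid assessment within the observation period
censor_date <- max(in_period$ADT[!is.na(in_period$CHG)])
```

The assessment on 2021-04-01 falls after the new drug date (2021-03-21) and is dropped, so has_event is FALSE and censor_date is 2021-02-03, matching the output above.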
Positive event (event_type)
If positive events like response or improvement are analyzed,
event_type = "positive" should be used. Subjects without events are
censored at the end of the observation period (defined by end_dates)
instead of the last assessment. For positive events this is the more
conservative approach.
adsl <- tribble(
~USUBJID, ~TRTSDT, ~EOSDT, ~NEWDRGDT,
"01", ymd("2020-12-06"), ymd("2021-03-06"), NA,
"02", ymd("2021-01-16"), ymd("2021-04-03"), ymd("2021-03-21"),
"03", ymd("2021-02-01"), NA, NA
) %>%
mutate(STUDYID = "AB42")
adqs <- tribble(
~USUBJID, ~ADT, ~CHG,
"01", ymd("2021-01-03"), 5,
"01", ymd("2021-02-03"), -2,
"01", ymd("2021-03-01"), NA,
"01", ymd("2021-03-07"), 10,
"02", ymd("2021-01-03"), 4,
"02", ymd("2021-02-03"), -1,
"02", ymd("2021-04-01"), -12,
"03", ymd("2021-02-15"), 3,
"03", ymd("2021-03-15"), 15
) %>%
mutate(STUDYID = "AB42")
eos <- censor_source(
dataset_name = "adsl",
date = EOSDT
)
newdrg <- censor_source(
dataset_name = "adsl",
date = NEWDRGDT
)
improvement <- event_source(
dataset_name = "adqs",
date = ADT,
filter = CHG >= 10
)
valid_assessment <- censor_source(
dataset_name = "adqs",
date = ADT,
filter = !is.na(CHG)
)
derive_param_tte(
dataset_adsl = adsl,
source_datasets = list(adsl = adsl, adqs = adqs),
start_date = TRTSDT,
end_dates = list(eos, newdrg),
event_conditions = list(improvement),
censor_conditions = list(valid_assessment),
event_type = "positive",
set_values_to = exprs(PARAMCD = "TTIMPROV")
) %>%
select(-STUDYID) %>%
derive_vars_merged(
dataset_add = adsl,
by_vars = exprs(USUBJID),
new_vars = exprs(EOSDT, NEWDRGDT)
)
#> # A tibble: 3 × 7
#> USUBJID ADT CNSR STARTDT PARAMCD EOSDT NEWDRGDT
#> <chr> <date> <int> <date> <chr> <date> <date>
#> 1 01 2021-03-06 1 2020-12-06 TTIMPROV 2021-03-06 NA
#> 2 02 2021-03-21 1 2021-01-16 TTIMPROV 2021-04-03 2021-03-21
#> 3 03 2021-03-15 0 2021-02-01 TTIMPROV NA NA
Please note that subjects 01 and 02 are censored at the end of the
observation period instead of at the last assessment.
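The difference between the two event types for censored subjects can be sketched in base R. This is a simplified illustration using the dates of subjects 01 and 02 from this example; censor_date() is a helper defined only for this sketch, not an admiral function.

```r
adsl <- data.frame(
  USUBJID = c("01", "02"),
  EOSDT = as.Date(c("2021-03-06", "2021-04-03")),
  NEWDRGDT = as.Date(c(NA, "2021-03-21"))
)

# Last valid assessment within the observation period, per subject
last_assessment <- as.Date(c("2021-02-03", "2021-02-03"))

# End of the observation period: earliest non-missing end date
end_of_period <- pmin(adsl$EOSDT, adsl$NEWDRGDT, na.rm = TRUE)

# Hypothetical helper: censoring date depending on the event type
censor_date <- function(event_type) {
  if (event_type == "positive") end_of_period else last_assessment
}

censor_date("negative")
censor_date("positive")
```

For positive events the later date is used, so subjects remain at risk of the favorable event for longer, which is why this is the more conservative choice.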
Differentiating censoring reasons
There are three ADaM variables that can be used to differentiate censoring reasons:
CNSR: different values > 0 can be used to differentiate censoring reasons, e.g., 1 for end of study, 2 for new drug, etc.
EVNTDESC: description of the event or censoring, e.g., "END OF STUDY".
CNSDTDSC: description of the date used for censoring when different from the censoring event, e.g., "LAST ASSESSMENT" if the censoring event is the end of the study but the censoring date is the last assessment date before the end of the study rather than the end of study date.
In the example five censoring events are considered:
end of study (eos, valid_assessment)
start of a new drug (newdrg, valid_assessment)
no post-baseline assessments (no_post_baseline)
no baseline assessment (no_baseline)
no assessments (no_assessment)
In the example data, the first five subjects have the five censoring events, respectively, and the sixth subject has an event.
adsl <- tribble(
~USUBJID, ~TRTSDT, ~EOSDT, ~NEWDRGDT,
"01", ymd("2020-12-06"), ymd("2021-03-06"), NA,
"02", ymd("2021-01-16"), ymd("2021-04-03"), ymd("2021-03-21"),
"03", ymd("2021-03-10"), NA, NA,
"04", ymd("2021-04-02"), NA, NA,
"05", ymd("2021-05-09"), NA, NA,
"06", ymd("2021-02-01"), NA, NA
) %>%
mutate(STUDYID = "AB42")
adqs <- tribble(
~USUBJID, ~ADT, ~CHG, ~ABLFL,
"01", ymd("2020-12-06"), 0, "Y",
"01", ymd("2021-02-03"), -2, NA,
"01", ymd("2021-03-01"), NA, NA,
"01", ymd("2021-03-07"), 10, NA,
"02", ymd("2021-01-16"), 0, "Y",
"02", ymd("2021-02-03"), -1, NA,
"02", ymd("2021-04-01"), -12, NA,
"03", ymd("2021-03-20"), NA, NA,
"03", ymd("2021-04-07"), NA, NA,
"04", ymd("2021-04-02"), 0, "Y",
"06", ymd("2021-02-01"), 0, "Y",
"06", ymd("2021-03-15"), -15, NA
) %>%
mutate(STUDYID = "AB42") %>%
derive_vars_merged(
dataset_add = adsl,
by_vars = exprs(USUBJID),
new_vars = exprs(TRTSDT)
)
The eos and newdrg censoring events define the end dates.
eos <- censor_source(
dataset_name = "adsl",
date = EOSDT,
censor = 1,
set_values_to = exprs(
EVNTDESC = "END OF STUDY"
)
)
newdrg <- censor_source(
dataset_name = "adsl",
date = NEWDRGDT,
censor = 2,
set_values_to = exprs(
EVNTDESC = "NEW DRUG"
)
)
The worsening and valid_assessment events define which assessments
are events and which are valid assessments, respectively. The EVNTDESC
variable is not set by valid_assessment as its value is retrieved from
the eos or newdrg censoring events.
worsening <- event_source(
dataset_name = "adqs",
date = ADT,
filter = CHG <= -10,
set_values_to = exprs(
EVNTDESC = "WORSENING",
SRCDOM = "ADQS",
SRCVAR = "ADT"
)
)
valid_assessment <- censor_source(
dataset_name = "adqs",
date = ADT,
filter = !is.na(CHG),
set_values_to = exprs(
CNSDTDSC = "LAST ASSESSMENT",
SRCDOM = "ADQS",
SRCVAR = "ADT"
)
)
The no_baseline, no_post_baseline, and no_assessment censoring
events are used to censor subjects without valid assessments at day one
(TRTSDT). Three events are defined to distinguish the reasons for not
having valid assessments. The consider_end_dates = FALSE argument is used
for these censoring events so that the EVNTDESC and CNSR values from the
end dates (eos and newdrg) are not used.
no_baseline <- censor_source(
dataset_name = "adqs",
date = TRTSDT,
censor = 3,
filter = is.na(ABLFL),
order = exprs(ADT),
consider_end_dates = FALSE,
set_values_to = exprs(
EVNTDESC = "NO BASELINE ASSESSMENT",
CNSDTDSC = "TREATMENT START",
SRCDOM = "ADQS",
SRCVAR = "TRTSDT"
)
)
no_post_baseline <- censor_source(
dataset_name = "adqs",
date = TRTSDT,
censor = 4,
filter = ABLFL == "Y",
order = exprs(ADT),
consider_end_dates = FALSE,
set_values_to = exprs(
EVNTDESC = "NO POST-BASELINE ASSESSMENT",
CNSDTDSC = "TREATMENT START",
SRCDOM = "ADQS",
SRCVAR = "TRTSDT"
)
)
no_assessment <- censor_source(
dataset_name = "adsl",
date = TRTSDT,
censor = 5,
consider_end_dates = FALSE,
set_values_to = exprs(
EVNTDESC = "NO ASSESSMENTS",
CNSDTDSC = "TREATMENT START",
SRCDOM = "ADSL",
SRCVAR = "TRTSDT"
)
)
In the derive_param_tte() function call, the order of the censoring
events in the censor_conditions argument is important. For censoring
events the record with the last date is selected. If there are multiple
records with the same last date, then the last record in the order
specified in the censor_conditions argument is selected. The events
no_assessment, no_post_baseline, and no_baseline all use TRTSDT as
date, i.e., the date is the same for them. Specifying no_assessment
before no_post_baseline ensures that if a subject has no post-baseline
assessments the record from no_post_baseline is used, i.e., EVNTDESC is
set to "NO POST-BASELINE ASSESSMENT".
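The tie-breaking rule can be illustrated with a small base R sketch. It assumes a subject like 04, whose only assessment is the baseline assessment on the treatment start date, so all candidate censoring records share the same date; the position column is a helper made up for the sketch and holds the position of each source in censor_conditions.

```r
# Candidate censoring records for one subject (illustrative data)
candidates <- data.frame(
  source = c("valid_assessment", "no_assessment", "no_post_baseline"),
  position = 1:3, # position of the source in censor_conditions
  ADT = as.Date(rep("2021-04-02", 3)),
  stringsAsFactors = FALSE
)

# Last date first; ties are resolved by the last position in censor_conditions
ordered <- candidates[order(candidates$ADT, candidates$position), ]
selected <- ordered[nrow(ordered), ]
selected$source
```

The record from no_post_baseline is selected because it comes last among the sources with the same date. Listing no_assessment after it would select the no_assessment record instead.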
derive_param_tte(
dataset_adsl = adsl,
source_datasets = list(adsl = adsl, adqs = adqs),
start_date = TRTSDT,
end_dates = list(eos, newdrg),
event_conditions = list(worsening),
censor_conditions = list(valid_assessment, no_assessment, no_post_baseline, no_baseline),
set_values_to = exprs(PARAMCD = "TTWORSE")
) %>%
select(-STUDYID, -PARAMCD)
#> # A tibble: 6 × 8
#> USUBJID ADT EVNTDESC SRCDOM SRCVAR CNSR CNSDTDSC STARTDT
#> <chr> <date> <chr> <chr> <chr> <int> <chr> <date>
#> 1 01 2021-02-03 END OF STUDY ADQS ADT 1 LAST AS… 2020-12-06
#> 2 02 2021-02-03 NEW DRUG ADQS ADT 2 LAST AS… 2021-01-16
#> 3 03 2021-03-10 NO BASELINE ASSESS… ADQS TRTSDT 3 TREATME… 2021-03-10
#> 4 04 2021-04-02 NO POST-BASELINE A… ADQS TRTSDT 4 TREATME… 2021-04-02
#> 5 05 2021-05-09 NO ASSESSMENTS ADSL TRTSDT 5 TREATME… 2021-05-09
#> 6 06 2021-03-15 WORSENING ADQS ADT 0 <NA> 2021-02-01
Differentiating censoring date description
In this example, the event description (EVNTDESC) and the censoring
date description (CNSDTDSC) should distinguish between the different
censoring reasons: end of study, new drug, or just last assessment. If a
variable like CNSDTDSC is set in both the end dates and the censoring
condition, the latter overwrites the former. However, the coalesce()
function can be used to avoid this.
adsl <- tribble(
~USUBJID, ~TRTSDT, ~EOSDT, ~NEWDRGDT,
"01", ymd("2020-12-06"), ymd("2021-03-06"), NA,
"02", ymd("2021-01-16"), ymd("2021-04-03"), ymd("2021-03-21"),
"03", ymd("2021-03-10"), NA, NA,
"04", ymd("2021-04-02"), NA, NA,
"05", ymd("2021-05-09"), ymd("2021-07-30"), NA
) %>%
mutate(STUDYID = "AB42")
adqs <- tribble(
~USUBJID, ~ADT, ~CHG,
"01", ymd("2021-02-03"), -2,
"01", ymd("2021-03-01"), NA,
"01", ymd("2021-03-07"), 10,
"02", ymd("2021-02-03"), -1,
"02", ymd("2021-04-01"), -12,
"03", ymd("2021-03-20"), 2,
"03", ymd("2021-04-07"), 5,
"04", ymd("2021-04-15"), -15,
"05", ymd("2021-06-01"), -13
) %>%
mutate(STUDYID = "AB42") %>%
derive_vars_merged(
dataset_add = adsl,
by_vars = exprs(USUBJID),
new_vars = exprs(TRTSDT)
)
Here two end dates are defined which set both EVNTDESC and CNSDTDSC.
eos <- censor_source(
dataset_name = "adsl",
date = EOSDT,
set_values_to = exprs(
EVNTDESC = "END OF STUDY",
CNSDTDSC = "LAST QA BEFORE EOS"
)
)
newdrg <- censor_source(
dataset_name = "adsl",
date = NEWDRGDT,
set_values_to = exprs(
EVNTDESC = "NEW DRUG",
CNSDTDSC = "LAST QA BEFORE NEW DRUG"
)
)
For the censoring condition (valid_assessment) the coalesce()
function is used to set the descriptions only if they are not already set
by the end dates.
valid_assessment <- censor_source(
dataset_name = "adqs",
date = ADT,
filter = !is.na(CHG),
set_values_to = exprs(
EVNTDESC = coalesce(EVNTDESC, "NO WORSENING"),
CNSDTDSC = coalesce(CNSDTDSC, "LAST QA"),
SRCDOM = "ADQS",
SRCVAR = "ADT"
)
)
worsening <- event_source(
dataset_name = "adqs",
date = ADT,
filter = CHG <= -10,
set_values_to = exprs(
EVNTDESC = "WORSENING",
SRCDOM = "ADQS",
SRCVAR = "ADT"
)
)
derive_param_tte(
dataset_adsl = adsl,
source_datasets = list(adsl = adsl, adqs = adqs),
start_date = TRTSDT,
end_dates = list(eos, newdrg),
event_conditions = list(worsening),
censor_conditions = list(valid_assessment),
set_values_to = exprs(PARAMCD = "TTWORSE")
) %>%
select(-STUDYID, -PARAMCD, -STARTDT, -SRCDOM, -SRCVAR)
#> # A tibble: 5 × 5
#> USUBJID ADT EVNTDESC CNSR CNSDTDSC
#> <chr> <date> <chr> <int> <chr>
#> 1 01 2021-02-03 END OF STUDY 1 LAST QA BEFORE EOS
#> 2 02 2021-02-03 NEW DRUG 1 LAST QA BEFORE NEW DRUG
#> 3 03 2021-04-07 NO WORSENING 1 LAST QA
#> 4 04 2021-04-15 WORSENING 0 <NA>
#> 5 05 2021-06-01 WORSENING 0 <NA>
For subjects 01 and 02 the descriptions from the end dates are
used. As subject 03 has no end dates, the descriptions from the censoring
condition (valid_assessment) are used.
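The effect of coalesce() here is to keep a value that is already set and to fall back to a default only where the value is missing. A base R stand-in makes this explicit; coalesce2() is a helper written for this sketch and is not the dplyr function itself.

```r
# Base R stand-in for a two-argument coalesce(): keep x unless it is NA
coalesce2 <- function(x, default) ifelse(is.na(x), default, x)

# CNSDTDSC as set by the end dates for subjects 01, 02 and 03
# (subject 03 has no end date, so nothing is set)
from_end_dates <- c("LAST QA BEFORE EOS", "LAST QA BEFORE NEW DRUG", NA)

CNSDTDSC <- coalesce2(from_end_dates, "LAST QA")
CNSDTDSC
```

Only the missing third value is replaced by "LAST QA"; the descriptions coming from the end dates are preserved.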
Overall survival time to event parameter
In oncology trials, this is commonly derived as time from randomization date to death. Subjects without an event are censored at the last date they are known to be alive.
The start date is set using the start_date argument, since a date other than the default (TRTSDT) is needed.
In this example, datetime variables are needed, which can be achieved by setting the create_datetime argument to TRUE.
adsl <- tribble(
~USUBJID, ~RANDDTM, ~LSALVDTM, ~DTHDTM, ~DTHFL,
"01", ymd_hms("2020-10-03 00:00:00"), ymd_hms("2022-12-15 23:59:59"), NA, NA,
"02", ymd_hms("2021-01-23 00:00:00"), ymd_hms("2021-02-03 19:45:59"), ymd_hms("2021-02-03 19:45:59"), "Y"
) %>%
mutate(STUDYID = "AB42")
# derive overall survival parameter
death <- event_source(
dataset_name = "adsl",
filter = DTHFL == "Y",
date = DTHDTM,
set_values_to = exprs(
EVNTDESC = "DEATH",
SRCDOM = "ADSL",
SRCVAR = "DTHDTM"
)
)
last_alive <- censor_source(
dataset_name = "adsl",
date = LSALVDTM,
set_values_to = exprs(
EVNTDESC = "LAST DATE KNOWN ALIVE",
SRCDOM = "ADSL",
SRCVAR = "LSALVDTM"
)
)
derive_param_tte(
dataset_adsl = adsl,
start_date = RANDDTM,
event_conditions = list(death),
censor_conditions = list(last_alive),
create_datetime = TRUE,
source_datasets = list(adsl = adsl),
set_values_to = exprs(
PARAMCD = "OS",
PARAM = "Overall Survival"
)
) %>%
select(USUBJID, STARTDTM, PARAMCD, PARAM, ADTM, CNSR)
#> # A tibble: 2 × 6
#> USUBJID STARTDTM PARAMCD PARAM ADTM CNSR
#> <chr> <dttm> <chr> <chr> <dttm> <int>
#> 1 01 2020-10-03 00:00:00 OS Overall Survival 2022-12-15 23:59:59 1
#> 2 02 2021-01-23 00:00:00 OS Overall Survival 2021-02-03 19:45:59 0
Duration of response time to event parameter
In oncology trials, this is commonly derived as time from response until
progression or death, or if neither have occurred then censor at last tumor
assessment visit date. It is only relevant for subjects with a response.
Note that the new parameter is created only for subjects in dataset_adsl,
so below dataset_adsl is filtered to responders only.
adsl_resp <- tribble(
~USUBJID, ~DTHFL, ~DTHDT, ~RSPDT,
"01", "Y", ymd("2021-06-12"), ymd("2021-03-04"),
"02", "N", NA, NA,
"03", "Y", ymd("2021-08-21"), NA,
"04", "N", NA, ymd("2021-04-14")
) %>%
mutate(STUDYID = "AB42")
adrs <- tribble(
~USUBJID, ~AVALC, ~ADT, ~ASEQ,
"01", "SD", ymd("2021-01-03"), 1,
"01", "PR", ymd("2021-03-04"), 2,
"01", "PD", ymd("2021-05-05"), 3,
"02", "PD", ymd("2021-02-03"), 1,
"04", "SD", ymd("2021-02-13"), 1,
"04", "PR", ymd("2021-04-14"), 2,
"04", "CR", ymd("2021-05-15"), 3
) %>%
mutate(STUDYID = "AB42", PARAMCD = "OVR")
pd <- event_source(
dataset_name = "adrs",
filter = AVALC == "PD",
date = ADT,
set_values_to = exprs(
EVNTDESC = "PD",
SRCDOM = "ADRS",
SRCVAR = "ADT",
SRCSEQ = ASEQ
)
)
death <- event_source(
dataset_name = "adsl",
filter = DTHFL == "Y",
date = DTHDT,
set_values_to = exprs(
EVNTDESC = "DEATH",
SRCDOM = "ADSL",
SRCVAR = "DTHDT"
)
)
last_visit <- censor_source(
dataset_name = "adrs",
date = ADT,
set_values_to = exprs(
EVNTDESC = "LAST TUMOR ASSESSMENT",
SRCDOM = "ADRS",
SRCVAR = "ADT",
SRCSEQ = ASEQ
)
)
derive_param_tte(
dataset_adsl = filter(adsl_resp, !is.na(RSPDT)),
start_date = RSPDT,
event_conditions = list(pd, death),
censor_conditions = list(last_visit),
source_datasets = list(adsl = adsl_resp, adrs = adrs),
set_values_to = exprs(
PARAMCD = "DURRSP",
PARAM = "Duration of Response"
)
) %>%
select(USUBJID, STARTDT, PARAMCD, PARAM, ADT, CNSR, SRCSEQ)
#> # A tibble: 2 × 7
#> USUBJID STARTDT PARAMCD PARAM ADT CNSR SRCSEQ
#> <chr> <date> <chr> <chr> <date> <int> <dbl>
#> 1 01 2021-03-04 DURRSP Duration of Response 2021-05-05 0 3
#> 2 04 2021-04-14 DURRSP Duration of Response 2021-05-15 1 3
Further examples
Further example usages of this function can be found in the
vignette("bds_tte") and vignette("tte_analyses").
