Introduction
This article describes creating an ADSL ADaM. Examples are currently presented and tested using DM, EX, AE, LB and DS SDTM domains. However, other domains could be used.
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
Programming Flow
- Read in Data
- Derive Period, Subperiod, and Phase Variables (e.g. APxxSDT, APxxEDT, …)
- Derive Treatment Variables (TRT0xP, TRT0xA)
- Derive/Impute Numeric Treatment Date/Time and Duration (TRTSDT, TRTEDT, TRTDURD)
- Derive Disposition Variables
- Derive Death Variables
- Derive Last Known Date Alive (LSTALVDT)
- Derive Groupings and Populations
- Derive Other Variables
- Add Labels and Attributes
Read in Data
To start, all data frames needed for the creation of ADSL should be read into the environment. This will be a company-specific process. Some of the data frames needed may be DM, EX, DS, AE, and LB.
For example purposes, the CDISC Pilot SDTM datasets, which are included in {admiral.test}, are used.
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(admiral.test)
library(lubridate)
library(stringr)
data("admiral_dm")
data("admiral_ds")
data("admiral_ex")
data("admiral_ae")
data("admiral_lb")
dm <- admiral_dm
ds <- admiral_ds
ex <- admiral_ex
ae <- admiral_ae
lb <- admiral_lb
The DM domain is used as the basis for ADSL:
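A minimal sketch of this step, assuming ADSL simply starts as a copy of DM with the DOMAIN column dropped. The toy data frame dm_toy and its values are hypothetical stand-ins so that the snippet runs without {admiral.test}; with the pilot data the same select() call would be applied to dm.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for the DM domain (not the pilot data)
dm_toy <- data.frame(
  STUDYID = c("STUDY1", "STUDY1"),
  DOMAIN = c("DM", "DM"),
  USUBJID = c("SUBJ-001", "SUBJ-002"),
  ARM = c("Placebo", "Drug A"),
  stringsAsFactors = FALSE
)

# Assumption: ADSL is initialised from DM without the DOMAIN variable
adsl_toy <- dm_toy %>%
  select(-DOMAIN)
```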
Derive Period, Subperiod, and Phase Variables (e.g. APxxSDT, APxxEDT, …)
See the “Visit and Period Variables” vignette for more information.
If the variables are not derived based on a period reference dataset, they may be derived at a later point of the flow. For example, phases like “Treatment Phase” and “Follow up” could be derived based on treatment start and end date.
Derive Treatment Variables (TRT0xP, TRT0xA)
The mapping of the treatment variables is left to the ADaM programmer. An example mapping for a study without periods may be:
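One possible mapping, sketched under the assumption that the planned treatment comes from DM.ARM and the actual treatment from DM.ACTARM. The data frame dm_arm_toy and its values are hypothetical; the choice of source variables is study specific.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for the DM domain (not the pilot data)
dm_arm_toy <- data.frame(
  USUBJID = c("SUBJ-001", "SUBJ-002"),
  ARM = c("Placebo", "Drug A"),
  ACTARM = c("Placebo", "Drug A"),
  stringsAsFactors = FALSE
)

# Assumption: planned treatment = ARM, actual treatment = ACTARM
adsl_trt_toy <- dm_arm_toy %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)
```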
For studies with periods see the “Visit and Period Variables” vignette.
Derive/Impute Numeric Treatment Date/Time and Duration (TRTSDTM, TRTEDTM, TRTDURD)
The function derive_vars_merged() can be used to derive the treatment start and end date/times using the ex domain. A pre-processing step for ex is required to convert the variables EXSTDTC and EXENDTC to datetime variables and impute missing date or time components. Conversion and imputation are done by derive_vars_dtm().
Example calls:
# impute start and end time of exposure to first and last respectively, do not impute date
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDTM),
    new_vars = vars(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = vars(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = vars(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDTM),
    new_vars = vars(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = vars(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = vars(STUDYID, USUBJID)
  )
This call returns the original data frame with the columns TRTSDTM, TRTSTMF, TRTEDTM, and TRTETMF added. Exposure observations with an incomplete date and zero doses of non-placebo treatments are ignored. Missing time parts are imputed as first or last for start and end date respectively.
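To make the "first record per subject" selection more concrete, the following sketch reproduces the idea with plain {dplyr} verbs on toy data. It is an illustration only, not how derive_vars_merged() is implemented; the data frames adsl_sel_toy and ex_sel_toy are hypothetical, and only USUBJID is used as the key for brevity.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical subject-level and exposure data
adsl_sel_toy <- data.frame(
  USUBJID = c("SUBJ-001", "SUBJ-002"),
  stringsAsFactors = FALSE
)
ex_sel_toy <- data.frame(
  USUBJID = c("SUBJ-001", "SUBJ-001", "SUBJ-002"),
  EXSEQ = c(1, 2, 1),
  EXSTDTM = as.POSIXct(
    c("2014-01-02 00:00:00", "2014-01-16 00:00:00", "2014-02-01 00:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Keep usable records, sort them, and take the first one per subject
first_ex <- ex_sel_toy %>%
  filter(!is.na(EXSTDTM)) %>%
  arrange(USUBJID, EXSTDTM, EXSEQ) %>%
  group_by(USUBJID) %>%
  slice(1) %>%
  ungroup() %>%
  select(USUBJID, TRTSDTM = EXSTDTM)

# Merge the selected start datetime onto the subject-level data
adsl_sel_toy <- left_join(adsl_sel_toy, first_ex, by = "USUBJID")
```

Taking the last record instead of the first (for the end datetime) would follow the same pattern with slice(n()).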
The datetime variables returned can be converted to dates using the derive_vars_dtm_to_dt() function.
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = vars(TRTSDTM, TRTEDTM))
Now that TRTSDT and TRTEDT are derived, the function derive_var_trtdurd() can be used to calculate the treatment duration (TRTDURD).
adsl <- adsl %>%
  derive_var_trtdurd()
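For intuition, treatment duration in days is conventionally the end date minus the start date plus one, so a subject treated on a single day has a duration of 1. The base R sketch below illustrates that arithmetic on hypothetical dates; it assumes this convention and is not the implementation of derive_var_trtdurd().

```r
# Hypothetical treatment start and end dates for two subjects
trtsdt <- as.Date(c("2014-01-02", "2014-03-10"))
trtedt <- as.Date(c("2014-01-02", "2014-03-24"))

# Duration in days, counting both the first and the last day of treatment
trtdurd <- as.numeric(trtedt - trtsdt) + 1
trtdurd
# → 1 15
```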
Derive Disposition Variables
Disposition Dates (e.g. EOSDT)
The functions derive_vars_dt() and derive_vars_merged() can be used to derive a disposition date. First, the character disposition date (DS.DSSTDTC) is converted to a numeric date (DSSTDT) by calling derive_vars_dt(). Then the relevant disposition date is selected by adjusting the filter_add parameter.
To derive the End of Study date (EOSDT), a call could be:
# convert character date to numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )
This call would return the input dataset with the column EOSDT added. The derive_vars_dt() function also allows the user to impute partial dates. If imputation is needed and the date is to be imputed to the first of the month, then set date_imputation = "FIRST".
Disposition Status (e.g. EOSSTT)
The function derive_var_disposition_status() can be used to derive a disposition status at a specific timepoint. The relevant disposition variable (DS.DSDECOD) is selected by adjusting the filter parameter and used to derive EOSSTT.
To derive the End of Study status (EOSSTT), a call could be:
adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )
This call would return the input dataset with the column EOSSTT added.
By default, the function will derive EOSSTT as
- "NOT STARTED" if DSDECOD is "SCREEN FAILURE" or "SCREENING NOT COMPLETED"
- "COMPLETED" if DSDECOD == "COMPLETED"
- "DISCONTINUED" if DSDECOD is not "COMPLETED" or NA
- "ONGOING" otherwise
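The default rules listed above can be written out as a small mapping function. This is a sketch that mirrors the list, not the code used inside {admiral}; the name eosstt_default_sketch() is hypothetical.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper mirroring the default rules; not the {admiral} source
eosstt_default_sketch <- function(dsdecod) {
  case_when(
    dsdecod %in% c("SCREEN FAILURE", "SCREENING NOT COMPLETED") ~ "NOT STARTED",
    dsdecod == "COMPLETED" ~ "COMPLETED",
    # Any other non-missing decode counts as a discontinuation
    !is.na(dsdecod) ~ "DISCONTINUED",
    # Missing decode: no disposition event recorded yet
    TRUE ~ "ONGOING"
  )
}

eosstt_default_sketch(c("SCREEN FAILURE", "COMPLETED", "ADVERSE EVENT", NA))
# → "NOT STARTED" "COMPLETED" "DISCONTINUED" "ONGOING"
```

Because case_when() evaluates conditions in order, the screen failure rule must come before the generic non-missing rule.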
If the default derivation must be changed, users can create their own function and pass it to the format_new_var argument of the function (format_new_var = new_mapping) to map DSDECOD to a suitable EOSSTT value.
Example function format_eosstt():
format_eosstt <- function(DSDECOD) {
  case_when(
    DSDECOD %in% c("COMPLETED") ~ "COMPLETED",
    DSDECOD %in% c("SCREEN FAILURE") ~ NA_character_,
    !is.na(DSDECOD) ~ "DISCONTINUED",
    TRUE ~ "ONGOING"
  )
}
The customized mapping function format_eosstt() can now be passed to the main function:
adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    format_new_var = format_eosstt,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )
This call would return the input dataset with the column EOSSTT added.
Disposition Reason(s) (e.g. DCSREAS, DCSREASP)
The main reason for discontinuation is usually stored in DSDECOD, while DSTERM provides additional details regarding the subject's discontinuation (e.g., a description of "OTHER").
The function derive_vars_disposition_reason() can be used to derive a disposition reason (along with the details, if required) at a specific timepoint. The relevant disposition variable(s) (DS.DSDECOD, DS.DSTERM) are selected by adjusting the filter parameter and used to derive the main reason (and details).
To derive the End of Study reason(s) (DCSREAS and DCSREASP), the call would be:
adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    filter_ds = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )
This call would return the input dataset with the columns DCSREAS and DCSREASP added.
By default, the function will map
- DCSREAS as DSDECOD if DSDECOD is not "COMPLETED" or NA, NA otherwise
- DCSREASP as DSTERM if DSDECOD is equal to "OTHER", NA otherwise
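As an illustration of these two default rules, the sketch below expresses them as a single function that returns the main reason when only DSDECOD is supplied and the details when DSTERM is supplied as well. The name reason_default_sketch() is hypothetical and the code is not taken from {admiral}.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper mirroring the default rules; not the {admiral} source
reason_default_sketch <- function(dsdecod, dsterm = NULL) {
  if (is.null(dsterm)) {
    # Main reason: keep DSDECOD unless it is "COMPLETED" or missing
    if_else(dsdecod != "COMPLETED" & !is.na(dsdecod), dsdecod, NA_character_)
  } else {
    # Details: keep DSTERM only when the main reason is "OTHER"
    if_else(dsdecod == "OTHER", dsterm, NA_character_)
  }
}

decod <- c("COMPLETED", "ADVERSE EVENT", "OTHER", NA)
term <- c("COMPLETED", "NAUSEA", "MOVED AWAY", NA)

reason_default_sketch(decod)
# → NA "ADVERSE EVENT" "OTHER" NA
reason_default_sketch(decod, term)
# → NA NA "MOVED AWAY" NA
```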
If the default derivation must be changed, users can create their own function and pass it to the format_new_vars argument of the function (format_new_vars = new_mapping) to map DSDECOD and DSTERM to a suitable DCSREAS/DCSREASP value.
Example function format_dcsreas():
format_dcsreas <- function(dsdecod, dsterm = NULL) {
  if (is.null(dsterm)) {
    if_else(dsdecod %notin% c("COMPLETED", "SCREEN FAILURE") & !is.na(dsdecod), dsdecod, NA_character_)
  } else {
    if_else(dsdecod == "OTHER", dsterm, NA_character_)
  }
}
The customized mapping function format_dcsreas() can now be passed to the main function:
adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    format_new_vars = format_dcsreas,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )
Randomization Date (RANDDT)
The function derive_vars_merged() can be used to derive the randomization date variable. To map Randomization Date (RANDDT), the call would be:
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    filter_add = DSDECOD == "RANDOMIZED",
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(RANDDT = DSSTDT)
  )
This call would return the input dataset with the column RANDDT added.