Introduction
This article describes creating an ADSL ADaM. Examples are currently presented and tested using DM, EX, AE, LB and DS SDTM domains. However, other domains could be used.
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
Programming Flow
- Read in Data
- Derive Period, Subperiod, and Phase Variables (e.g. APxxSDT, APxxEDT, …)
- Derive Treatment Variables (TRT0xP, TRT0xA)
- Derive/Impute Numeric Treatment Date/Time and Duration (TRTSDT, TRTEDT, TRTDURD)
- Derive Disposition Variables
- Derive Death Variables
- Derive Last Known Date Alive (LSTALVDT)
- Derive Groupings and Populations
- Derive Other Variables
- Add Labels and Attributes
Read in Data
To start, all data frames needed for the creation of ADSL should be read into the environment. This will be a company specific process. Some of the data frames needed may be DM, EX, DS, AE, and LB.
For example purposes, the CDISC Pilot SDTM datasets, which are included in pharmaversesdtm, are used.
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(pharmaversesdtm)
library(lubridate)
library(stringr)
data("dm")
data("ds")
data("ex")
data("ae")
data("lb")
dm <- convert_blanks_to_na(dm)
ds <- convert_blanks_to_na(ds)
ex <- convert_blanks_to_na(ex)
ae <- convert_blanks_to_na(ae)
lb <- convert_blanks_to_na(lb)
The DM domain is used as the basis for ADSL:
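The code for this step did not survive in the text above. A minimal sketch follows, assuming ADSL starts as a copy of DM with the DOMAIN column dropped. The data frame dm_toy and its values are hypothetical stand-ins so that the snippet runs on its own.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical two-subject stand-in for the DM domain
dm_toy <- tibble(
  STUDYID = "XX01",
  DOMAIN = "DM",
  USUBJID = c("XX01-001", "XX01-002"),
  ARM = c("Placebo", "Drug A"),
  ACTARM = c("Placebo", "Drug A")
)

# Start ADSL from DM; DOMAIN is dropped on the assumption that it is not
# carried over to ADSL
adsl_toy <- dm_toy %>%
  select(-DOMAIN)
```

In the flow of this article, the equivalent step would be adsl <- dm %>% select(-DOMAIN), which creates the adsl object that the later calls extend.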
Derive Period, Subperiod, and Phase Variables (e.g. APxxSDT, APxxEDT, …)
See the “Visit and Period Variables” vignette for more information.
If the variables are not derived based on a period reference dataset, they may be derived at a later point of the flow. For example, phases like “Treatment Phase” and “Follow up” could be derived based on treatment start and end date.
Derive Treatment Variables (TRT0xP, TRT0xA)
The mapping of the treatment variables is left to the ADaM programmer. An example mapping for a study without periods may be:
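One possible mapping is sketched below. It assumes that the planned treatment can be taken from ARM and the actual treatment from ACTARM, which is a study-specific decision and not a rule. The data frame adsl_toy is a hypothetical stand-in so that the snippet runs on its own.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for an ADSL data frame that was started from DM
adsl_toy <- tibble(
  STUDYID = "XX01",
  USUBJID = c("XX01-001", "XX01-002"),
  ARM = c("Placebo", "Drug A"),
  ACTARM = c("Placebo", "Drug A")
)

# Assumption: planned treatment = ARM, actual treatment = ACTARM
adsl_toy <- adsl_toy %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)
```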
For studies with periods see the “Visit and Period Variables” vignette.
Derive/Impute Numeric Treatment Date/Time and Duration (TRTSDTM, TRTEDTM, TRTDURD)
The function derive_vars_merged() can be used to derive the treatment start and end date/times using the ex domain.
A pre-processing step for ex is required to convert the variables EXSTDTC and EXENDTC to datetime variables and impute missing date or time components. Conversion and imputation are done by derive_vars_dtm().
Example calls:
# impute start and end time of exposure to first and last respectively,
# do not impute date
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDTM),
    new_vars = exprs(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = exprs(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = exprs(STUDYID, USUBJID)
  )
This call returns the original data frame with the columns TRTSDTM, TRTSTMF, TRTEDTM, and TRTETMF added. Exposure observations with an incomplete date, as well as zero-dose observations of non-placebo treatments, are ignored. Missing time parts are imputed as the first possible time for the start datetime and the last possible time for the end datetime.
The datetime variables returned can be converted to dates using the derive_vars_dtm_to_dt() function.
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = exprs(TRTSDTM, TRTEDTM))
Now that TRTSDT and TRTEDT are derived, the function derive_var_trtdurd() can be used to calculate the treatment duration (TRTDURD).
adsl <- adsl %>%
  derive_var_trtdurd()
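To make the resulting value concrete, the sketch below reproduces the arithmetic in base R. It assumes the default behaviour of derive_var_trtdurd(), where the duration is the number of days from TRTSDT to TRTEDT plus one. The two dates are hypothetical.

```r
# Hypothetical treatment start and end dates for one subject
trtsdt <- as.Date("2014-01-02")
trtedt <- as.Date("2014-07-02")

# Assumed rule: days between the two dates, plus one so that both the first
# and the last treatment day are counted
trtdurd <- as.numeric(difftime(trtedt, trtsdt, units = "days")) + 1
trtdurd
```

For these dates the difference is 181 days, so the duration is 182.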
Derive Disposition Variables
Disposition Dates (e.g. EOSDT)
The functions derive_vars_dt() and derive_vars_merged() can be used to derive a disposition date. First, the character disposition date (DS.DSSTDTC) is converted to a numeric date (DSSTDT) by calling derive_vars_dt(). The DS dataset is extended by the DSSTDT variable because the date is also required by other derivations, e.g., RANDDT. Then the relevant disposition date is selected by adjusting the filter_add argument.
To add the End of Study date (EOSDT) to the input dataset, a call could be:
# convert character date to numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )
The derive_vars_dt() function can also impute partial dates. If imputation is needed and missing days are to be imputed to the first of the month and missing months to the first month of the year, set highest_imputation = "M".
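The effect of this setting can be tried on a small example. The sketch below is an illustration only: ds_toy and its values are hypothetical, and it relies on the default date_imputation = "first" of derive_vars_dt().

```r
library(admiral)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical disposition records with a complete date, a missing day,
# and a missing month and day
ds_toy <- tibble(
  USUBJID = c("001", "002", "003"),
  DSSTDTC = c("2014-07-02", "2014-07", "2014")
)

# highest_imputation = "M" allows day and month to be imputed; with the
# default date_imputation = "first", missing parts are set to the first
# day of the month and the first month of the year
ds_toy_ext <- derive_vars_dt(
  ds_toy,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST",
  highest_imputation = "M"
)
```

Under these assumptions, DSSTDT would be 2014-07-02, 2014-07-01, and 2014-01-01 for the three records. An imputation flag variable (DSSTDTF) is added as well, so imputed dates remain identifiable.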
Disposition Status (e.g. EOSSTT)
The function derive_vars_merged() can be used to derive the End of Study status (EOSSTT) based on DSCAT and DSDECOD from DS. The relevant observations are selected by adjusting the filter_add argument. A function mapping DSDECOD values to EOSSTT values can be defined and used in the new_vars argument. The mapping for the call below is
- "COMPLETED" if DSDECOD == "COMPLETED"
- NA_character_ if DSDECOD is "SCREEN FAILURE"
- "DISCONTINUED" otherwise
Example function format_eosstt():
format_eosstt <- function(x) {
  case_when(
    x %in% c("COMPLETED") ~ "COMPLETED",
    x %in% c("SCREEN FAILURE") ~ NA_character_,
    TRUE ~ "DISCONTINUED"
  )
}
The customized mapping function format_eosstt() can now be passed to the main function. For subjects without a disposition event, the end of study status is set to "ONGOING" by specifying the missing_values argument.
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(STUDYID, USUBJID),
    filter_add = DSCAT == "DISPOSITION EVENT",
    new_vars = exprs(EOSSTT = format_eosstt(DSDECOD)),
    missing_values = exprs(EOSSTT = "ONGOING")
  )
This call would return the input dataset with the column EOSSTT added.
If the derivation must be changed, users can create their own function to map DSDECOD to a suitable EOSSTT value.
Disposition Reason(s) (e.g. DCSREAS, DCSREASP)
The main reason for discontinuation is usually stored in DSDECOD, while DSTERM provides additional details about the subject's discontinuation (e.g., a description of "OTHER").
The function derive_vars_merged() can be used to derive a disposition reason (along with the details, if required) at a specific timepoint. The relevant observations are selected by adjusting the filter_add argument.
To derive the End of Study reason(s) (DCSREAS and DCSREASP), the function maps DCSREAS to DSDECOD and DCSREASP to DSTERM if DSDECOD is not "COMPLETED", "SCREEN FAILURE", or NA. Otherwise both variables are set to NA.
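The mapping described above can be sketched with derive_vars_merged() as follows. The data frames adsl_toy and ds_toy and their values are hypothetical stand-ins so that the snippet runs on its own. The filter is one way to express the stated condition, not necessarily the only one.

```r
library(admiral)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical ADSL and DS stand-ins with one disposition event per subject
adsl_toy <- tibble(
  STUDYID = "XX01",
  USUBJID = c("001", "002", "003")
)
ds_toy <- tibble(
  STUDYID = "XX01",
  USUBJID = c("001", "002", "003"),
  DSCAT = "DISPOSITION EVENT",
  DSDECOD = c("COMPLETED", "ADVERSE EVENT", "OTHER"),
  DSTERM = c("COMPLETED", "ADVERSE EVENT", "SUBJECT MOVED")
)

# Only discontinuation records pass the filter, so subjects who completed
# the study or failed screening keep NA in DCSREAS and DCSREASP
adsl_toy <- adsl_toy %>%
  derive_vars_merged(
    dataset_add = ds_toy,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(DCSREAS = DSDECOD, DCSREASP = DSTERM),
    filter_add = DSCAT == "DISPOSITION EVENT" &
      !(DSDECOD %in% c("SCREEN FAILURE", "COMPLETED", NA))
  )
```

Here subject 001 would keep missing values for both variables, while subjects 002 and 003 receive their DSDECOD and DSTERM values.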