Introduction

This article describes creating an ADSL ADaM. Examples are currently presented and tested using the DM, EX, AE, LB and DS SDTM domains. However, other domains could be used.

Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.

Programming Flow

Read in Data

To start, all data frames needed for the creation of ADSL should be read into the environment. This will be a company-specific process. Some of the data frames needed may be DM, EX, DS, AE, and LB.

For example purposes, the CDISC Pilot SDTM datasets, which are included in {admiral.test}, are used.

library(admiral)
library(dplyr)
library(admiral.test)
library(lubridate)
library(stringr)

data("admiral_dm")
data("admiral_ds")
data("admiral_ex")
data("admiral_ae")
data("admiral_lb")

dm <- admiral_dm
ds <- admiral_ds
ex <- admiral_ex
ae <- admiral_ae
lb <- admiral_lb

The DM domain is used as the basis for ADSL:

adsl <- dm %>%
  select(-DOMAIN)

Derive Treatment Variables (TRT0xP, TRT0xA)

The mapping of the treatment variables is left to the ADaM programmer. An example mapping may be:

adsl <- adsl %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)

Derive/Impute Numeric Treatment Date/Time and Duration (TRTSDTM, TRTEDTM, TRTDURD)

The function derive_vars_merged() can be used to derive the treatment start and end date/times using the ex domain. A pre-processing step for ex is required to convert the variables EXSTDTC and EXENDTC to datetime variables and to impute missing date or time components. Conversion and imputation are done by derive_vars_dtm().

Example calls:

# impute start and end time of exposure to first and last respectively, do not impute date
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDTM),
    new_vars = vars(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = vars(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = vars(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDTM),
    new_vars = vars(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = vars(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = vars(STUDYID, USUBJID)
  )

This call returns the original data frame with the columns TRTSDTM, TRTSTMF, TRTEDTM, and TRTETMF added. Exposure observations with an incomplete date, and zero doses of non-placebo treatments, are ignored. Missing time parts are imputed as the first or last possible time for the start and end date respectively.
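
To make the selection logic concrete, here is a minimal sketch in plain dplyr of roughly what the two derive_vars_merged() calls do: filter the exposure records, pick the first start and the last end datetime per subject, and join them onto the subject-level data. The data frames `adsl_toy` and `ex_toy` and all their values are hypothetical. This is an approximation for illustration, not the function's actual implementation, and the imputation flags are omitted.

```r
library(dplyr)

# Hypothetical subject-level and exposure data (not from the pilot study)
adsl_toy <- tibble(STUDYID = "S1", USUBJID = c("01", "02"))
ex_toy <- tibble(
  STUDYID = "S1",
  USUBJID = c("01", "01", "01", "02"),
  EXSEQ = c(1, 2, 3, 1),
  EXTRT = c("DRUG A", "DRUG A", "DRUG A", "PLACEBO"),
  EXDOSE = c(0, 50, 50, 0),
  EXSTDTM = as.POSIXct(
    c("2021-01-01 00:00:00", "2021-01-08 00:00:00",
      "2021-01-15 00:00:00", "2021-02-01 00:00:00"),
    tz = "UTC"
  ),
  EXENDTM = as.POSIXct(
    c("2021-01-07 23:59:59", "2021-01-14 23:59:59",
      NA, "2021-02-28 23:59:59"),
    tz = "UTC"
  )
)

# Keep valid doses: a positive dose, or a zero dose of placebo
valid_ex <- ex_toy %>%
  filter(EXDOSE > 0 | (EXDOSE == 0 & grepl("PLACEBO", EXTRT)))

# First non-missing start datetime per subject
first_start <- valid_ex %>%
  filter(!is.na(EXSTDTM)) %>%
  arrange(EXSTDTM, EXSEQ) %>%
  group_by(STUDYID, USUBJID) %>%
  slice(1) %>%
  ungroup() %>%
  select(STUDYID, USUBJID, TRTSDTM = EXSTDTM)

# Last non-missing end datetime per subject
last_end <- valid_ex %>%
  filter(!is.na(EXENDTM)) %>%
  arrange(EXENDTM, EXSEQ) %>%
  group_by(STUDYID, USUBJID) %>%
  slice(n()) %>%
  ungroup() %>%
  select(STUDYID, USUBJID, TRTEDTM = EXENDTM)

# Merge both onto the subject-level data
adsl_trt <- adsl_toy %>%
  left_join(first_start, by = c("STUDYID", "USUBJID")) %>%
  left_join(last_end, by = c("STUDYID", "USUBJID"))
```

In this toy data the zero dose of DRUG A for subject 01 is dropped, so that subject's treatment start comes from the second exposure record, and the record with a missing end datetime does not contribute to TRTEDTM.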

The datetime variables returned can be converted to dates using the derive_vars_dtm_to_dt() function.

adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = vars(TRTSDTM, TRTEDTM))

Now that TRTSDT and TRTEDT are derived, the function derive_var_trtdurd() can be used to calculate the treatment duration (TRTDURD).

adsl <- adsl %>%
  derive_var_trtdurd()
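
As a worked example, the treatment duration in days is the number of days from the first to the last dose, counting both endpoints, i.e. TRTDURD = TRTEDT - TRTSDT + 1. The sketch below reproduces that arithmetic in base R on hypothetical dates; it illustrates the formula and is not the internals of derive_var_trtdurd().

```r
# Hypothetical treatment start and end dates
trtsdt <- as.Date(c("2021-01-08", "2021-02-01"))
trtedt <- as.Date(c("2021-01-14", "2021-02-28"))

# Days between the two dates, counting both the first and the last day
trtdurd <- as.numeric(trtedt - trtsdt) + 1
trtdurd
```

A subject dosed from 8 to 14 January therefore has a duration of 7 days, not 6.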

Derive Disposition Variables

Disposition Dates (e.g. EOSDT)

The functions derive_vars_dt() and derive_vars_merged() can be used to derive a disposition date. First the character disposition date (DS.DSSTDTC) is converted to a numeric date (DSSTDT) calling derive_vars_dt(). Then the relevant disposition date is selected by adjusting the filter_add parameter.

To derive the End of Study date (EOSDT), a call could be:

# convert character date to numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )

This call would return the input dataset with the column EOSDT added. The function derive_vars_dt() can also impute partial dates. If imputation is needed and the date is to be imputed to the first of the month, then set date_imputation = "FIRST".
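
To illustrate what imputing to the first of the month means, the sketch below completes partial ISO 8601 date strings by hand in base R. The helper `impute_first()` is hypothetical and only handles a missing day (`YYYY-MM`); anything less complete is returned as NA. derive_vars_dt() covers more cases and can also set imputation flags.

```r
# Hypothetical helper: impute a missing day to the first of the month
impute_first <- function(dtc) {
  # Use the date part only, in case a time component is present
  date_part <- substr(dtc, 1, 10)
  completed <- ifelse(
    grepl("^[0-9]{4}-[0-9]{2}$", date_part),
    paste0(date_part, "-01"),
    date_part
  )
  # Anything that is still not a complete date becomes NA
  completed[!grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", completed)] <- NA_character_
  as.Date(completed)
}

imputed <- impute_first(c("2021-03-15", "2021-03", "2021", NA))
imputed
```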

Disposition Status (e.g. EOSSTT)

The function derive_var_disposition_status() can be used to derive a disposition status at a specific timepoint. The relevant disposition variable (DS.DSDECOD) is selected by adjusting the filter_ds parameter and used to derive EOSSTT.

To derive the End of Study status (EOSSTT), a call could be:

adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )

This call would return the input dataset with the column EOSSTT added.

By default, the function will derive EOSSTT as

  • "NOT STARTED" if DSDECOD is "SCREEN FAILURE" or "SCREENING NOT COMPLETED"
  • "COMPLETED" if DSDECOD == "COMPLETED"
  • "DISCONTINUED" if DSDECOD is not NA and is none of the values above
  • "ONGOING" otherwise
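
The default rules listed above can be written out as a small mapping function. The sketch below is a hand-written approximation for illustration, with the hypothetical name `eosstt_default_sketch()`; it is not the function shipped with admiral.

```r
library(dplyr)

# Hypothetical re-statement of the default rules, evaluated top to bottom
eosstt_default_sketch <- function(status) {
  case_when(
    status %in% c("SCREEN FAILURE", "SCREENING NOT COMPLETED") ~ "NOT STARTED",
    status == "COMPLETED" ~ "COMPLETED",
    !is.na(status) ~ "DISCONTINUED",
    TRUE ~ "ONGOING"
  )
}

eosstt_check <- eosstt_default_sketch(
  c("COMPLETED", "ADVERSE EVENT", "SCREEN FAILURE", NA)
)
eosstt_check
```

Because case_when() uses the first matching condition, the order of the rules matters: a missing status only reaches the final "ONGOING" branch after all other conditions have failed.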

If the default derivation must be changed, users can create their own function and pass it to the format_new_var argument of the function (format_new_var = new_mapping) to map DSDECOD to a suitable EOSSTT value.

Example function format_eosstt():

format_eosstt <- function(DSDECOD) {
  case_when(
    DSDECOD %in% c("COMPLETED") ~ "COMPLETED",
    DSDECOD %in% c("SCREEN FAILURE") ~ NA_character_,
    !is.na(DSDECOD) ~ "DISCONTINUED",
    TRUE ~ "ONGOING"
  )
}

The customized mapping function format_eosstt() can now be passed to the main function:


adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    format_new_var = format_eosstt,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )

This call would return the input dataset with the column EOSSTT added.

Disposition Reason(s) (e.g. DCSREAS, DCSREASP)

The main reason for discontinuation is usually stored in DSDECOD, while DSTERM provides additional details regarding the subject’s discontinuation (e.g., a description of "OTHER").

The function derive_vars_disposition_reason() can be used to derive a disposition reason (along with the details, if required) at a specific timepoint. The relevant disposition variable(s) (DS.DSDECOD, DS.DSTERM) are selected by adjusting the filter_ds parameter and used to derive the main reason (and details).

To derive the End of Study reason(s) (DCSREAS and DCSREASP), the call would be:

adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    filter_ds = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )

This call would return the input dataset with the columns DCSREAS and DCSREASP added.

By default, the function will map

  • DCSREAS as DSDECOD if DSDECOD is neither "COMPLETED" nor NA, NA otherwise
  • DCSREASP as DSTERM if DSDECOD is equal to "OTHER", NA otherwise

If the default derivation must be changed, users can create their own function and pass it to the format_new_vars argument of the function (format_new_vars = new_mapping) to map DSDECOD and DSTERM to a suitable DCSREAS/DCSREASP value.

Example function format_dcsreas():

format_dcsreas <- function(dsdecod, dsterm = NULL) {
  if (is.null(dsterm)) {
    if_else(dsdecod %notin% c("COMPLETED", "SCREEN FAILURE") & !is.na(dsdecod), dsdecod, NA_character_)
  } else {
    if_else(dsdecod == "OTHER", dsterm, NA_character_)
  }
}
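
To see how such a mapping function behaves when called with one or with two arguments, the sketch below applies a self-contained variant to hypothetical values. It uses `!(x %in% y)` in place of `%notin%` so that it runs with dplyr alone; the function name `format_dcsreas_sketch()` and the input vectors are made up for illustration.

```r
library(dplyr)

format_dcsreas_sketch <- function(dsdecod, dsterm = NULL) {
  if (is.null(dsterm)) {
    # Main reason: keep DSDECOD unless completed, screen failure, or missing
    if_else(
      !(dsdecod %in% c("COMPLETED", "SCREEN FAILURE")) & !is.na(dsdecod),
      dsdecod,
      NA_character_
    )
  } else {
    # Details: keep DSTERM only when the main reason is "OTHER"
    if_else(dsdecod == "OTHER", dsterm, NA_character_)
  }
}

# Hypothetical disposition values
dsdecod <- c("COMPLETED", "ADVERSE EVENT", "OTHER", NA)
dsterm <- c("COMPLETED", "NAUSEA", "MOVED ABROAD", NA)

dcsreas <- format_dcsreas_sketch(dsdecod)
dcsreasp <- format_dcsreas_sketch(dsdecod, dsterm)
```

The same function serves both targets: called with one argument it yields the main reason, and called with two it yields the free-text details.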

The customized mapping function format_dcsreas() can now be passed to the main function:

adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    format_new_vars = format_dcsreas,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )

Randomization Date (RANDDT)

The function derive_vars_merged() can be used to derive the randomization date variable. To map the Randomization Date (RANDDT), the call would be:

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    filter_add = DSDECOD == "RANDOMIZED",
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(RANDDT = DSSTDT)
  )

This call would return the input dataset with the column RANDDT added.
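
Because no order and mode are given in this call, the merge relies on each subject having at most one matching record; it is assumed here that derive_vars_merged() reports an error otherwise. A quick pre-check along the following lines can surface such duplicates beforehand. The data frame `ds_toy` and its values are hypothetical.

```r
library(dplyr)

# Hypothetical disposition data with a duplicated RANDOMIZED record
ds_toy <- tibble(
  STUDYID = "S1",
  USUBJID = c("01", "02", "02", "03"),
  DSDECOD = c("RANDOMIZED", "RANDOMIZED", "RANDOMIZED", "SCREEN FAILURE"),
  DSSTDT = as.Date(c("2021-01-05", "2021-01-06", "2021-01-07", "2021-01-08"))
)

# Subjects with more than one RANDOMIZED record
dup_rand <- ds_toy %>%
  filter(DSDECOD == "RANDOMIZED") %>%
  count(STUDYID, USUBJID, name = "n_records") %>%
  filter(n_records > 1)

dup_rand
```

If the check returns any rows, either the source data need to be queried or the merge needs an explicit order and mode to choose one record per subject.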