library(metacore)
library(metatools)
library(pharmaversesdtm)
library(pharmaverseadam)
library(admiral)
library(xportr)
library(dplyr)
library(lubridate)
library(stringr)
library(reactable)
# Read in input data
adsl <- pharmaverseadam::adsl
ae <- pharmaversesdtm::ae
ex <- pharmaversesdtm::ex
# When SAS datasets are imported into R using haven::read_sas(), missing
# character values from SAS appear as "" characters in R, instead of appearing
# as NA values. Further details can be obtained via the following link:
# https://pharmaverse.github.io/admiral/articles/admiral.html#handling-of-missing-values
ae <- convert_blanks_to_na(ae)
ex <- convert_blanks_to_na(ex)
ADAE
Introduction
This article provides a step-by-step explanation for creating an ADaM ADAE
(Adverse Events) dataset using key pharmaverse packages along with tidyverse components.
Here we try to cover the most common AE analysis dataset derivations and some of the most useful functions for these.
For the purpose of this example, we will use the ADSL dataset from {pharmaverseadam} and the ae and ex domains from {pharmaversesdtm}.
Programming Flow
- Load Data and Required pharmaverse Packages
- Load Specifications for Metacore
- Select ADSL Variables
- Start Building Derivations
- Derive Analysis Dates
- Derive Duration
- Derive Date of Last Dose
- Derive Analysis Flags
- Derive Occurrence Flags
- Derive Query Variables
- Add ADSL Variables
- Apply Metadata to Create an eSub XPT and Perform Associated Checks
Load Data and Required pharmaverse Packages
The first step is to load the required packages and input data.
Load Specifications for Metacore
We have saved our specifications in an Excel file and will load them into {metacore} with the metacore::spec_to_metacore() function.
When you run this in your R script you might see warnings from {metacore} that help you improve your specifications, for example about specification columns with only missing values or about unused codelists. Once you have checked these warnings, you can suppress them with something like suppressWarnings().
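A minimal sketch of this loading step follows. The file path ./metadata/safety_specs.xlsx and the helper name load_adae_spec() are assumptions made for illustration. The where_sep_sheet = FALSE setting assumes a workbook whose where clauses are not kept on a separate sheet.

```r
library(metacore)

# Hypothetical helper (not part of {metacore}): read a specification workbook
# and keep only the ADAE metadata.
load_adae_spec <- function(path) {
  spec <- spec_to_metacore(
    path = path,
    where_sep_sheet = FALSE # assumption: where clauses are not on a separate sheet
  )
  select_dataset(spec, "ADAE")
}

# Assumed location of the specification file; adjust for your project.
spec_path <- "./metadata/safety_specs.xlsx"
if (file.exists(spec_path)) {
  # Wrap this call in suppressWarnings() only after reviewing the warnings.
  metacore <- load_adae_spec(spec_path)
}
```

The resulting metacore object is what the later {metatools} and {xportr} calls take as their metadata argument.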
Select ADSL Variables
Some variables from the ADSL dataset that are required for the derivations are merged into the AE domain using the admiral::derive_vars_merged() function. Find other {admiral} functions and related variables by searching admiraldiscovery. The rest of the relevant ADSL variables will be added later.
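The merge can be sketched on toy data as below. The datasets adsl_toy and ae_toy are invented for illustration. The choice of TRTSDT and TRTEDT for adsl_vars is an assumption based on the variables the later derivations refer to.

```r
library(admiral)
library(tibble)

# Toy ADSL and AE data (invented for illustration only)
adsl_toy <- tribble(
  ~STUDYID, ~USUBJID, ~TRTSDT,               ~TRTEDT,               ~AGE,
  "S1",     "01",     as.Date("2020-01-01"), as.Date("2020-03-01"), 64,
  "S1",     "02",     as.Date("2020-02-01"), as.Date("2020-04-01"), 71
)
ae_toy <- tribble(
  ~STUDYID, ~USUBJID, ~AESEQ, ~AETERM,
  "S1",     "01",     1,      "HEADACHE",
  "S1",     "01",     2,      "NAUSEA",
  "S1",     "02",     1,      "RASH"
)

# Assumed selection: only the ADSL variables needed by later derivations
adsl_vars <- exprs(TRTSDT, TRTEDT)

# Left-join the selected ADSL variables onto every AE record
adae_toy <- derive_vars_merged(
  ae_toy,
  dataset_add = adsl_toy,
  new_vars = adsl_vars,
  by_vars = exprs(STUDYID, USUBJID)
)
```

Only the variables listed in new_vars are added, so AGE stays out of the result until the remaining ADSL variables are merged at the end.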
Start Building Derivations
Derive Analysis Dates
The first derivation step is to compute the Analysis Date and Relative Analysis Day, using the variables merged from the ADSL dataset, with the admiral::derive_vars_dt() and admiral::derive_vars_dy() functions.
We derive the AE end date AENDT first, imputing partial dates with a missing day or month. For the AE start date ASTDT we impute to the same level, but additionally ensure that the imputed date does not fall before the treatment start date or after the AE end date.
These example calls generally omit arguments that are left at their defaults, so check the full function reference pages to learn what other arguments exist to give further control over the derivations.
# Derive ASTDT/ASTDTF/ASTDY and AENDT/AENDTF/AENDY
adae <- adae %>%
  derive_vars_dt(
    new_vars_prefix = "AEN",
    dtc = AEENDTC,
    date_imputation = "last",
    highest_imputation = "M", # imputation is performed on missing days or months
    flag_imputation = "auto" # to automatically create AENDTF variable
  ) %>%
  derive_vars_dt(
    new_vars_prefix = "AST",
    dtc = AESTDTC,
    highest_imputation = "M", # imputation is performed on missing days or months
    flag_imputation = "auto", # to automatically create ASTDTF variable
    min_dates = exprs(TRTSDT), # apply a minimum date for the imputation
    max_dates = exprs(AENDT) # apply a maximum date for the imputation
  ) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ASTDT, AENDT)
  )
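To see the effect of min_dates in isolation, here is a small sketch on a single invented record whose AE start date is missing its day. The dataset dates_toy and its values are assumptions for illustration. The calls mirror the ones above.

```r
library(admiral)
library(dplyr)
library(tibble)

# One toy AE: start date known only to the month, treatment starts mid-month
dates_toy <- tribble(
  ~USUBJID, ~TRTSDT,               ~AESTDTC,  ~AEENDTC,
  "01",     as.Date("2020-03-10"), "2020-03", "2020-03-20"
) %>%
  derive_vars_dt(
    new_vars_prefix = "AEN",
    dtc = AEENDTC,
    date_imputation = "last",
    highest_imputation = "M"
  ) %>%
  derive_vars_dt(
    new_vars_prefix = "AST",
    dtc = AESTDTC,
    highest_imputation = "M",
    min_dates = exprs(TRTSDT), # keep the imputed start on or after treatment start
    max_dates = exprs(AENDT)   # keep the imputed start on or before the AE end
  ) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ASTDT, AENDT)
  )

select(dates_toy, ASTDT, ASTDTF, ASTDY, AENDT, AENDY)
```

Without min_dates the start would be imputed to the first of the month. With it, ASTDT is moved up to the treatment start date, the imputation flag ASTDTF records that the day was imputed, and ASTDY is day 1.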
Derive Duration
Now that we have these date variables, we can derive AE duration using the admiral::derive_vars_duration() function. By default the function adds 1 day to the calculated difference, but this can be controlled with the add_one argument.
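A minimal sketch of the duration derivation on one invented record, showing the effect of add_one. The variable names ADURN and ADURU are an assumption about what the specification calls for.

```r
library(admiral)
library(tibble)

# Toy record: AE runs from 10 to 12 March 2020
dur_toy <- tribble(
  ~USUBJID, ~ASTDT,                ~AENDT,
  "01",     as.Date("2020-03-10"), as.Date("2020-03-12")
)

# Default behaviour: end - start + 1
with_add_one <- derive_vars_duration(
  dur_toy,
  new_var = ADURN,
  new_var_unit = ADURU,
  start_date = ASTDT,
  end_date = AENDT
)

# Plain difference: end - start
without_add_one <- derive_vars_duration(
  dur_toy,
  new_var = ADURN,
  start_date = ASTDT,
  end_date = AENDT,
  add_one = FALSE
)

with_add_one$ADURN    # 3
without_add_one$ADURN # 2
```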
Derive Date of Last Dose
You might need to add some exposure information from ex, such as the date of the last dose before each AE.
In the case below we call this LDOSEDT, and we want to look only at exposure records with a valid dose or placebo. Here we use the admiral::derive_vars_joined() function, which enables more complex joins.
# Derive LDOSEDT
# In our ex data the EXDOSFRQ (frequency) is "QD" which stands for once daily
# If this was not the case then we would need to use the admiral::create_single_dose_dataset() function
# to generate single doses from aggregate dose information
# Refer to https://pharmaverse.github.io/admiral/reference/create_single_dose_dataset.html
ex <- ex %>%
  derive_vars_dt(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN"
  )
adae <- adae %>%
  derive_vars_joined(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    order = exprs(EXENDT),
    new_vars = exprs(LDOSEDT = EXENDT),
    join_vars = exprs(EXENDT),
    join_type = "all",
    filter_add = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDT),
    filter_join = EXENDT <= ASTDT,
    mode = "last"
  )
Derive Analysis Flags
Next we look at common analysis flags, such as the treatment-emergent and on-treatment flags, using the admiral::derive_var_trtemfl() and admiral::derive_var_ontrtfl() functions. For the on-treatment flag we only want to include AEs occurring up to 30 days after treatment end.
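The two flags can be sketched on invented data as follows. The dataset flags_toy and its dates are assumptions for illustration. The 30-day window is passed through ref_end_window.

```r
library(admiral)
library(dplyr)
library(tibble)

# One toy subject treated from 10 Jan to 10 Feb 2020, with four AEs
flags_toy <- tibble(
  USUBJID = "01",
  TRTSDT = as.Date("2020-01-10"),
  TRTEDT = as.Date("2020-02-10"),
  ASTDT = as.Date(c("2020-01-05", "2020-01-15", "2020-03-01", "2020-04-01")),
  AENDT = as.Date(c("2020-01-08", "2020-01-20", "2020-03-05", "2020-04-03"))
) %>%
  # Treatment emergent: AE starts on or after treatment start
  derive_var_trtemfl(
    start_date = ASTDT,
    end_date = AENDT,
    trt_start_date = TRTSDT
  ) %>%
  # On treatment: AE starts between treatment start and treatment end + 30 days
  derive_var_ontrtfl(
    start_date = ASTDT,
    ref_start_date = TRTSDT,
    ref_end_date = TRTEDT,
    ref_end_window = 30
  )

select(flags_toy, ASTDT, TRTEMFL, ONTRTFL)
```

In this sketch the first AE starts before treatment and gets neither flag. The last AE starts more than 30 days after treatment end, so it is still treatment emergent (no treatment end date was supplied to derive_var_trtemfl()) but is not on treatment.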
At first these two functions may appear similar, but each offers extra specific arguments for flexibility if you need to apply more complex analysis rules to these derivations. For example, for the treatment-emergent flag you could use the intensity arguments if you also wanted to flag AEs that start before treatment start and end after treatment start with worsened intensity (i.e. the most extreme intensity is greater than the initial intensity).
Derive Occurrence Flags
There can be flags that need to be derived based on AE occurrences, such as flagging the first occurrence of maximum severity per patient, which in the example below we call AOCCIFL. Here we use the admiral::derive_var_extreme_flag() function.
# Derive AOCCIFL
adae <- adae %>%
  # create temporary numeric TEMP_AESEVN for sorting purposes (1 = SEVERE, so the most severe AE sorts first)
  mutate(TEMP_AESEVN = as.integer(factor(AESEV, levels = c("SEVERE", "MODERATE", "MILD")))) %>%
  derive_var_extreme_flag(
    new_var = AOCCIFL,
    by_vars = exprs(STUDYID, USUBJID),
    order = exprs(TEMP_AESEVN, ASTDT, AESEQ),
    mode = "first"
  )
Derive Query Variables
The final set of derived flags checks for queries/baskets of AE terms, e.g. using Standardized MedDRA Queries (SMQs) or Custom Queries (CQs).
Before reading this section, you should read the Queries Dataset Vignette from {admiral}.
On a study you would need to create your own version of this queries dataset with the AE queries you need, but for the purpose of this article we use the example provided as part of {admiral} and filter it to just one CQ and one SMQ. You can then derive the required variables using the admiral::derive_vars_query() function.
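A minimal sketch of the query derivation, applied directly to the ae domain so that it runs on its own. Which queries to keep is an assumption here: the prefixes CQ01 and SMQ02 are taken to be present in the admiral::queries example dataset, and the column name PREFIX assumes a recent {admiral} release.

```r
library(admiral)
library(dplyr)

# Input AE data, with SAS blanks converted to NA as earlier in the article
ae <- convert_blanks_to_na(pharmaversesdtm::ae)

# Assumption: keep one custom query and one SMQ from the {admiral} example queries
queries <- admiral::queries %>%
  filter(PREFIX %in% c("CQ01", "SMQ02"))

# Adds one set of query variables (for example CQ01NAM) per prefix in `queries`
adae_queries <- derive_vars_query(ae, dataset_queries = queries)

# Inspect which query variables were created
setdiff(names(adae_queries), names(ae))
```

In the article's pipeline the same call would be applied to adae rather than to ae.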
Add ADSL Variables
If needed, the other ADSL variables can now be added. The ADSL variables that were already merged are held in adsl_vars.
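One way to add the remaining variables without duplicating those already merged is sketched below on invented data. The toy datasets and the contents of adsl_vars are assumptions for illustration. admiral::negate_vars() is used to exclude the already-merged variables from ADSL before the join.

```r
library(admiral)
library(dplyr)
library(tibble)

# Toy ADSL, and a toy ADAE that already carries TRTSDT and TRTEDT
adsl_toy <- tribble(
  ~STUDYID, ~USUBJID, ~TRTSDT,               ~TRTEDT,               ~AGE, ~SEX,
  "S1",     "01",     as.Date("2020-01-01"), as.Date("2020-03-01"), 64,   "F",
  "S1",     "02",     as.Date("2020-02-01"), as.Date("2020-04-01"), 71,   "M"
)
adae_toy <- tribble(
  ~STUDYID, ~USUBJID, ~AESEQ, ~TRTSDT,               ~TRTEDT,
  "S1",     "01",     1,      as.Date("2020-01-01"), as.Date("2020-03-01"),
  "S1",     "02",     1,      as.Date("2020-02-01"), as.Date("2020-04-01")
)

# Variables merged earlier (assumed for this sketch)
adsl_vars <- exprs(TRTSDT, TRTEDT)

# Drop the already-merged variables from ADSL, then merge everything else
adae_full <- derive_vars_merged(
  adae_toy,
  dataset_add = select(adsl_toy, !!!negate_vars(adsl_vars)),
  by_vars = exprs(STUDYID, USUBJID)
)

names(adae_full)
```

Because new_vars is not specified, all remaining variables of the reduced ADSL are added, and TRTSDT and TRTEDT appear only once.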
Apply Metadata to Create an eSub XPT and Perform Associated Checks
We use {metatools} and {xportr} to perform checks, apply metadata such as types, lengths, and labels, and write the dataset to an XPT file.
dir <- tempdir() # Specify the directory for saving the XPT file
adae %>%
  drop_unspec_vars(metacore) %>% # Drop unspecified variables from specs
  check_variables(metacore) %>% # Check all variables specified are present and no more
  check_ct_data(metacore, na_acceptable = TRUE) %>% # Checks all variables with CT only contain values within the CT
  order_cols(metacore) %>% # Orders the columns according to the spec
  sort_by_key(metacore) %>% # Sorts the rows by the sort keys
  xportr_type(metacore, domain = "ADAE") %>% # Coerce variable type to match spec
  xportr_length(metacore) %>% # Assigns SAS length from a variable level metadata
  xportr_label(metacore) %>% # Assigns variable label from metacore specifications
  xportr_df_label(metacore) %>% # Assigns dataset label from metacore specifications
  xportr_write(file.path(dir, "adae.xpt"), metadata = metacore, domain = "ADAE")