ADVS
Introduction
This article provides a step-by-step explanation for creating an ADaM ADVS
(Vital Signs) dataset using key pharmaverse packages along with tidyverse components.
For the purpose of this example, we will use the ADSL dataset from {pharmaverseadam} and the VS domain from {pharmaversesdtm}.
Programming Flow
- Load Data and Required pharmaverse Packages
- Load Specifications for Metacore
- Select ADSL Variables
- Start Building Derivations
- Assign PARAMCD, PARAM, PARAMN
- Derive Results and Units (AVAL, AVALU)
- Derive Additional Parameters (e.g. MAP, BMI or BSA for ADVS)
- Derive Timing Variables (e.g. AVISIT, ATPT, ATPTN)
- Derive summary records (e.g. mean of the triplicates at each time point)
- Timing Flag Variables (e.g. ONTRTFL)
- Assign Reference Range Indicator (ANRIND)
- Derive Baseline (BASETYPE, ABLFL, BASE, BNRIND)
- Derive Change from Baseline (CHG, PCHG)
- Derive Analysis Flags (e.g. ANL01FL)
- Assign Treatment (TRTA, TRTP)
- Assign ASEQ
- Derive Categorization Variables (AVALCATy)
- Assign Parameter Level Values (PARAM, PARAMN)
- Add ADSL variables
- Apply Metadata and eSub Checks
Load Data and Required pharmaverse Packages
First we will load the packages required for our project. We will use {admiral} for the creation of analysis data. {admiral} requires {dplyr}, {lubridate} and {stringr}. Find {admiral} functions and related variables by searching admiraldiscovery. We will use {metacore} and {metatools} to store and manipulate metadata from our specifications. We will use {xportr} to perform checks on the final data and export to a transport file.
Then we will load our input data.
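The loading step described above might look like the following sketch. The package list mirrors the packages named in this section. The object names adsl and vs, and the call to convert_blanks_to_na(), are assumptions made for illustration and are not stated in the text above.

```r
# Packages named in the text above
library(admiral)
library(dplyr)
library(lubridate)
library(stringr)
library(metacore)
library(metatools)
library(xportr)

# Input data: ADSL from {pharmaverseadam}, VS from {pharmaversesdtm}
adsl <- pharmaverseadam::adsl
vs <- pharmaversesdtm::vs

# Assumption: blank character values are converted to NA before deriving,
# which is a common first step when the source data originate from SAS
vs <- convert_blanks_to_na(vs)
```

Both input objects are ordinary data frames, so the usual {dplyr} verbs can be used to inspect them before starting the derivations.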
Load Specifications for Metacore
We have saved our specifications in an Excel file and will load them into {metacore} with the metacore::spec_to_metacore() function.
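A minimal sketch of this step is shown below. The file location is hypothetical, and the where_sep_sheet setting is an assumption about how the workbook is laid out; both need to be adapted to the actual specification file. The call is guarded so that the sketch does not fail when the file is absent.

```r
library(metacore)

# Hypothetical location of the specification workbook; adjust to your project
spec_path <- file.path("metadata", "safety_specs.xlsx")

if (file.exists(spec_path)) {
  # Assumption: the workbook has no separate sheet for where clauses
  spec_all <- spec_to_metacore(path = spec_path, where_sep_sheet = FALSE)

  # Keep only the metadata describing ADVS
  metacore <- select_dataset(spec_all, "ADVS")
}
```

Subsetting to a single dataset up front means later {metatools} and {xportr} calls can be given the object without repeating the dataset name in every step.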
Select ADSL Variables
Some variables from the ADSL dataset required for the derivations are merged into the VS domain using the admiral::derive_vars_merged() function. The rest of the relevant ADSL variables are added later.
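The merge can be sketched as follows on miniature, made-up data. The data values, the *_demo object names, and the choice of ADSL variables (treatment dates and treatment variables, which later steps of this article refer to) are assumptions for illustration.

```r
library(admiral)
library(dplyr)
library(tibble)

# Hypothetical miniature inputs standing in for VS and ADSL
vs_demo <- tribble(
  ~STUDYID, ~USUBJID, ~VSTESTCD, ~VSSTRESN,
  "XX01",   "01-001", "SYSBP",   121,
  "XX01",   "01-001", "DIABP",    79,
  "XX01",   "01-002", "SYSBP",   133
)
adsl_demo <- tibble(
  STUDYID = "XX01",
  USUBJID = c("01-001", "01-002"),
  TRTSDT = as.Date(c("2014-01-02", "2014-02-10")),
  TRTEDT = as.Date(c("2014-07-02", "2014-08-10")),
  TRT01P = c("Placebo", "Active"),
  TRT01A = c("Placebo", "Active"),
  AGE = c(63, 58)
)

# Assumption: the ADSL variables needed by the later derivations
adsl_vars <- exprs(TRTSDT, TRTEDT, TRT01A, TRT01P)

# Only the variables listed in new_vars are merged; AGE is left behind
advs_demo <- derive_vars_merged(
  vs_demo,
  dataset_add = adsl_demo,
  new_vars = adsl_vars,
  by_vars = exprs(STUDYID, USUBJID)
)
```

Keeping the selected variables in a list such as adsl_vars allows the same list to be reused later, for example inside by_vars when new records are created.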
Start Building Derivations
The first derivation step computes the Analysis Date (ADT) and the Relative Analysis Day (ADY) using the variables merged from the ADSL dataset. The resulting dataset contains the two new columns ADT and ADY.
# Calculate ADT, ADY
advs <- advs %>%
  derive_vars_dt(
    new_vars_prefix = "A",
    dtc = VSDTC,
    # Below arguments are default values and not necessary to add in our case
    highest_imputation = "n", # means no imputation is performed on partial/missing dates
    flag_imputation = "auto" # To automatically create ADTF variable when highest_imputation is "Y", "M" or "D"
  ) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ADT)
  )
Assign PARAMCD, PARAM, PARAMN
To assign parameter level values such as PARAMCD, PARAM, PARAMN, etc., a lookup can be created to join to the source data.
For example, when creating ADVS, a lookup based on the SDTM --TESTCD value may be created:
VSTESTCD | PARAMCD | PARAM | PARAMN |
---|---|---|---|
SYSBP | SYSBP | Systolic Blood Pressure (mmHg) | 1 |
DIABP | DIABP | Diastolic Blood Pressure (mmHg) | 2 |
PULSE | PULSE | Pulse Rate (beats/min) | 3 |
WEIGHT | WEIGHT | Weight (kg) | 4 |
HEIGHT | HEIGHT | Height (cm) | 5 |
TEMP | TEMP | Temperature (C) | 6 |
MAP | MAP | Mean Arterial Pressure | 7 |
BMI | BMI | Body Mass Index (kg/m^2) | 8 |
BSA | BSA | Body Surface Area (m^2) | 9 |
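The table above can be written as a tibble so that it can be joined to the data. The object name param_lookup matches the name used in the join further down; building it with tibble::tribble() is an assumption about how the lookup is created.

```r
library(tibble)

# Lookup keyed on VSTESTCD, transcribed from the table above
param_lookup <- tribble(
  ~VSTESTCD, ~PARAMCD, ~PARAM,                            ~PARAMN,
  "SYSBP",   "SYSBP",  "Systolic Blood Pressure (mmHg)",  1,
  "DIABP",   "DIABP",  "Diastolic Blood Pressure (mmHg)", 2,
  "PULSE",   "PULSE",  "Pulse Rate (beats/min)",          3,
  "WEIGHT",  "WEIGHT", "Weight (kg)",                     4,
  "HEIGHT",  "HEIGHT", "Height (cm)",                     5,
  "TEMP",    "TEMP",   "Temperature (C)",                 6,
  "MAP",     "MAP",    "Mean Arterial Pressure",          7,
  "BMI",     "BMI",    "Body Mass Index (kg/m^2)",        8,
  "BSA",     "BSA",    "Body Surface Area (m^2)",         9
)
```

The rows for MAP, BMI and BSA have no counterpart in the collected VS data; they are included so that the same lookup can supply PARAM and PARAMN once those parameters have been derived.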
This lookup may now be joined to the source data:
At this stage, only PARAMCD is required to perform the derivations. Additional derived parameters may be added, so only PARAMCD is joined to the datasets at this point. All other variables related to PARAMCD (e.g. PARAM, PARAMN, …) will be added when all PARAMCD are derived.
advs <- advs %>%
  # Add PARAMCD only - add PARAM etc later
  derive_vars_merged_lookup(
    dataset_add = param_lookup,
    new_vars = exprs(PARAMCD),
    by_vars = exprs(VSTESTCD),
    # Below arguments are default values and not necessary to add in our case
    print_not_mapped = TRUE # Printing whether some parameters are not mapped
  )
All `VSTESTCD` are mapped.
Derive Results and Units (AVAL, AVALU)
The mapping of AVAL and AVALU is left to the ADaM programmer. An example mapping may be:
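One possible mapping is sketched below on made-up data. It assumes that the standardized numeric result VSSTRESN and the standardized unit VSSTRESU are the appropriate sources; other studies may need a different mapping.

```r
library(dplyr)
library(tibble)

# Hypothetical VS records
vs_demo <- tribble(
  ~USUBJID, ~VSTESTCD, ~VSSTRESN, ~VSSTRESU,
  "01-001", "SYSBP",   121,       "mmHg",
  "01-001", "WEIGHT",   78.5,     "kg"
)

# Assumption: analysis value and unit are taken directly from the
# standardized SDTM result variables
advs_demo <- vs_demo %>%
  mutate(
    AVAL = VSSTRESN,
    AVALU = VSSTRESU
  )
```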
In this example, as is often the case for ADVS, all AVAL values are numeric without any corresponding non-redundant text value for AVALC. Per recommendation in ADaMIG v1.3 we do not map AVALC.
Derive Additional Parameters (e.g. MAP, BMI or BSA for ADVS)
Optionally derive new parameters creating PARAMCD and AVAL. Note that only variables specified in the by_vars argument will be populated in the newly created records. This is relevant to the functions admiral::derive_param_map(), admiral::derive_param_bsa(), admiral::derive_param_bmi(), and admiral::derive_param_qtc().
Below is an example of creating Mean Arterial Pressure for ADVS using the wrapper function admiral::derive_param_map():
advs <- advs %>%
  derive_param_map(
    by_vars = exprs(STUDYID, USUBJID, !!!adsl_vars, VISIT, VISITNUM, ADT, ADY, VSTPT, VSTPTNUM, AVALU), # Other variables than the defined ones here won't be populated
    set_values_to = exprs(PARAMCD = "MAP"),
    get_unit_expr = VSSTRESU,
    filter = VSSTAT != "NOT DONE" | is.na(VSSTAT),
    # Below arguments are default values and not necessary to add in our case
    sysbp_code = "SYSBP",
    diabp_code = "DIABP",
    hr_code = NULL
  )
Similarly, we could create Body Mass Index (BMI) for ADVS using the wrapper function admiral::derive_param_bmi(). Instead, the example below uses the more generic function admiral::derive_param_computed(). Note that if height is collected only once, use constant_parameters to define the corresponding parameter, which will be merged to the other parameters, and constant_by_vars to specify the subject-level variable to merge on. Otherwise BMI is only calculated for visits where both parameters HEIGHT and WEIGHT are collected.
advs <- advs %>%
  derive_param_computed(
    by_vars = exprs(STUDYID, USUBJID, VISIT, VISITNUM, ADT, ADY, VSTPT, VSTPTNUM),
    parameters = "WEIGHT",
    set_values_to = exprs(
      AVAL = AVAL.WEIGHT / (AVAL.HEIGHT / 100)^2,
      PARAMCD = "BMI",
      AVALU = "kg/m^2"
    ),
    constant_parameters = c("HEIGHT"),
    constant_by_vars = exprs(USUBJID)
  )
Likewise, the wrapper function admiral::derive_param_bsa() is called below to create the parameter Body Surface Area (BSA) for ADVS. Note that if height is collected only once, use constant_by_vars to specify the subject-level variable to merge on. Otherwise BSA is only calculated for visits where both parameters HEIGHT and WEIGHT are collected.
advs <- advs %>%
  derive_param_bsa(
    by_vars = exprs(STUDYID, USUBJID, !!!adsl_vars, VISIT, VISITNUM, ADT, ADY, VSTPT, VSTPTNUM),
    method = "Mosteller",
    set_values_to = exprs(
      PARAMCD = "BSA",
      AVALU = "m^2"
    ),
    get_unit_expr = VSSTRESU,
    filter = VSSTAT != "NOT DONE" | is.na(VSSTAT),
    constant_by_vars = exprs(USUBJID),
    # Below arguments are default values and not necessary to add in our case
    height_code = "HEIGHT",
    weight_code = "WEIGHT"
  )
Derive Timing Variables (e.g. AVISIT, ATPT, ATPTN)
Categorical timing variables are protocol and analysis dependent. Below is a simple example.
advs <- advs %>%
  mutate(
    ATPTN = VSTPTNUM,
    ATPT = VSTPT,
    AVISIT = case_when(
      str_detect(VISIT, "SCREEN|UNSCHED|RETRIEVAL|AMBUL") ~ NA_character_,
      !is.na(VISIT) ~ str_to_title(VISIT),
      TRUE ~ NA_character_
    ),
    AVISITN = as.numeric(case_when(
      VISIT == "BASELINE" ~ "0",
      str_detect(VISIT, "WEEK") ~ str_trim(str_replace(VISIT, "WEEK", "")),
      TRUE ~ NA_character_
    ))
  )
For assigning visits based on time windows and deriving periods, subperiods, and phase variables see the “Visit and Period Variables” vignette.
Derive summary records (e.g. mean of the triplicates at each time point)
For adding new records based on aggregating records, admiral::derive_summary_records() can be used. For the new records, only the variables specified by by_vars and set_values_to are populated.
For each subject, Vital Signs parameter, visit, and date, add a record holding the average value for observations on that date. Set DTYPE to AVERAGE.
advs <- derive_summary_records(
  dataset = advs,
  dataset_add = advs, # Observations from the specified dataset are going to be used to calculate and added as new records to the input dataset.
  by_vars = exprs(STUDYID, USUBJID, !!!adsl_vars, PARAMCD, AVISITN, AVISIT, ADT, ADY, AVALU),
  filter_add = !is.na(AVAL),
  set_values_to = exprs(
    AVAL = mean(AVAL),
    DTYPE = "AVERAGE"
  )
)
Timing Flag Variables (e.g. ONTRTFL)
In some analyses, it may be necessary to flag an observation as on-treatment. The admiral function admiral::derive_var_ontrtfl() can be used.
For example, if on-treatment is defined as any observation between treatment start and treatment end, the flag may be derived as:
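A sketch of such a call on made-up data follows. The data values and the filter_pre_timepoint condition (treating a baseline assessment taken on the treatment start date as not on-treatment) are assumptions for illustration.

```r
library(admiral)
library(tibble)

# Hypothetical records for one subject treated from 2014-01-02 to 2014-07-02
advs_demo <- tibble(
  USUBJID = "01-001",
  ADT = as.Date(c("2013-12-20", "2014-01-02", "2014-03-15", "2014-08-01")),
  AVISIT = c("Screening", "Baseline", "Week 10", "Week 30"),
  TRTSDT = as.Date("2014-01-02"),
  TRTEDT = as.Date("2014-07-02")
)

advs_demo <- derive_var_ontrtfl(
  advs_demo,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT,
  # Assumption: a baseline assessment on the treatment start date
  # counts as pre-treatment and is therefore not flagged
  filter_pre_timepoint = toupper(AVISIT) == "BASELINE"
)
```

In this sketch only the Week 10 record falls inside the treatment window and is flagged; the records before treatment start, the baseline record on the start date, and the record after treatment end are left unflagged.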
Assign Reference Range Indicator (ANRIND)
The admiral function derive_var_anrind() may be used to derive the reference range indicator ANRIND.
This function requires the reference range boundaries to exist on the data frame (ANRLO, ANRHI) and also accommodates the additional boundaries A1LO and A1HI.
The function is called as:
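The sketch below first merges reference ranges onto made-up data and then calls the function with its default arguments. The range values are hypothetical, chosen for illustration only, and the name range_lookup is an assumption.

```r
library(admiral)
library(dplyr)
library(tibble)

# Hypothetical reference ranges, for illustration only
range_lookup <- tribble(
  ~PARAMCD, ~ANRLO, ~ANRHI, ~A1LO, ~A1HI,
  "SYSBP",      90,    130,    70,   140,
  "DIABP",      60,     80,    40,    90
)

# Hypothetical analysis records
advs_demo <- tribble(
  ~USUBJID, ~PARAMCD, ~AVAL,
  "01-001", "SYSBP",     85,
  "01-001", "SYSBP",    120,
  "01-001", "DIABP",     95
)

advs_demo <- advs_demo %>%
  # Bring ANRLO, ANRHI, A1LO and A1HI onto each record
  derive_vars_merged(dataset_add = range_lookup, by_vars = exprs(PARAMCD)) %>%
  # With the defaults, only ANRLO and ANRHI are used for the comparison
  derive_var_anrind()
```

With these made-up ranges the three records are classified as LOW, NORMAL and HIGH respectively.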
Derive Baseline (BASETYPE, ABLFL, BASE, BNRIND)
The BASETYPE should be derived using the function admiral::derive_basetype_records(). The basetypes parameter of this function requires a named list of expressions detailing how the BASETYPE should be assigned. Note that if a record satisfies multiple expressions in basetypes, a row will be produced for each BASETYPE.
advs <- derive_basetype_records(
  dataset = advs,
  basetypes = exprs(
    "LAST: AFTER LYING DOWN FOR 5 MINUTES" = ATPTN == 815,
    "LAST: AFTER STANDING FOR 1 MINUTE" = ATPTN == 816,
    "LAST: AFTER STANDING FOR 3 MINUTES" = ATPTN == 817,
    "LAST" = is.na(ATPTN)
  )
)
count(advs, ATPT, ATPTN, BASETYPE)
# A tibble: 4 × 4
ATPT ATPTN BASETYPE n
<chr> <dbl> <chr> <int>
1 AFTER LYING DOWN FOR 5 MINUTES 815 LAST: AFTER LYING DOWN FOR 5 MINUT… 10944
2 AFTER STANDING FOR 1 MINUTE 816 LAST: AFTER STANDING FOR 1 MINUTE 10938
3 AFTER STANDING FOR 3 MINUTES 817 LAST: AFTER STANDING FOR 3 MINUTES 10942
4 <NA> NA LAST 29184
It is important to derive BASETYPE first so that it can be utilized in subsequent derivations. This will be important if the data frame contains multiple values for BASETYPE.
Next, the analysis baseline flag ABLFL can be derived using the {admiral} function admiral::derive_var_extreme_flag(). For example, if baseline is defined as the last non-missing AVAL prior to or on TRTSDT, the function call for ABLFL would be:
advs <- restrict_derivation(
  advs,
  derivation = derive_var_extreme_flag,
  args = params(
    by_vars = exprs(STUDYID, USUBJID, BASETYPE, PARAMCD),
    order = exprs(ADT, VISITNUM, VSSEQ),
    new_var = ABLFL,
    mode = "last", # Determines whether the first or last observation is flagged
    # Below arguments are default values and not necessary to add in our case
    true_value = "Y"
  ),
  filter = (!is.na(AVAL) &
    ADT <= TRTSDT & !is.na(BASETYPE) & is.na(DTYPE)
  )
)
Lastly, the BASE and BNRIND columns can be derived using the {admiral} function admiral::derive_var_base(). Example calls are:
advs <- derive_var_base(
  advs,
  by_vars = exprs(STUDYID, USUBJID, PARAMCD, BASETYPE),
  source_var = AVAL,
  new_var = BASE,
  # Below arguments are default values and not necessary to add in our case
  filter = ABLFL == "Y"
)
advs <- derive_var_base(
  advs,
  by_vars = exprs(STUDYID, USUBJID, PARAMCD, BASETYPE),
  source_var = ANRIND,
  new_var = BNRIND
)
Derive Change from Baseline (CHG, PCHG)
Change and percent change from baseline can be derived using the {admiral} functions admiral::derive_var_chg() and admiral::derive_var_pchg(). These functions expect AVAL and BASE to exist in the data frame. CHG is simply AVAL - BASE, and PCHG is (AVAL - BASE) / abs(BASE) * 100. If the variables should not be derived for all records, e.g., for post-baseline records only, admiral::restrict_derivation() can be used. Example calls are:
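A sketch of these calls on made-up data is given below. It assumes that post-baseline records are those with AVISITN > 0; the filter condition and the data values are illustrative only.

```r
library(admiral)
library(dplyr)
library(tibble)

# Hypothetical records with BASE already derived
advs_demo <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISITN, ~AVAL, ~BASE,
  "01-001", "SYSBP",         0,   120,   120,
  "01-001", "SYSBP",         2,   132,   120,
  "01-001", "SYSBP",         4,   114,   120
)

advs_demo <- advs_demo %>%
  # Assumption: post-baseline records are those with AVISITN > 0
  restrict_derivation(derivation = derive_var_chg, filter = AVISITN > 0) %>%
  restrict_derivation(derivation = derive_var_pchg, filter = AVISITN > 0) %>%
  # restrict_derivation() may reorder rows, so sort again for display
  arrange(USUBJID, PARAMCD, AVISITN)
```

For the record at AVISITN 2 this gives CHG = 132 - 120 = 12 and PCHG = 12 / 120 * 100 = 10, while the baseline record keeps missing values for both variables.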
Derive Analysis Flags (e.g. ANL01FL)
In most findings ADaMs, an analysis flag is derived to identify the appropriate observation(s) to use for a particular analysis when a subject has multiple observations within a particular timing period.
In this situation, an analysis flag (e.g. ANLzzFL) may be used to choose the appropriate record for analysis.
This flag may be derived using the {admiral} function admiral::derive_var_extreme_flag(). For this example, we will assume we would like to choose within the Post-Baseline records the latest and highest value by USUBJID, PARAMCD, AVISIT, and ATPT.
advs <- restrict_derivation(
  advs,
  derivation = derive_var_extreme_flag,
  args = params(
    new_var = ANL01FL,
    by_vars = exprs(STUDYID, USUBJID, PARAMCD, AVISIT, ATPT, DTYPE),
    order = exprs(ADT, AVAL),
    mode = "last", # Determines whether the first or last observation is flagged - As seen while deriving ABLFL
    # Below arguments are default values and not necessary to add in our case
    true_value = "Y"
  ),
  filter = !is.na(AVISITN) & ONTRTFL == "Y"
)
Assign Treatment (TRTA, TRTP)
TRTA and TRTP must match at least one value of the character treatment variables in ADSL (e.g., TRTxxA/TRTxxP, TRTSEQA/TRTSEQP, TRxxAGy/TRxxPGy).
An example of a simple implementation for a study without periods could be:
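For a study without periods, the assignment can be sketched as a direct copy of the first-period ADSL treatment variables, followed by a count to check the result. The data below are made up; the variable names TRT01P and TRT01A are those appearing in the count output shown in this section.

```r
library(dplyr)
library(tibble)

# Hypothetical records with ADSL treatment variables already merged
advs_demo <- tribble(
  ~USUBJID, ~TRT01P,                ~TRT01A,
  "01-001", "Placebo",              "Placebo",
  "01-002", "Xanomeline High Dose", "Xanomeline Low Dose"
)

advs_demo <- advs_demo %>%
  mutate(
    TRTP = TRT01P, # Planned treatment
    TRTA = TRT01A  # Actual treatment
  )

# Cross-check the assignment against the ADSL variables
count(advs_demo, TRTP, TRTA, TRT01P, TRT01A)
```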
# A tibble: 5 × 5
TRTP TRTA TRT01P TRT01A n
<chr> <chr> <chr> <chr> <int>
1 Placebo Placebo Placebo Placebo 22102
2 Xanomeline High Dose Xanomeline High Dose Xanomeline High Dose Xanomeli… 16782
3 Xanomeline High Dose Xanomeline Low Dose Xanomeline High Dose Xanomeli… 1038
4 Xanomeline Low Dose Xanomeline Low Dose Xanomeline Low Dose Xanomeli… 17986
5 <NA> <NA> <NA> <NA> 4100
For studies with periods see the “Visit and Period Variables” vignette.
Assign ASEQ
The {admiral} function admiral::derive_var_obs_number() can be used to derive ASEQ. An example call is:
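A sketch on made-up data follows. The sort order used here (PARAMCD, then ADT) is an assumption; a real ADVS would typically need further variables in order so that every record within a subject is uniquely identified.

```r
library(admiral)
library(tibble)

# Hypothetical records for two subjects
advs_demo <- tibble(
  STUDYID = "XX01",
  USUBJID = c("01-001", "01-001", "01-001", "01-002"),
  PARAMCD = c("SYSBP", "SYSBP", "DIABP", "SYSBP"),
  ADT = as.Date(c("2014-01-02", "2014-01-16", "2014-01-02", "2014-02-10"))
)

advs_demo <- derive_var_obs_number(
  advs_demo,
  new_var = ASEQ,
  by_vars = exprs(STUDYID, USUBJID), # Numbering restarts for each subject
  order = exprs(PARAMCD, ADT),       # Assumed sort order within subject
  check_type = "error"               # Stop if the order does not identify records uniquely
)
```

Setting check_type to "error" is a defensive choice: if two records tie on the by and order variables, the call fails instead of assigning sequence numbers in an arbitrary order.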
Derive Categorization Variables (AVALCATy)
We can use the admiral::derive_vars_cat() function to derive the categorization variables.
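The sketch below categorizes height on made-up data. The cut-off of 140 cm, the category labels, and the name avalcat_lookup are hypothetical and serve only to show the shape of the definition; the function is assumed to be available in the installed {admiral} version.

```r
library(admiral)
library(tibble)

# Hypothetical analysis records
advs_demo <- tribble(
  ~USUBJID, ~PARAMCD, ~AVAL,
  "01-001", "HEIGHT",   172,
  "01-002", "HEIGHT",   138,
  "01-001", "WEIGHT",    80
)

# Hypothetical definition: one row per category, with the condition
# evaluated only for records of the given PARAMCD
avalcat_lookup <- exprs(
  ~PARAMCD, ~condition,  ~AVALCAT1,  ~AVALCA1N,
  "HEIGHT", AVAL > 140,  ">140 cm",  1,
  "HEIGHT", AVAL <= 140, "<=140 cm", 2
)

advs_demo <- derive_vars_cat(
  advs_demo,
  definition = avalcat_lookup,
  by_vars = exprs(PARAMCD)
)
```

Records of parameters that are not mentioned in the definition, such as WEIGHT here, receive missing values for AVALCAT1 and AVALCA1N.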
Assign Parameter Level Values (PARAM, PARAMN)
When all PARAMCD have been derived and added to the dataset, the other information from the look-up table (PARAM, PARAMN, …) should be added using the admiral::derive_vars_merged() function.
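This merge can be sketched as follows on made-up data, using a shortened copy of the lookup. The *_demo names and data values are assumptions for illustration.

```r
library(admiral)
library(dplyr)
library(tibble)

# Shortened, hypothetical copy of the parameter lookup
param_lookup_demo <- tribble(
  ~VSTESTCD, ~PARAMCD, ~PARAM,                            ~PARAMN,
  "SYSBP",   "SYSBP",  "Systolic Blood Pressure (mmHg)",  1,
  "DIABP",   "DIABP",  "Diastolic Blood Pressure (mmHg)", 2,
  "MAP",     "MAP",    "Mean Arterial Pressure",          7
)

# Hypothetical analysis records, including a derived parameter (MAP)
advs_demo <- tribble(
  ~USUBJID, ~PARAMCD, ~AVAL,
  "01-001", "SYSBP",    121,
  "01-001", "DIABP",     79,
  "01-001", "MAP",       93
)

advs_demo <- derive_vars_merged(
  advs_demo,
  # Drop VSTESTCD so that only PARAM and PARAMN are added
  dataset_add = select(param_lookup_demo, -VSTESTCD),
  by_vars = exprs(PARAMCD)
)
```

Joining on PARAMCD instead of VSTESTCD is what allows derived parameters such as MAP, which have no source test code on the collected records, to receive their PARAM and PARAMN.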
Another way to assign parameter-level values is by using the {metatools} package with the {metacore} objects we created at the beginning. To use the metatools::create_var_from_codelist() function, certain prerequisites must be met. Specifically, this function relies on code/decode pairs from a {metacore} object. Therefore, these pairs must be defined in the corresponding ADaM specifications before creating the {metacore} object.
We can look into the {metacore} object and see these pairs for the PARAM variable.
# A tibble: 9 × 2
code decode
<chr> <chr>
1 SYSBP Systolic Blood Pressure
2 DIABP Diastolic Blood Pressure
3 PULSE Pulse Rate
4 WEIGHT Weight
5 HEIGHT Height
6 TEMP Temperature
7 MAP Mean Arterial Pressure
8 BMI Body Mass Index (kg/m^2)
9 BSA Body Surface Area (m^2)
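A sketch of how the codelist might be applied is given below. It is wrapped in a helper function with the hypothetical name add_param_from_codelist() so that it can be defined without a specification file at hand; spec is assumed to be a {metacore} object subset to ADVS whose PARAM codelist holds the PARAMCD values as codes.

```r
# Sketch only: `spec` is assumed to be a {metacore} object for ADVS whose
# PARAM codelist contains code/decode pairs like those displayed above
add_param_from_codelist <- function(data, spec) {
  metatools::create_var_from_codelist(
    data,
    spec,
    input_var = PARAMCD,
    out_var = PARAM,
    decode_to_code = FALSE # input_var holds the code, so translate code to decode
  )
}

# The code/decode pairs can be inspected with, for example:
#   metacore::get_control_term(spec, variable = PARAM)
# In this article's pipeline the helper would be used as:
#   advs <- add_param_from_codelist(advs, metacore)
```

Compared with a hand-written lookup, this approach keeps the parameter texts in one place, the specification, at the cost of requiring the codelist to be complete before programming starts.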
Add ADSL variables
If needed, the other ADSL variables can now be added. The ADSL variables that were already merged are held in adsl_vars.
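One way to add the remaining variables without duplicating those already present is sketched below on made-up data. The *_demo names and values are hypothetical; admiral::negate_vars() is used to exclude the variables that were merged earlier.

```r
library(admiral)
library(dplyr)
library(tibble)

# Hypothetical ADSL and analysis data; TRTSDT was merged at the start
adsl_demo <- tibble(
  STUDYID = "XX01",
  USUBJID = c("01-001", "01-002"),
  TRTSDT = as.Date(c("2014-01-02", "2014-02-10")),
  AGE = c(63, 58),
  SEX = c("F", "M")
)
advs_demo <- tibble(
  STUDYID = "XX01",
  USUBJID = c("01-001", "01-001", "01-002"),
  PARAMCD = c("SYSBP", "DIABP", "SYSBP"),
  AVAL = c(121, 79, 133),
  TRTSDT = as.Date(c("2014-01-02", "2014-01-02", "2014-02-10"))
)

# Variables already present in the analysis dataset
adsl_vars_demo <- exprs(TRTSDT)

advs_demo <- derive_vars_merged(
  advs_demo,
  # Exclude the already merged variables, keep everything else from ADSL
  dataset_add = select(adsl_demo, !!!negate_vars(adsl_vars_demo)),
  by_vars = exprs(STUDYID, USUBJID)
)
```

Excluding the previously merged variables avoids a clash between columns that would otherwise exist in both datasets.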
Apply Metadata and eSub Checks
We use {metatools} and {xportr} to perform checks, apply metadata such as types, lengths, labels, and write the dataset to an XPT file.
dir <- tempdir() # Specify the directory for saving the XPT file

# Apply metadata and perform checks
advs_prefinal <- advs %>%
  drop_unspec_vars(metacore) %>% # Drop unspecified variables from specs
  check_variables(metacore, dataset_name = "ADVS") %>% # Check all variables specified are present and no more
  order_cols(metacore) %>% # Orders the columns according to the spec
  sort_by_key(metacore) # Sorts the rows by the sort keys

# Apply types, lengths, labels, and formats, then export the dataset to an XPT file
advs_final <- advs_prefinal %>%
  xportr_type(metacore) %>%
  xportr_length(metacore) %>%
  xportr_label(metacore) %>%
  xportr_format(metacore, domain = "ADVS") %>%
  xportr_df_label(metacore, domain = "ADVS") %>%
  xportr_write(file.path(dir, "advs.xpt"), metadata = metacore, domain = "ADVS")