## Main Idea

The main idea of admiral is that an ADaM dataset is built by a sequence of derivations. Each derivation adds one or more variables or parameters to the processed dataset. This modular approach makes it easy to adjust code by adding, removing, or modifying derivations. Each derivation is a function call. Consider for example the following script which creates a (very simple) ADSL dataset.

### Load Packages and Example Datasets

First, we will load our packages and example datasets to help with our `ADSL` creation. The dplyr and lubridate packages are tidyverse packages and used heavily throughout this script. The admiral package also leverages the `{admiral.test}` package for example SDTM datasets which are from the CDISC Pilot Study.

### Derive Treatment Variables (`TRT0xP`, `TRT0xA`)

The mapping of the treatment variables is left to the ADaM programmer. An example mapping may be:
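
As a minimal sketch of such a mapping, the planned and actual arm can be copied into `TRT01P` and `TRT01A` with `dplyr::mutate()`. The toy `dm` data frame and the choice of `ARM` and `ACTARM` as source variables are assumptions made for illustration; the appropriate mapping depends on the study.

```r
library(dplyr)

# Toy DM data (hypothetical values, for illustration only)
dm <- tibble(
  STUDYID = "XX01",
  USUBJID = c("XX01-001", "XX01-002"),
  ARM = c("Placebo", "Drug A"),
  ACTARM = c("Placebo", "Drug A")
)

# Map planned and actual treatment for period 01 (assumed mapping)
adsl <- dm %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)
```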

### Derive/Impute Numeric Treatment Date/Time and Flag Variables (`TRTSDTM`, `TRTEDTM`, `TRTSTMF`, `TRTETMF`)

The function `derive_vars_dtm()` can be used to convert the DTC variables from `EX` to numeric datetime variables and impute missing components. The function call returns the original data frame with the columns `EXSTDTM` and `EXENDTM` and the corresponding time imputation flag variables `EXSTTMF` and `EXENTMF` added to the end of the data frame. Exposure observations with an incomplete date are ignored. We impute missing time to be 23:59:59 using `time_imputation = "last"`. The required imputation flags are determined automatically by the function. Here only time imputation flags are derived because time is imputed but date is not imputed.

Don’t be intimidated by the number of arguments! We try to make our arguments self-explanatory, e.g. `new_vars_prefix = "EXST"` places `EXST` at the start of the `--DTM` variable name and `time_imputation = "last"` sets a missing time to 23:59:59. However, it is not always possible to make every argument self-explanatory. The reference documentation of `derive_vars_dtm()` describes each argument in more detail.

```
# Derive treatment variables
## Impute time of exposure dates (creates numeric datetime and time imputation flag variables)
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

## Derive variables for first/last treatment date and time imputation flags
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = !is.na(EXSTDTM),
    new_vars = vars(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = vars(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = vars(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = !is.na(EXENDTM),
    new_vars = vars(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = vars(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = vars(STUDYID, USUBJID)
  )
```

### Derive Date Variables from Date/Time Variables (`TRTSDT`, `TRTEDT`)

The datetime variables returned can be converted to dates using the `derive_vars_dtm_to_dt()` function.

```
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = vars(TRTSDTM, TRTEDTM))
```

### Derive Treatment Duration (`TRTDURD`)

Now that `TRTSDT` and `TRTEDT` are derived, the function `derive_var_trtdurd()` can be used to calculate the treatment duration (`TRTDURD`). Notice the lack of inputs. The function defaults are set to `TRTSDT` and `TRTEDT`. The reference documentation of `derive_var_trtdurd()` lists the **default** arguments.

```
adsl <- adsl %>%
  derive_var_trtdurd()
```
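
For intuition, the calculation can be sketched in base R. This is not the implementation of `derive_var_trtdurd()`; it assumes the common convention that the duration in days is the end date minus the start date plus one, and it uses made-up dates.

```r
# Hypothetical treatment start and end dates for two subjects
trtsdt <- as.Date(c("2014-01-02", "2014-03-10"))
trtedt <- as.Date(c("2014-01-15", NA))

# Assumed convention: duration in days = end - start + 1
# A missing end date yields a missing duration
trtdurd <- as.numeric(trtedt - trtsdt) + 1
trtdurd
```

Under this convention a subject treated on a single day has a duration of one day rather than zero.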

Amazing! With one dplyr function and four admiral functions we successfully created nine new variables for our `ADSL` dataset. Let’s take a look at all our newly derived variables.

**Note:** We only display variables that were derived. A user running this code will have additional adsl variables displayed. You can use the **Choose Columns to Display** button to add more variables into the table.

## Derivations

The most important functions in admiral are the derivations. These functions start with `derive_`. The first parameter of these functions expects the input dataset. This allows us to string together derivations using the `%>%` operator.

Functions which derive a dedicated variable start with `derive_var_` followed by the variable name, e.g., `derive_var_trtdurd()` derives the `TRTDURD` variable.

Functions which can derive multiple variables start with `derive_vars_` followed by the variable name, e.g., `derive_vars_dtm()` can derive both the `TRTSDTM` and `TRTSTMF` variables.

Functions which derive a dedicated parameter start with `derive_param_` followed by the parameter name, e.g., `derive_param_bmi()` derives the `BMI` parameter.

## Input and Output

It is expected that the input dataset is not grouped. Otherwise an error is issued.

The input dataset should not include variables starting with `temp_`. These variable names are reserved for temporary variables used within the derivation and are removed from the output dataset. If the input dataset contains such variables, an error is issued.

It is expected that all variable names in the input dataset are uppercase. New variables are returned in uppercase as well.

The output dataset is ungrouped. The observations are not ordered in a dedicated way. In particular, the order of the observations of the input dataset may not be preserved.
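
The expectations above can be checked, and a dataset brought into the expected shape, with a few dplyr calls before a derivation is run. This is only a sketch with a made-up `advs` data frame and a hypothetical `temp_flag` column; it does not reproduce admiral's own input checks.

```r
library(dplyr)

# Toy input that violates two expectations: it is grouped and
# contains a variable starting with "temp_" (hypothetical data)
advs <- tibble(
  USUBJID = c("01", "01", "02"),
  PARAMCD = c("TEMP", "TEMP", "TEMP"),
  AVAL = c(36.6, 37.1, 36.9),
  temp_flag = c(TRUE, FALSE, TRUE)
) %>%
  group_by(USUBJID)

# Inspect the input
is_grouped_df(advs)
any(startsWith(names(advs), "temp_"))

# Ungroup and drop the reserved temporary variables
advs_clean <- advs %>%
  ungroup() %>%
  select(-starts_with("temp_"))

# All remaining variable names are uppercase
all(names(advs_clean) == toupper(names(advs_clean)))
```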

## Computations

Computations expect vectors as input and return a vector. Usually these computation functions cannot be used with `%>%`. Instead, they can be used in expressions, like `convert_dtc_to_dt()` in the derivation of `FINLABDT` in the example below:

```
# Derive final lab visit date
ds_final_lab_visit <- ds %>%
  filter(DSDECOD == "FINAL LAB VISIT") %>%
  transmute(USUBJID, FINLABDT = convert_dtc_to_dt(DSSTDTC))

# Derive treatment variables
adsl <- dm %>%
  # Merge on final lab visit date
  derive_vars_merged(
    dataset_add = ds_final_lab_visit,
    by_vars = vars(USUBJID)
  )
```

## Parameters

For parameters which expect variable names or expressions of variable names, symbols or expressions must be specified rather than strings.

For parameters which expect a single variable name, the name is specified unquoted, as a bare symbol, e.g.

`new_var = TEMPBL`

For parameters which expect one or more variable names, a list of symbols is expected, e.g.

`by_vars = vars(PARAMCD, AVISIT)`

For parameters which expect a single expression, the expression needs to be passed “as is”, e.g.

`filter = PARAMCD == "TEMP"`

For parameters which expect one or more expressions, a list of expressions is expected, e.g.

`order = vars(AVISIT, desc(AESEV))`
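
To see what such a list contains, the sketch below builds the two `vars()` calls shown above and prints the captured expressions as text. It assumes that `vars()` is the function exported by dplyr and uses `rlang::as_label()` purely for display. None of the variables need to exist at this point, because the expressions are captured and not evaluated.

```r
library(dplyr)

# Capture variable names and expressions without evaluating them
by_vars <- vars(PARAMCD, AVISIT)
order_vars <- vars(AVISIT, desc(AESEV))

# Show each captured expression as a string
by_labels <- unname(vapply(by_vars, rlang::as_label, character(1)))
order_labels <- unname(vapply(order_vars, rlang::as_label, character(1)))

by_labels
order_labels
```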

## Handling of Missing Values

When using the haven package to read SAS datasets into R, SAS-style character missing values, i.e. `""`, are *not* converted into proper R `NA` values. Rather they are kept as is. This is problematic for any downstream data processing as R handles `""` just as any other string. Thus, before any data manipulation is performed, SAS blanks should be converted to R `NA`s using admiral’s `convert_blanks_to_na()` function, e.g.

```
dm <- haven::read_sas("dm.sas7bdat") %>%
  convert_blanks_to_na()
```

Note that a comparison operator such as `==` or `!=` applied to an `NA` value returns `NA` rather than `TRUE` or `FALSE`.

```
visits <- c("Baseline", NA, "Screening", "Week 1 Day 7")
visits != "Baseline"
#> [1] FALSE    NA  TRUE  TRUE
```

To test for missing values, use `is.na()`, which returns `TRUE` if the input is `NA` and `FALSE` otherwise.

```
is.na(visits)
#> [1] FALSE  TRUE FALSE FALSE
```

Thus, to filter all visits which are not `"Baseline"` the following condition would need to be used.

```
visits != "Baseline" | is.na(visits)
#> [1] FALSE  TRUE  TRUE  TRUE
```

Also note that most aggregation functions, like `mean()` or `max()`, also return `NA` if any element of the input vector is missing.

To avoid this behavior one has to explicitly set `na.rm = TRUE`.

This is very important to keep in mind when using admiral’s aggregation functions such as `derive_summary_records()`.
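
The effect of `na.rm` can be seen with a small base R example. The `aval` vector below is made up for illustration.

```r
# Hypothetical analysis values with one missing element
aval <- c(36.6, NA, 37.1)

# Without na.rm the missing value propagates to the result
mean_with_na <- mean(aval)
max_with_na <- max(aval)

# With na.rm = TRUE the missing value is dropped before aggregating
mean_no_na <- mean(aval, na.rm = TRUE)
max_no_na <- max(aval, na.rm = TRUE)
```

Here `mean_with_na` and `max_with_na` are both `NA`, while `mean_no_na` is the mean of the two non-missing values and `max_no_na` is `37.1`.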

## Validation

All functions are reviewed and tested to ensure that they work as described in the documentation. They are **not validated** yet.

Although admiral follows CDISC standards, it does not claim that the dataset resulting from calling admiral functions is ADaM compliant. This has to be ensured by the user.

## Starting a Script

For the ADaM data structures, an overview of the flow and example function calls for the most common steps are provided by the following vignettes:

admiral also provides template R scripts as a starting point. They can be created by calling `use_ad_template()`, e.g.,

```
use_ad_template(
  adam_name = "adsl",
  save_path = "./ad_adsl.R"
)
```

A list of all available templates can be obtained by `list_all_templates()`:

```
list_all_templates()
#> Existing ADaM templates in package 'admiral':
#> • ADAE
#> • ADCM
#> • ADEG
#> • ADEX
#> • ADLB
#> • ADMH
#> • ADPP
#> • ADSL
#> • ADVS
```

## Support

Support is provided via the admiral Slack channel.