Dates, times and imputation can be a frustrating facet of any programming language. Here’s how {admiral} makes all of this easy!
ADaM
Technical
Author
Edoardo Mancini
Published
September 26, 2023
Introduction
Dates and times are collected in SDTM as character values using the extended ISO 8601 format yyyy-mm-ddThh:mm:ss. This universal format allows parts of the date or time to be missing, e.g. the string "2019-10" represents a date where the day and the time are unknown. In contrast, ADaM timing variables like ADTM (Analysis Datetime) or ADY (Analysis Relative Day) are numeric variables, which can be derived only if the date or datetime is complete.
Most ADaM programmers have, at one point or another, encountered situations where missing dates, unexpected formats or confusing imputation functions rendered derivations of timing variables frustrating and time-consuming. {admiral} aims to mitigate this (where possible!) by providing functions which automatically derive date/datetime variables for you, and fill in missing date or time parts according to well-defined imputation rules.
In this article, we first examine the arsenal of functions provided by {admiral} to aid in datetime imputation and timing variable derivation. We then observe everything in action through a number of selected typical examples.
Date/Datetime Derivation and Imputation Functions
{admiral} provides the following functions for date/datetime imputation:
Derivations for adding variables
derive_vars_dt(): Adds a date variable and a date imputation flag variable (optional) based on a --DTC variable and imputation rules.
derive_vars_dtm(): Adds a datetime variable, a date imputation flag variable, and a time imputation flag variable (both optional) based on a --DTC variable and imputation rules.
Computation functions
impute_dtc_dtm(): Returns a complete ISO 8601 datetime or NA based on a partial ISO 8601 datetime and imputation rules.
impute_dtc_dt(): Returns a complete ISO 8601 date (without time) or NA based on a partial ISO 8601 date(time) and imputation rules.
convert_dtc_to_dt(): Returns a date if the input ISO 8601 date is complete. Otherwise, NA is returned.
convert_dtc_to_dtm(): Returns a datetime if the input ISO 8601 date is complete (with missing time replaced by "00:00:00" as default). Otherwise, NA is returned.
From the point of view of a typical ADaM programmer, the impute_*() and convert_*() functions above, together with the compute_*() functions such as compute_dtf(), can be viewed as utilities for treating dates and/or imputation within any custom code. In contrast, the derive_*() functions find their use in directly deriving new timing variables and/or carrying out imputation at an ADaM dataset scale.
For a detailed look at the imputation rules applied by these {admiral} functions, please visit this vignette on the documentation website.
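As a rough illustration of how the computation functions listed above behave on a single string, consider the sketch below. The input values are made up for this example, and the calls assume the argument names documented for recent {admiral} releases (highest_imputation, date_imputation, time_imputation).

```r
library(admiral)

# Illustrative partial value: only year and month were collected
partial <- "2019-10"

# Without imputation, a partial date cannot be converted and yields NA
not_converted <- convert_dtc_to_dt(partial)
not_converted

# Impute the missing day and time; the result is still a character string
completed <- impute_dtc_dtm(
  partial,
  highest_imputation = "M",
  date_imputation = "first",
  time_imputation = "first"
)
completed

# A complete date converts directly; the missing time defaults to 00:00:00
convert_dtc_to_dtm("2019-10-09")
```

Note that the impute_*() functions return character values, whereas the convert_*() functions return date or datetime objects.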
Simple Examples with Vectors
In the examples below, one can observe how some members of the class of utilities impute_*() and convert_*() can be employed to do the date-related heavy lifting.
Imputing a Partial Date Portion
It is easy to impute partial dates to the first day/month just by using the highest_imputation argument:
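A minimal sketch of this, assuming an illustrative vector `dates` that mixes complete, partial and missing ISO 8601 strings (the values are invented for the example, not taken from a study):

```r
library(admiral)

# Illustrative --DTC values: complete, partial and missing dates
dates <- c(
  "2019-07-18T15:25:40",
  "2019-07-18",
  "2019-02",
  "2019",
  ""
)

# No imputation (default highest_imputation = "n"): partial dates become NA
impute_dtc_dt(dtc = dates)

# Impute up to the month: a missing day/month is set to the first day/month
imputed_first <- impute_dtc_dt(
  dtc = dates,
  highest_imputation = "M"
)
imputed_first
```

With highest_imputation = "M", a missing month and day can both be filled in, while the empty string still results in NA because the year is never imputed at this level.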
A simple modification using date_imputation = "mid" or date_imputation = "last" enables the imputation to be made using the middle or last day/month instead:
```r
# Impute to last day/month if date is partial
impute_dtc_dt(
  dtc = dates,
  highest_imputation = "M",
  date_imputation = "last"
)
```
When it comes to carrying out an imputation, the twin task is to flag the type of imputation that was executed. Here, functions like compute_dtf() make this straightforward. For this function, all that needs to be done is to pass the collected character date to the dtc argument, and the resulting imputed date to the dt argument. This will then return the right date imputation flag - see the examples below for some possible behaviors:
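The following sketch shows a few such calls. The date values are illustrative only; the flag reports the highest-level date part that had to be imputed.

```r
library(admiral)

# Day was missing in the collected value, so the day was imputed
flag_day <- compute_dtf(dtc = "2019-07", dt = as.Date("2019-07-18"))
flag_day

# Month and day were missing; the flag reports the month
flag_month <- compute_dtf(dtc = "2019", dt = as.Date("2019-07-18"))
flag_month

# Complete collected date: nothing was imputed, so the flag is NA
flag_none <- compute_dtf(dtc = "2019-07-18", dt = as.Date("2019-07-18"))
flag_none
```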
The derive_*() functions are essentially wrappers around the aforementioned impute_*() and compute_*() functions. In the following section, we explore examples where ADaM variables can be derived using this class of functions.
Creating an Imputed Datetime and Date Variable and Imputation Flag Variables
As described previously, derive_vars_dtm() derives an imputed datetime variable and the corresponding date and time imputation flags. The imputed date variable can then be derived by using derive_vars_dtm_to_dt(). It is neither necessary nor advisable to perform the imputation again for the date variable if it was already done for the datetime variable. CDISC considers the datetime and the date variable as two representations of the same date. Thus the imputation must be the same, and the imputation flags are valid for both the datetime and the date variable.
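A sketch of this two-step pattern is given below, using a small made-up AE dataset. The AST prefix and the chosen imputation rules are assumptions for the example rather than a recommendation.

```r
library(admiral)
library(tibble)
library(dplyr, warn.conflicts = FALSE)

# Illustrative adverse event start dates (not real data)
ae <- tribble(
  ~AESTDTC,
  "2019-08-09T12:34:56",
  "2019-04-12",
  "2010-09",
  NA_character_
) %>%
  # Derive ASTDTM together with the flags ASTDTF and ASTTMF
  derive_vars_dtm(
    dtc = AESTDTC,
    new_vars_prefix = "AST",
    highest_imputation = "M",
    date_imputation = "first",
    time_imputation = "first"
  ) %>%
  # Derive ASTDT from the already imputed datetime, without a second imputation
  derive_vars_dtm_to_dt(exprs(ASTDTM))

ae
```

Because ASTDT is taken from ASTDTM, the flags ASTDTF and ASTTMF describe both variables consistently.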
If the time should be imputed but not the date, the highest_imputation argument should be set to "h". This results in NA if the date is partial. As no date is imputed, the date imputation flag is not created.
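For instance, the sketch below reuses the same kind of made-up AE data and restricts imputation to the time part. The data values and the AST prefix are illustrative assumptions.

```r
library(admiral)
library(tibble)
library(dplyr, warn.conflicts = FALSE)

# Illustrative adverse event start dates (not real data)
ae <- tribble(
  ~AESTDTC,
  "2019-08-09T12:34:56",
  "2019-04-12",
  "2010-09",
  NA_character_
) %>%
  # Only the time may be imputed; partial dates such as "2010-09" give NA
  derive_vars_dtm(
    dtc = AESTDTC,
    new_vars_prefix = "AST",
    highest_imputation = "h",
    time_imputation = "first"
  )

ae
```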
Usually an adverse event start date is imputed as the earliest date of all possible dates when filling the missing parts. The result may be a date before the treatment start date. This is not desirable because the adverse event would then not be considered treatment emergent and would be excluded from the adverse event summaries. This can be avoided by specifying the treatment start date variable (TRTSDTM) for the min_dates argument.
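A sketch of the min_dates argument in use, with invented AESTDTC and TRTSDTM values chosen so that the partial dates fall before, in, and after the treatment start month:

```r
library(admiral)
library(tibble)
library(lubridate)
library(dplyr, warn.conflicts = FALSE)

# Illustrative data: every record shares the same treatment start datetime
ae <- tribble(
  ~AESTDTC,              ~TRTSDTM,
  "2019-08-09T12:34:56", ymd_hms("2019-11-11T12:34:56"),
  "2019-10",             ymd_hms("2019-11-11T12:34:56"),
  "2019-11",             ymd_hms("2019-11-11T12:34:56"),
  "2019-12-04",          ymd_hms("2019-11-11T12:34:56")
) %>%
  derive_vars_dtm(
    dtc = AESTDTC,
    new_vars_prefix = "AST",
    highest_imputation = "M",
    date_imputation = "first",
    time_imputation = "first",
    # Imputed values should not fall before treatment start
    min_dates = exprs(TRTSDTM)
  )

ae
```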
Importantly, TRTSDTM is used as the imputed date only if the non-missing date and time parts of AESTDTC coincide with those of TRTSDTM. Therefore 2019-10 is not imputed as 2019-11-11 12:34:56. This ensures that collected information is not changed by the imputation.
If a date is imputed as the latest date of all possible dates when filling the missing parts, it should not result in dates after data cut off or death. This can be achieved by specifying the dates for the max_dates argument.
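The sketch below illustrates max_dates with made-up end dates, a death date (DTHDT) and a data cut-off date (DCUTDT). The variable names follow common ADaM conventions but are assumptions for this example.

```r
library(admiral)
library(tibble)
library(lubridate)
library(dplyr, warn.conflicts = FALSE)

# Illustrative data: death date is only known for the first two records
ae <- tribble(
  ~AEENDTC,              ~DTHDT,            ~DCUTDT,
  "2019-08-09T12:34:56", ymd("2019-11-11"), ymd("2019-12-02"),
  "2019-11",             ymd("2019-11-11"), ymd("2019-12-02"),
  "2019-12",             NA,                ymd("2019-12-02"),
  "2019-12-04",          NA,                ymd("2019-12-02")
) %>%
  derive_vars_dtm(
    dtc = AEENDTC,
    new_vars_prefix = "AEN",
    highest_imputation = "M",
    date_imputation = "last",
    time_imputation = "last",
    # Imputed values should not fall after death or data cut-off
    max_dates = exprs(DTHDT, DCUTDT)
  )

ae
```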
Importantly, non-missing date parts are not changed. Thus 2019-12-04 is imputed as 2019-12-04 23:59:59 although it is after the data cut-off date. It may make sense to replace it with the data cut-off date, but this is not part of the imputation. It should be done in a separate data cleaning or data cut-off step.
If imputation is required without creating a new variable, the convert_dtc_to_dt() function can be called to obtain a vector of imputed dates. It can be used, for example, inside a filter condition.
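As a sketch, the made-up medical history records below are restricted to those starting before treatment, with the imputation happening only inside the filter() call:

```r
library(admiral)
library(tibble)
library(lubridate)
library(dplyr, warn.conflicts = FALSE)

# Illustrative medical history start dates and treatment start date
mh <- tribble(
  ~MHSTDTC,     ~TRTSDT,
  "2019-04",    ymd("2019-04-15"),
  "2019-04-01", ymd("2019-04-15"),
  "2019-05",    ymd("2019-04-15"),
  "2019-06-21", ymd("2019-04-15")
) %>%
  # Keep records whose (imputed) start date is before treatment start;
  # no imputed variable is added to the dataset
  filter(
    convert_dtc_to_dt(
      MHSTDTC,
      highest_imputation = "M",
      date_imputation = "first"
    ) < TRTSDT
  )

mh
```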
Using More Than One Imputation Rule for a Variable
Using different imputation rules depending on the observation can be done by using the higher-order function slice_derivation(), which applies a derivation function differently (by varying its arguments) in different subsections of a dataset. For example, consider this Vital Signs case where pre-dose records require a different treatment to other records:
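A sketch of such a case with invented vital signs data: the time is imputed as the first possible time for pre-dose records and as the last possible time for all others. The prefix "A" and the slice conditions are assumptions made for the example.

```r
library(admiral)
library(tibble)
library(dplyr, warn.conflicts = FALSE)

# Illustrative vital signs records; only one is flagged as pre-dose
vs <- tribble(
  ~VSDTC,                ~VSTPT,
  "2019-08-09T12:34:56", NA,
  "2019-10-12",          "PRE-DOSE",
  "2019-11-10",          NA,
  "2019-12-04",          NA
) %>%
  slice_derivation(
    derivation = derive_vars_dtm,
    # Arguments shared by all slices
    args = params(
      dtc = VSDTC,
      new_vars_prefix = "A"
    ),
    # Pre-dose records: impute a missing time as 00:00:00
    derivation_slice(
      filter = VSTPT == "PRE-DOSE",
      args = params(time_imputation = "first")
    ),
    # All remaining records: impute a missing time as 23:59:59
    derivation_slice(
      filter = TRUE,
      args = params(time_imputation = "last")
    )
  )

vs
```

Each observation is handled by the first slice whose filter condition it meets, so the catch-all slice with filter = TRUE is placed last.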
Deriving timing variables and carrying out imputations is tricky at the best of times, but hopefully this blog post can shed some light on how to make this all easier using the {admiral} package! As {admiral} developers we are always interested in knowing how users are employing the package for their ADaM needs, so if you have any comments or feedback related to this topic, don’t be afraid to leave a comment on our Slack channel or on the GitHub repository, either as an issue or as a discussion.
For an even more detailed treatment of this topic, users are once again invited to read the corresponding vignette on the documentation website, from which this article was adapted.