Introduction
This article describes how to create an Anti-Drug Antibody (ADA) ADaM
dataset (ADAB). The first part of the article covers data
staging, the second part covers how the parameters are
computed, and the third part covers assembling the final dataset into
by-visit and parameter-based records, then adding final flags and ADSL
variables and computing ASEQ.
Note: CDISC does not currently have a model or implementation guide
for ADA (ADAB). This adab.R template is
currently considered experimental.
The example presented uses underlying EX and
IS domains, where the EX and IS
domains represent data as collected and the ADAB ADaM is the
output. For the purposes of the examples, the surrogate data for ADA is
called IS_ADA. The IS domain is intended for
multiple classes of findings, so a dedicated IS_ADA was
added to pharmaversesdtm for ADA data.
This template and the sample IS dataset assume that
IS contains primary Anti-Drug Antibody results
(e.g. ISTESTCD = "ADA_BAB"), with Neutralizing Antibody
records (e.g. ISTESTCD = "ADA_NAB") present when
ADA_BAB has a positive finding.
An important aspect of this dataset is the separation of original
by-visit data, added by-visit interpreted results and derived summary
parameters. An analogy to the ADAB structure would be
ADEX, as it has a mix of original visit-based records and
overall/summary records. These will be illustrated later in this
vignette.
Here are the relative time variables we will use for by visit records. These correspond to the names in the CDISC Implementation Guide:
| Variable | Variable Label |
|---|---|
| NFRLT | Nom. Rel. Time from Analyte First Dose |
| AFRLT | Act. Rel. Time from Analyte First Dose |
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
Programming Workflow
- Read in Source Data
- Prepare EX, Find First Dose and Derive AFRLT
- Assign Periods, Phases and BASETYPE
- Derive Assigned Variables
- Derive Analysis Variables
- Assemble into Parameter Based Dataset
- Add ADSL variables
- Add Labels and Attributes
Read in Source Data
To start, all datasets needed for the creation of ADAB
should be read into the environment. Datasets needed will be
IS, EX, and ADSL.
Additional domains such as VS and LB may be
used for additional baseline variables if needed. These may come from
either the SDTM or ADaM source.
For the purposes of this example, the CDISC Pilot SDTM and ADaM datasets, which are included in pharmaversesdtm, are used.
library(admiral)
library(dplyr)
library(lubridate)
library(stringr)
library(pharmaversesdtm) # Contains example datasets from the CDISC pilot project or simulated
# ---- Load source datasets ----
# Use e.g. haven::read_sas to read in .sas7bdat, or other suitable functions
# as needed and assign to the variables below.
# For illustration purposes read in admiral test data
# Load IS, EX and ADSL from pharmaversesdtm and admiral
is <- pharmaversesdtm::is_ada
ex <- pharmaversesdtm::ex
adsl <- admiral::admiral_adsl
# When SAS datasets are imported into R using haven::read_sas(), missing
# character values from SAS appear as "" characters in R, instead of appearing
# as NA values. Further details can be obtained via the following link:
# https://pharmaverse.github.io/admiral/cran-release/articles/admiral.html#handling-of-missing-values # nolint
ex <- convert_blanks_to_na(ex)
is <- convert_blanks_to_na(is)
adsl <- convert_blanks_to_na(adsl)
# Define values for records with overall values
# Suggested are AVISIT = "OVERALL", AVISITN = 11111
overall_avisit <- "OVERALL"
overall_avisitn <- 11111
At this step, we first account for whether the source IS data
follows SDTM v2.0 standards or comes from pre-SDTM v1.8. Version
2.0 replaced the values of ISTESTCD and ISTEST with general
ADA categories. For example, values stored (in pre-version 1.8) in
ISTESTCD or ISTEST might now be found in
IS.ISBDAGNT. This template provides a mechanism to process
either version; see the comments in the code below for usage.
If your IS dataset has different values in
ISTESTCD and ISTEST than the example (or has
additional tests suitable for the ADAB domain), adjust this
step accordingly.
ADAPARM and ADATYPE are assigned so that
consistent merge variables are available as values are calculated, even when
IS data from before version 2.0 is used.
IS.ISTESTCD and IS.ISBDAGNT from version 2.0
could be substituted if desired.
DRUG (similar to ADPC.DRUG) is an assigned
variable so that IS analytes can be merged with the matching
EX.EXTRT value(s) for the purpose of computing the reference dose
and the time from first dose.
Common variables such as AVISIT, AVISITN and
FRLTU are assigned.
Compute NFRLT using derive_var_nfrlt() to
add the nominal time from first dose. A common source is
IS.VISITDY. IS.VISIT information can also be
parsed for timing values or to assign values for special visits
(e.g. “Treatment Discontinuation”, “Unscheduled”). The code below has
examples that assign special visits from IS.VISIT.
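The post-processing of the nominal time can be sketched in base R. This is only an illustration, not the template's code: the helper name `to_nominal_days()` and the input `nfrlt_hours` are hypothetical, the input is assumed to be in hours (as implied by the division by 24 in the template), and the placeholder codes 99997 and 99999 are the ones used in the template code.

```r
# Hypothetical helper: convert nominal hours to days and assign
# placeholder codes to special visits that have no nominal time.
to_nominal_days <- function(nfrlt_hours, visit) {
  visit_up <- toupper(visit)
  out <- nfrlt_hours / 24
  no_time <- is.na(nfrlt_hours)
  # Assigned in this order so "TREATMENT DISC" wins if a visit matches both
  out[no_time & grepl("UNSCHED", visit_up, fixed = TRUE)] <- 99999
  out[no_time & grepl("TREATMENT DISC", visit_up, fixed = TRUE)] <- 99997
  out
}

visits <- c("BASELINE", "WEEK 4", "Unscheduled 1.1", "Treatment Discontinuation")
hours <- c(0, 672, NA, NA)
nfrlt_days <- to_nominal_days(hours, visits)
print(data.frame(VISIT = visits, NFRLT = nfrlt_days))
```

Large codes such as these keep special visits sortable after all scheduled visits; any other convention works as long as later filters use the same values.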
We use derive_vars_merged() to merge in
ADSL.TRTSDT, from which ADY is computed. ADY
is assumed to be based on ADSL.TRTSDT, while
ADAB.AFRLT is based on the first dose of the matching EX.EXTRT
for the corresponding IS analytes.
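As a reminder of how a relative day such as ADY behaves, here is a minimal base R sketch of the usual "no day 0" convention. The function name `study_day()` and the dates are made up for illustration; the template itself relies on derive_vars_dy() for this step.

```r
# Relative day with no day 0: on or after the reference date add 1,
# before the reference date use the plain difference.
study_day <- function(date, ref_date) {
  diff_days <- as.numeric(as.Date(date) - as.Date(ref_date))
  ifelse(diff_days >= 0, diff_days + 1, diff_days)
}

trtsdt <- as.Date("2014-01-02") # illustrative treatment start date
adt <- as.Date(c("2013-12-30", "2014-01-02", "2014-01-16"))
ady <- study_day(adt, trtsdt)
print(ady)
```

AFRLT, in contrast, is a plain elapsed time from the first dose of the matching drug, so it can be 0 and can take fractional values.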
If IS.ISDTC is missing the time part, “00:00” is used
for the imputation. Individual companies and studies may have custom
imputation methods. If this is the case, the function
derive_vars_dtm() would be either replaced or
post-processed to get the desired imputed time part. For example, one
method might be to use the time part from EX.EXSTDTC
(matched by the IS and EX date parts) and then
adjust the IS.ISDTC time for the planned pre-dose sampling
time. Users would add custom code for this.
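One possible shape for such custom imputation is sketched below in base R. It is an assumption-laden illustration, not part of the template: the toy data, the helper columns (`ISDATE`, `ISTIME`, `ATIME`, `TIME_SOURCE`) and the fallback order are all hypothetical, and no pre-dose adjustment is applied.

```r
# Toy inputs (illustrative values only)
is_dtc <- data.frame(
  USUBJID = c("S1", "S1", "S2"),
  ISDTC = c("2014-01-02", "2014-01-16T08:30:00", "2014-02-01"),
  stringsAsFactors = FALSE
)
ex_dtc <- data.frame(
  USUBJID = c("S1", "S2"),
  EXSTDTC = c("2014-01-02T09:15:00", "2014-03-01T10:00:00"),
  stringsAsFactors = FALSE
)

# Split ISO 8601 strings into date and time parts
ex_dtc$EXDATE <- substr(ex_dtc$EXSTDTC, 1, 10)
ex_dtc$EXTIME <- substr(ex_dtc$EXSTDTC, 12, 19)
is_dtc$ISDATE <- substr(is_dtc$ISDTC, 1, 10)
is_dtc$ISTIME <- substr(is_dtc$ISDTC, 12, 19) # "" when no time was collected

# Look up the dosing time on the same subject and calendar date
key_is <- paste(is_dtc$USUBJID, is_dtc$ISDATE)
key_ex <- paste(ex_dtc$USUBJID, ex_dtc$EXDATE)
borrowed <- ex_dtc$EXTIME[match(key_is, key_ex)]

# Collected time first, then borrowed EX time, then the 00:00:00 default
missing_time <- is_dtc$ISTIME == ""
is_dtc$ATIME <- ifelse(
  !missing_time, is_dtc$ISTIME,
  ifelse(!is.na(borrowed), borrowed, "00:00:00")
)
is_dtc$TIME_SOURCE <- ifelse(
  !missing_time, "COLLECTED",
  ifelse(!is.na(borrowed), "EX", "DEFAULT")
)
is_dtc$ADTM <- as.POSIXct(
  paste(is_dtc$ISDATE, is_dtc$ATIME),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)
print(is_dtc[, c("USUBJID", "ISDTC", "ATIME", "TIME_SOURCE")])
```

Keeping a source column next to the imputed time makes it straightforward to populate an imputation flag afterwards.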
# Derivations ----
is_dates <- is %>%
# Filter as needed (e.g. exclude records where ISSTAT is "NOT DONE")
filter(!(ISSTAT %in% c("NOT DONE")) & toupper(ISBDAGNT) == "XANOMELINE") %>%
# Initial preparation and core variables
mutate(
# ADATYPE and ADAPARM are assigned values to serve BY analyte processing
# Uncomment if Setting ADATYPE based on SDTM V1.x ISTESTCD
# ADATYPE = case_when(
# toupper(ISTESTCD) == "ADATEST1" ~ "ADA_BAB",
# toupper(ISTESTCD) == "NABTEST1" ~ "ADA_NAB",
# TRUE ~ NA_character_
# ),
# Uncomment if Setting ADAPARM based on SDTM V1.x ISTESTCD
# ADAPARM = ISTESTCD,
# Setting ADATYPE based on SDTM V2.x ISTESTCD (assumed to have ADA_BAB, ADA_NAB)
# Remove or comment out if not >= SDTM V2.x
ADATYPE = ISTESTCD,
# Setting ADAPARM based on SDTM V2.x ISBDAGNT
# Remove or comment out if not >= SDTM V2.x
ADAPARM = ISBDAGNT,
# When SDTM V1.x, Setting ISBDAGNT from ISTESTCD to work with template
# ISBDAGNT = ISTESTCD,
# Map the analyte test to the corresponding DRUG based on EX.EXTRT
# This is especially critical when there are multiple analytes and EX.EXTRT instances
DRUG = case_when(
toupper(ADAPARM) == "XANOMELINE" ~ "XANOMELINE",
toupper(ADAPARM) == "OTHER_DRUG" ~ "OTHER_DRUG",
TRUE ~ NA_character_
),
# Set AVISIT and AVISITN based on VISIT and VISITNUM
AVISIT = VISIT,
AVISITN = VISITNUM,
# Assign FRLTU (ex: DAYS or HOURS)
FRLTU = "DAYS"
) %>%
# Assign nominal time to NFRLT
# Special visits can be set to NA then can add more code to assign custom values
# (i.e. UNSCHEDULED to 99999, etc.)
# Units for this ADAB sample will be Days, divide result by 24
derive_var_nfrlt(
new_var = NFRLT,
tpt_var = ISTPT,
visit_day = VISITDY,
treatment_duration = 0,
set_values_to_na = str_detect(toupper(VISIT), "UNSCHED") | str_detect(toupper(VISIT), "TREATMENT DISC")
) %>%
mutate(
NFRLT = case_when(
!is.na(NFRLT) ~ NFRLT / 24,
str_detect(toupper(VISIT), "TREATMENT DISC") ~ 99997,
str_detect(toupper(VISIT), "UNSCHED") ~ 99999,
TRUE ~ NA_real_
)
) %>%
# Join ADSL with is (need TRTSDT for ADY derivation)
derive_vars_merged(
dataset_add = adsl,
new_vars = exprs(TRTSDT),
by_vars = exprs(STUDYID, USUBJID)
) %>%
# Derive analysis date/time then compute ADY
# Impute missing time to 00:00:00 or as desired.
# Could replace this code with custom imputation code or function
derive_vars_dtm(
new_vars_prefix = "A",
highest_imputation = "s",
dtc = ISDTC,
ignore_seconds_flag = FALSE,
time_imputation = "00:00:00"
) %>%
# Derive dates and times from date/times
derive_vars_dtm_to_dt(exprs(ADTM)) %>%
derive_vars_dtm_to_tm(exprs(ADTM)) %>%
derive_vars_dy(reference_date = TRTSDT, source_vars = exprs(ADT))
Prepare EX, Find First Dose and Derive AFRLT
At this step, we stage the EX data. Filter the data as
desired to match the IS analyte(s). As with the
IS dataset staging, DRUG is also assigned as a
common merge variable. NFRLT is assigned to the
EX data from EX.VISITDY using
derive_var_nfrlt().
Assign ASTDTM using EX.EXSTDTC. As
previously described for IS.ISDTC, if
EX.EXSTDTC is missing the time part, “00:00” is used for
the imputation. Individual companies and studies may have custom
imputation methods. If this is the case, the function
derive_vars_dtm() would be either replaced or
post-processed to get the desired imputed time part. For example, one
method might be to use the time part from IS.ISDTC (matched
by the IS and EX date parts) and then adjust the
missing EX time for the planned pre-dose sampling
time. Users would add custom code for this. AENDTM from
EX.EXENDTC is also computed in case it is needed. Date part
variables are then created from both datetimes using
derive_vars_dtm_to_dt().
# ---- Get dosing information ----
ex_dates <- ex %>%
# Keep applicable desired records based on EXTRT and/or dose values (>=0, >0, etc.)
filter(
str_detect(toupper(EXTRT), "XANOMELINE") | str_detect(toupper(EXTRT), "PLACEBO"),
EXDOSE >= 0
) %>%
mutate(
# DRUG is a merge variable to map and merge ADA data with EX.EXTRT
# This will be used to merge first dose into IS working data.
# PLACEBO example is for if/when ADA was also collected on Placebo subjects
# or treatments are scrambled prior to database lock.
DRUG = case_when(
str_detect(toupper(EXTRT), "XANOMELINE") ~ "XANOMELINE",
str_detect(toupper(EXTRT), "PLACEBO") ~ "XANOMELINE",
str_detect(toupper(EXTRT), "OTHER_DRUG") ~ "OTHER_DRUG",
TRUE ~ NA_character_
)
) %>%
# Assign nominal time to NFRLT
# Units for this ADAB sample will be Days, divide result by 24
derive_var_nfrlt(
new_var = NFRLT,
visit_day = VISITDY,
treatment_duration = 0
) %>%
mutate(
NFRLT = if_else(!is.na(NFRLT), NFRLT / 24, NA_real_)
) %>%
# Add analysis datetime variables and set missing end date to start date
# Impute missing time to 00:00:00 or as desired.
derive_vars_dtm(
new_vars_prefix = "AST",
dtc = EXSTDTC,
time_imputation = "00:00:00"
) %>%
derive_vars_dtm(
new_vars_prefix = "AEN",
dtc = EXENDTC,
time_imputation = "00:00:00"
) %>%
# Set missing end dates to start date or as desired
mutate(
AENDTM = if_else(is.na(AENDTM), ASTDTM, AENDTM)
) %>%
# Derive dates from date/times
derive_vars_dtm_to_dt(exprs(ASTDTM)) %>%
derive_vars_dtm_to_dt(exprs(AENDTM))
At this step, we use derive_vars_merged() to join our
prepared EX data into the prepared IS data to
compute FANLDTM and AFRLT, and to keep
FANLTMF from ASTTMF in case the time part of the first dose
was imputed.
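The idea behind this merge can be sketched with base R alone. The toy data below are invented for illustration; only the variable names (DRUG, ASTDTM, FANLDTM, ADTM, AFRLT) follow the template.

```r
# Toy dosing and sampling data (illustrative values only)
ex_toy <- data.frame(
  USUBJID = c("S1", "S1", "S2"),
  DRUG = "XANOMELINE",
  ASTDTM = as.POSIXct(
    c("2014-01-02 09:00:00", "2014-01-16 09:00:00", "2014-02-01 10:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)
is_toy <- data.frame(
  USUBJID = c("S1", "S1", "S2"),
  DRUG = "XANOMELINE",
  ADTM = as.POSIXct(
    c("2014-01-02 08:00:00", "2014-01-30 09:00:00", "2014-02-02 22:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# First dose per subject and drug: sort, then keep the first record per key
ex_sorted <- ex_toy[order(ex_toy$USUBJID, ex_toy$DRUG, ex_toy$ASTDTM), ]
first_dose <- ex_sorted[!duplicated(ex_sorted[, c("USUBJID", "DRUG")]), ]
names(first_dose)[names(first_dose) == "ASTDTM"] <- "FANLDTM"

# Merge the first dose onto the samples and compute elapsed days
afrlt_toy <- merge(is_toy, first_dose, by = c("USUBJID", "DRUG"), all.x = TRUE)
afrlt_toy <- afrlt_toy[order(afrlt_toy$USUBJID, afrlt_toy$ADTM), ]
afrlt_toy$AFRLT <- as.numeric(
  difftime(afrlt_toy$ADTM, afrlt_toy$FANLDTM, units = "days")
)
print(afrlt_toy)
```

A pre-dose sample taken one hour before the first dose gets a small negative AFRLT, which is why baseline selection later also checks the sample date against the first dose date.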
# Note: This template computes only AFRLT; see the ADPC template if an EX dose expansion example is needed.
# Derive AFRLT in IS data
is_afrlt <- is_dates %>%
derive_vars_merged(
dataset_add = ex_dates,
filter_add = (EXDOSE >= 0 & !is.na(ASTDTM)),
new_vars = exprs(FANLDTM = ASTDTM, FANLTMF = ASTTMF),
order = exprs(ASTDTM, EXSEQ),
mode = "first",
by_vars = exprs(STUDYID, USUBJID, DRUG)
) %>%
derive_vars_dtm_to_dt(exprs(FANLDTM)) %>%
derive_vars_dtm_to_tm(exprs(FANLDTM)) %>%
derive_vars_duration(
new_var = AFRLT,
start_date = FANLDTM,
end_date = ADTM,
out_unit = "days",
floor_in = FALSE,
add_one = FALSE
)
Assign Periods, Phases and BASETYPE
At this step, compute or assign the applicable APERIOD,
APHASE and/or BASETYPE variables. Note: In the
ADAB dataset, BASETYPE could represent Phase
or Period, and the method for assigning the BASE variable
for the given Phase or Period could be appended to BASETYPE
(e.g. “DOUBLE_BLINDED LAST”). BASETYPE is carried through
the remaining steps of this template as a BY variable for computing
parameters. For example, if there is a main phase and a later
maintenance phase, you might want to compute all the ADA parameters by
BASETYPE.
# Compute or assign BASETYPE, APERIOD and APHASE ----------------------------------
# Add study specific code as applicable using ADEX or ADSL APxx / PHw variables
is_basetype <- is_afrlt %>%
mutate(
APERIOD = 1,
APERIODC = "Period 01",
APHASE = NA_character_,
APHASEN = NA_integer_,
BASETYPE = "DOUBLE_BLINDED"
)
Derive Assigned Variables
At this step, we assign MRT (Minimum Reportable Titer)
and an optional DTL (Drug Tolerance Level) onto the main ADA
analyte records (ADATYPE == "ADA_BAB"). Both are
sponsor/protocol specific assigned values. RESULTC is
the standard interpreted character result (POSITIVE or
NEGATIVE) based on values found in IS.ISSTRESC
(these values will be sponsor specific as applicable; starter values are
supplied to demonstrate). RESULTN is assigned
0/1 as the numeric version of
RESULTC. RESULTN and RESULTC are
precursors to AVAL and AVALC and serve as
interim variables to facilitate progression of the program and its
derivations.
When IS.ISSTRESC is “POSITIVE”, IS.ISSTRESN
is commonly populated with values > 0. However, if
ISSTRESC is “POSITIVE” with a missing
IS.ISSTRESN, this template provides a method to assign
AVAL using the assigned MRT value and documents the
operation by setting DTYPE = "MRT".
If your IS dataset has additional sponsor specific tests
kept and suitable for this template, adjust or expand variable
assignments in this step accordingly.
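The MRT substitution can be illustrated with a few rows in base R. This is a simplified sketch under stated assumptions: the toy results and the single `mrt` value are invented, and only two ISSTRESC values are mapped, whereas the template maps a longer, sponsor-specific list.

```r
# Toy results (illustrative values only)
titer <- data.frame(
  ISSTRESC = c("NEGATIVE", "POSITIVE", "2.80", "POSITIVE"),
  ISSTRESN = c(NA, NA, 2.8, 3.1),
  stringsAsFactors = FALSE
)
mrt <- 1.4 # assigned, sponsor/protocol specific value

# Interpreted result: explicit NEGATIVE/POSITIVE text, else any numeric result > 0
is_neg <- toupper(titer$ISSTRESC) == "NEGATIVE"
is_pos <- toupper(titer$ISSTRESC) == "POSITIVE" |
  (!is.na(titer$ISSTRESN) & titer$ISSTRESN > 0)
titer$RESULTC <- ifelse(is_neg, "NEGATIVE", ifelse(is_pos, "POSITIVE", NA_character_))
titer$RESULTN <- ifelse(titer$RESULTC == "POSITIVE", 1, 0)

# Positive with a numeric result keeps it; positive without one gets MRT
positive <- !is.na(titer$RESULTC) & titer$RESULTC == "POSITIVE"
use_mrt <- positive & is.na(titer$ISSTRESN)
titer$AVAL <- ifelse(positive, ifelse(use_mrt, mrt, titer$ISSTRESN), NA_real_)
titer$DTYPE <- ifelse(use_mrt, "MRT", NA_character_)
print(titer)
```

Setting DTYPE only on the substituted rows keeps collected and derived titer values distinguishable downstream.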
# Assign AVAL, AVALC, AVALU and DTYPE for each ISTESTCD and ISBDAGNT
is_aval <- is_basetype %>%
mutate(
MRT = case_when(
ADATYPE == "ADA_BAB" & ADAPARM == "XANOMELINE" ~ 1.4,
ADATYPE == "ADA_BAB" & ADAPARM == "OTHER_DRUG" ~ 9.9,
TRUE ~ NA_real_
),
DTL = case_when(
ADATYPE == "ADA_BAB" & ADAPARM == "XANOMELINE" ~ 999,
ADATYPE == "ADA_BAB" & ADAPARM == "OTHER_DRUG" ~ 888,
TRUE ~ NA_real_
),
RESULTC = case_when(
toupper(ISSTRESC) %in% c(
"NEGATIVE", "NEGATIVE SCREEN", "NEGATIVE IMMUNODEPLETION",
"NEGATIVE CONFIRMATION"
) ~ "NEGATIVE",
toupper(ISSTRESC) %in% c(
"NEGATIVE TITER", "<1.70", "< 1.70", "<1.30", "< 1.30", "<1.40",
"POSITIVE IMMUNODEPLETION", "POSITIVE CONFIRMATION", "POSITIVE"
) ~ "POSITIVE",
ISSTRESN > 0 ~ "POSITIVE",
TRUE ~ NA_character_
),
RESULTN = case_when(
toupper(RESULTC) == "POSITIVE" ~ 1,
toupper(RESULTC) == "NEGATIVE" ~ 0,
TRUE ~ NA_integer_
),
AVAL = case_when(
ADATYPE == "ADA_BAB" & toupper(RESULTC) == "POSITIVE" & !is.na(ISSTRESN) ~ ISSTRESN,
ADATYPE == "ADA_BAB" & toupper(RESULTC) == "POSITIVE" & is.na(ISSTRESN) & !is.na(MRT) ~ MRT,
TRUE ~ NA_real_
),
AVALC = case_when(
# NABSTAT gets ISSTRESC, Standard ADA is set to NA as AVAL are numeric original results
ADATYPE == "ADA_NAB" ~ ISSTRESC,
TRUE ~ NA_character_
),
AVALU = case_when(
ADATYPE == "ADA_NAB" ~ ISSTRESU,
ADATYPE == "ADA_BAB" & toupper(RESULTC) == "POSITIVE" & !is.na(ISSTRESN) ~ ISSTRESU,
ADATYPE == "ADA_BAB" & toupper(RESULTC) == "POSITIVE" & is.na(ISSTRESN) & !is.na(MRT)
~ "titer",
TRUE ~ NA_character_
),
DTYPE = case_when(
ADATYPE == "ADA_BAB" & toupper(RESULTC) == "POSITIVE" & is.na(ISSTRESN) & !is.na(MRT) ~ "MRT",
TRUE ~ NA_character_
)
)
Derive Analysis Variables
In this first section, we compute ABLFL,
BASE and CHG. The VALIDBASE and
VALIDPOST variables are assigned to accommodate any sponsor
restrictions; they can be used directly or used later to qualify a record as
baseline or post-baseline as analysis variables are computed. For
example, a missing value at baseline could be considered negative, while
non-missing post-baseline values might be required (or missing values
considered Negative if the sponsor defines them as such). This example allows
a missing baseline but requires post-baseline values to be non-missing.
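The baseline logic can be sketched for a single subject in base R. The data are invented, and the sketch ignores BASETYPE, ADATYPE and ADAPARM, which the template uses as BY variables.

```r
# One subject, one analyte (illustrative values only)
ada <- data.frame(
  USUBJID = "S1",
  ADT = as.Date(c("2013-12-20", "2014-01-02", "2014-01-30", "2014-02-27")),
  NFRLT = c(-13, 0, 28, 56),
  AVAL = c(NA, 1.4, 2.1, 2.6),
  stringsAsFactors = FALSE
)
fanldt <- as.Date("2014-01-02") # date of first dose

# Baseline candidates: nominal time <= 0 and sampled on or before first dose
cand <- which(ada$NFRLT <= 0 & ada$ADT <= fanldt)
cand <- cand[order(ada$ADT[cand], ada$NFRLT[cand])]
# Keep the last candidate, if there is one
baseline_row <- if (length(cand) > 0) cand[length(cand)] else integer(0)

ada$ABLFL <- NA_character_
ada$ABLFL[baseline_row] <- "Y"
ada$BASE <- if (length(baseline_row) == 1) ada$AVAL[baseline_row] else NA_real_
# Change from baseline is left missing on the baseline record itself
ada$CHG <- ifelse(is.na(ada$ABLFL), ada$AVAL - ada$BASE, NA_real_)
print(ada)
```

Because the flag is assigned by timing rather than by value, a baseline record with a missing result would still be flagged; BASE and CHG would then simply be missing.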
# Begin computation of parameters -----------------------------------------
# Identify Best Baseline for each analyte and parameter type
# Baseline is NFRLT <= 0 or Unscheduled AND the ADA Date is on or before the date of first dose.
is_baseline <- is_aval %>%
# Calculate ABLFL. If more than one record for the 'order' and 'filter' will throw a duplicate
# record warning, user can decide how to adjust the mode, order, filter or adjust in prior step.
restrict_derivation(
derivation = derive_var_extreme_flag,
args = params(
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM),
order = exprs(ADT, NFRLT),
new_var = ABLFL,
mode = "last"
),
# Flag baseline based on time, not values, because ADA parameters can in some cases permit missing values.
# If special visits (i.e. > 60000) are eligible as baselines, include those
filter = ((NFRLT <= 0 | NFRLT > 60000) & (ADT <= FANLDT) & !is.na(BASETYPE))
) %>%
mutate(
# VALID flags for use later as applicable:
# VALIDBASE flags non-missing values on baseline (by each ADATYPE and ADAPARM)
# Note: VALIDBASE is not used, as this template allows a baseline to be valid as
# long as it is present (the value can be missing); adapt as needed.
# VALIDPOST flags non-missing values on post-baseline (by each ADATYPE and ADAPARM)
VALIDBASE = case_when(
ABLFL == "Y" & (!is.na(AVALC) | !is.na(RESULTC) | !is.na(AVAL)) ~ "Y",
ABLFL == "Y" & (is.na(AVALC) & is.na(RESULTC) & is.na(AVAL)) ~ "N",
TRUE ~ NA_character_
),
VALIDPOST = case_when(
ADTM > FANLDTM & is.na(ABLFL) & (!is.na(RESULTC) | !is.na(AVAL)) ~ "Y",
ADTM > FANLDTM & is.na(ABLFL) & (is.na(RESULTC) & is.na(AVAL)) ~ "N",
TRUE ~ NA_character_
)
)
# Compute BASE and CHG
is_aval_change <- is_baseline %>%
derive_var_base(
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM),
source_var = AVAL,
new_var = BASE,
filter = ABLFL == "Y"
) %>%
restrict_derivation(
derivation = derive_var_chg,
filter = is.na(ABLFL)
)
# Interpreted Result Baseline
is_result_change <- is_aval_change %>%
derive_var_base(
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM),
source_var = RESULTN,
new_var = BASE_RESULT,
filter = ABLFL == "Y"
)
# Get base only data for use later
base_data <- is_result_change %>%
filter(ABLFL == "Y") %>%
select(
STUDYID, USUBJID, DRUG, BASETYPE, ADATYPE, ADAPARM, BASE_RESULT, BASE,
ABLFL
)
This section computes main ADA analyte parameters into a horizontal dataset.
Key Subject By Visit analysis variables (sponsor or molecule specific or optional):
| Variable | Meaning |
|---|---|
| TFLAGV | By visit Treatment Emergent Category (see *T-Flags) |
| PBFLAGV | Post-Baseline Status by Visit |
| ADASTATV | Overall ADA Status by Visit (ADA Positive or Negative) |
Key Subject Overall analysis variables:
| Variable | Meaning |
|---|---|
| BFLAG | Baseline Positive/Negative Flag |
| TFLAG | Overall Treatment Emergent Category (see *T-Flags) |
| PBFLAG | Post-Baseline Status |
| ADASTAT | Overall Subject ADA Status (ADA Positive or Negative) |
*T-Flags: 0 = Negative, 1 = Induced, 2 = Enhanced, 3 = Treatment Unaffected
- Induced: Negative at Baseline to Positive Post-Baseline
- Enhanced: Positive at Baseline to Positive Post-Baseline where Change from Baseline >= 0.60
- Treatment Unaffected: Positive at Baseline to Positive Post-Baseline where Change from Baseline < 0.60
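The T-Flag categories can be expressed as a small base R function. The helper `t_flag()` is hypothetical and simplified. Following the template code, it treats a missing baseline as negative and a positive baseline followed by a negative post-baseline result as Treatment Unaffected.

```r
# Hypothetical helper mirroring the T-Flag categories.
# base_result / post_result: 1 = positive, 0 = negative, NA = missing
# chg: change from baseline in titer (may be NA)
t_flag <- function(base_result, post_result, chg, threshold = 0.60) {
  base_pos <- !is.na(base_result) & base_result == 1
  post_pos <- !is.na(post_result) & post_result == 1
  post_neg <- !is.na(post_result) & post_result == 0
  has_chg <- !is.na(chg)

  flag <- rep(NA_integer_, length(post_result))
  flag[!base_pos & post_neg] <- 0L # Negative
  flag[!base_pos & post_pos] <- 1L # Induced
  flag[base_pos & post_pos & has_chg & chg >= threshold] <- 2L # Enhanced
  flag[base_pos & post_pos & has_chg & chg < threshold] <- 3L # Treatment Unaffected
  flag[base_pos & post_neg] <- 3L # Treatment Unaffected (post-baseline negative)
  flag
}

base <- c(0, NA, 1, 1, 1, 0)
post <- c(0, 1, 1, 1, 0, NA)
chg <- c(NA, NA, 0.9, 0.3, NA, NA)
flags <- t_flag(base, post, chg)
print(flags)
```

Records with a missing post-baseline result stay unflagged, which corresponds to the VALIDPOST == "Y" requirement in the template.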
# Assign and save ADABLPFL for later use
adablpfl <- is_result_change %>%
filter(ABLFL == "Y") %>%
distinct(STUDYID, USUBJID, DRUG, BASETYPE,
ADATYPE, ADAPARM,
.keep_all = TRUE
) %>%
select(STUDYID, USUBJID, DRUG, BASETYPE, ADATYPE, ADAPARM, ABLFL) %>%
rename(ADABLPFL = ABLFL)
# Calculate the By Visit parameters
is_visit_flags <- is_result_change %>%
mutate(
TFLAGV = case_when(
VALIDPOST == "Y" & is.na(ABLFL) & ADATYPE == "ADA_BAB" & (BASE_RESULT == 1 & (CHG >= 0.6))
~ 2,
VALIDPOST == "Y" & is.na(ABLFL) & ADATYPE == "ADA_BAB" & BASE_RESULT == 1 &
(((CHG < 0.6) & !is.na(AVAL)) | (RESULTN == 0)) ~ 3,
VALIDPOST == "Y" & is.na(ABLFL) & ADATYPE == "ADA_BAB" & ((BASE_RESULT == 0 |
is.na(BASE_RESULT)) & RESULTN == 0) ~ 0,
VALIDPOST == "Y" & is.na(ABLFL) & ADATYPE == "ADA_BAB" & ((BASE_RESULT == 0 |
is.na(BASE_RESULT)) & RESULTN == 1) ~ 1,
TRUE ~ NA_integer_
),
PBFLAGV = case_when(
!is.na(TFLAGV) & TFLAGV %in% c(1, 2) ~ 1,
!is.na(TFLAGV) & TFLAGV %in% c(0, 3) ~ 0,
TRUE ~ NA_integer_
),
ADASTATV = case_when(
!is.na(PBFLAGV) & PBFLAGV == 1 ~ "ADA+",
!is.na(PBFLAGV) & PBFLAGV == 0 ~ "ADA-",
VALIDPOST == "Y" & is.na(ABLFL) & ADATYPE == "ADA_BAB" & is.na(PBFLAGV) ~ "MISSING",
TRUE ~ "MISSING"
),
)
# These next code segments create utility datasets that will
# then get merged back into the main dataset (is_visit_flags)
# Post baseline must be valid post data (result not missing)
post_data <- is_visit_flags %>%
filter(VALIDPOST == "Y") %>%
select(
STUDYID, USUBJID, DRUG, BASETYPE, ADATYPE, ADAPARM, RESULTN,
AVAL, ADTM, CHG
) %>%
rename(AVAL_P = AVAL) %>%
rename(RESULT_P = RESULTN)
# Use "post_data" to make a ADPBLPFL flag data set to merge back later in the program
# Note: "post_data" is if VALIDPOST="Y" (has a non missing post baseline result)
adpblpfl <- post_data %>%
distinct(STUDYID, USUBJID, DRUG, BASETYPE, ADATYPE,
ADAPARM,
.keep_all = TRUE
) %>%
select(STUDYID, USUBJID, DRUG, BASETYPE, ADATYPE, ADAPARM) %>%
mutate(
ADPBLPFL = "Y"
)
# Compute BFLAG, TFLAG, PBFLAG ----
most_post_result <- post_data %>%
group_by(STUDYID, USUBJID, DRUG, BASETYPE, ADATYPE, ADAPARM) %>%
summarize(RESULT_P = max(RESULT_P)) %>%
ungroup()
most_post_aval <- post_data %>%
filter(!is.na(AVAL_P)) %>%
group_by(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM) %>%
summarize(AVAL_P = max(AVAL_P)) %>%
ungroup()
most_post_chg <- post_data %>%
filter(!is.na(AVAL_P)) %>%
group_by(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM) %>%
summarize(MAXCHG = max(CHG)) %>%
ungroup()
# Merge the most_post_result, most_post_aval and most_post_chg into one utility data set
most_post <- most_post_result %>%
derive_vars_merged(
dataset_add = most_post_aval,
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
) %>%
derive_vars_merged(
dataset_add = most_post_chg,
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
)
# Use an outer join to combine baseline with the post-baseline maximums and create a utility dataset with flag data
flagdata_init <- full_join(base_data, most_post, by = c(
"DRUG", "STUDYID", "USUBJID", "BASETYPE",
"ADATYPE", "ADAPARM"
)) %>%
mutate(
BFLAG =
case_when(
BASE_RESULT == 0 ~ 0,
BASE_RESULT == 1 ~ 1,
TRUE ~ NA_integer_
),
TFLAG =
case_when(
(BFLAG == 0 | is.na(BFLAG)) & RESULT_P == 0 ~ 0,
(BFLAG == 0 | is.na(BFLAG)) & RESULT_P == 1 ~ 1,
BFLAG == 1 & (MAXCHG >= 0.6) ~ 2,
BFLAG == 1 & !is.na(MAXCHG) & (MAXCHG < 0.6) ~ 3,
BFLAG == 1 & is.na(MAXCHG) & RESULT_P == 0 ~ 3,
TRUE ~ NA_integer_
),
PBFLAG =
case_when(
ADATYPE == "ADA_BAB" & (TFLAG == 1 | TFLAG == 2) ~ 1,
ADATYPE == "ADA_BAB" & (TFLAG == 0 | TFLAG == 3) ~ 0,
ADATYPE == "ADA_NAB" & RESULT_P == 0 ~ 0,
ADATYPE == "ADA_NAB" & RESULT_P == 1 ~ 1,
TRUE ~ NA_integer_
),
ADASTAT = case_when(
ADATYPE == "ADA_BAB" & PBFLAG == 1 ~ 1,
ADATYPE == "ADA_BAB" & PBFLAG == 0 ~ 0,
TRUE ~ NA_integer_
)
)
In this section, we compute Neutralizing Antibody (Nab) Status (optional and if applicable). Nab is typically obtained from samples where the subject has a positive finding for its parent main analyte. As a reminder, ADA and Nab are drug specific. Drug ABC can have ADA and optional Nab tests; Drug XYZ can have the same pairs, and the two are processed separately.
The target variable for Nab is NABSTAT (subject Nab
Status for the matching parent analyte). In this template, for
NABSTAT to be "Positive" or "Negative", the subject first has to
be ADASTAT = "Positive" on the matching main parent
analyte. If ADASTAT is not "Positive", NABSTAT
is set to either NA or “MISSING” as desired.
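The dependency of NABSTAT on the parent analyte can be sketched as a small base R function. The helper `nab_status()` is hypothetical and simplified: it ignores the handling of missing post-baseline Nab samples (NABPOSTMISS in the template) and uses "MISSING" rather than NA for the non-evaluable case.

```r
# Hypothetical helper: Nab status is only assigned when the parent
# analyte is ADA positive.
# adastat_main / nab_result: 1 = positive, 0 = negative, NA = missing
nab_status <- function(adastat_main, nab_result) {
  ada_pos <- !is.na(adastat_main) & adastat_main == 1
  has_nab <- !is.na(nab_result)
  out <- rep("MISSING", length(adastat_main))
  out[ada_pos & has_nab & nab_result == 1] <- "Positive"
  out[ada_pos & has_nab & nab_result == 0] <- "Negative"
  out
}

nab <- nab_status(
  adastat_main = c(1, 1, 0, 1, NA),
  nab_result = c(1, 0, 1, NA, 0)
)
print(nab)
```

The third case shows the key rule: a positive Nab result does not produce a Nab status when the parent ADA status is not positive.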
# Get overall ADASTAT status for computing NABSTAT
flagdata_adastat <- flagdata_init %>%
derive_vars_merged(
dataset_add = flagdata_init,
filter_add = ADATYPE == "ADA_BAB",
new_vars = exprs(ADASTAT_MAIN = ADASTAT),
by_vars = exprs(STUDYID, USUBJID, DRUG)
)
# Compute NABPOSTMISS onto flag_data for NABSTAT
flagdata_nab <- flagdata_adastat %>%
derive_var_merged_exist_flag(
dataset_add = is_visit_flags,
new_var = NABPOSTMISS,
condition = is.na(ABLFL) & VALIDPOST == "N" & ADATYPE == "ADA_NAB",
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
)
# Compute NAB Stat using both methods
# Note: Option 2 added as a placekeeper starter method, adjust as needed for
# study specific specs.
flagdata_final <- flagdata_nab %>%
mutate(
# For Option 1, if any post baseline NAB were blank without a Positive (NABPOSTMISS = Y),
# NABSTAT = missing
nabstat_opt1 = case_when(
# Based on ADASTAT (EMERNEG/EMERPOS) and NAB results
ADATYPE == "ADA_NAB" & ADASTAT_MAIN == 1 & RESULT_P == 1 ~ 1,
ADATYPE == "ADA_NAB" & ADASTAT_MAIN == 1 & RESULT_P == 0 &
(is.na(NABPOSTMISS) | NABPOSTMISS == "N") ~ 0,
TRUE ~ NA_integer_
),
nabstat_opt2 = case_when(
# Based on ADASTAT (EMERNEG/EMERPOS) Only.
ADATYPE == "ADA_NAB" & ADASTAT_MAIN == 1 ~ 1,
ADATYPE == "ADA_NAB" & ADASTAT_MAIN == 0 ~ 0,
TRUE ~ NA_integer_
),
) %>%
# Drop variables no longer needed from flag_data before merging with main ADAB
select(-BASE, -BASE_RESULT, -AVAL_P, -RESULT_P, -DRUG, -ABLFL)
In this section, we bring together the variables from the prior sections, then
compute the remaining variables into a final horizontal dataset by
STUDYID, USUBJID, BASETYPE,
ADATYPE (Nab or ADA), and ADAPARM. The subject
level variable values are repeated on each
VISIT/ISDTC by BASETYPE,
ADATYPE, ADAPARM.
The key analysis variables computed in this section are:
| Variable | Meaning |
|---|---|
| FPPDT | Date of first ADA Positive |
| LPPDT | Date of Last ADA Positive |
| ADADUR | Duration of ADA (LPPDT - FPPDT) |
| TIMADA | Time to ADA Positive (from date of first dose of corresponding drug) |
| INDUCED | Negative at Baseline to Positive Post-Baseline |
| ENHANCED | Positive at Baseline to Positive Post-Baseline where CHG >= 0.60 |
| TRANADA | Transient ADA (either of): (1) INDUCED and Positive at any post-baseline (other than the last result) or (2) INDUCED and multiple post-baseline positive (other than the last result) and duration between the first and last positive is < 16 weeks |
| PERSADA | Persistent ADA (either of): (1) INDUCED and Positive at the last post-baseline result or (2) INDUCED and multiple post-baseline positive (other than the last result) and duration between the first and last positive is >= 16 weeks |
| TRANADAE | Any Treatment Emergent Transient ADA (either of): (1) INDUCED or ENHANCED and Positive at any post-baseline (other than the last result) or (2) INDUCED or ENHANCED and multiple post-baseline positive (other than the last result) and duration between the first and last positive result is < 16 weeks |
| PERSADAE | Any Treatment Emergent Persistent ADA (either of): (1) INDUCED or ENHANCED and Positive at the last post-baseline result or (2) INDUCED or ENHANCED and multiple post-baseline positive (other than the last result) and duration between the first and last positive is >= 16 weeks |
| ADABLPFL | Baseline ADA Evaluable - a baseline record exists (ABLFL = "Y") |
| ADPBLPFL | Post-Baseline ADA Evaluable - At least one valid (non-missing) post-baseline result |
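A simplified reading of the duration and persistence rules is sketched below in base R for one subject. The helper `summarize_ada()` is hypothetical. It works on dates only, uses the inclusive "+ 1 day" convention of the template's date-based branches, and collapses the transient/persistent definitions to "positive at the last post-baseline result, or at least two positives spanning 16 weeks or more".

```r
# Hypothetical helper for one subject's post-baseline records.
# adt: sample dates, positive: TRUE when the sample is treatment-emergent
# positive, fanldt: date of first dose of the corresponding drug
summarize_ada <- function(adt, positive, fanldt) {
  ord <- order(adt)
  adt <- adt[ord]
  positive <- positive[ord]
  if (!any(positive)) {
    return(NULL) # no positive post-baseline result, nothing to summarize
  }
  fppdt <- min(adt[positive])
  lppdt <- max(adt[positive])
  adadur <- (as.numeric(lppdt - fppdt) + 1) / 7 # weeks
  timada <- (as.numeric(fppdt - fanldt) + 1) / 7 # weeks
  last_positive <- positive[length(positive)]
  persistent <- last_positive || (sum(positive) >= 2 && adadur >= 16)
  list(
    FPPDT = fppdt, LPPDT = lppdt,
    ADADUR = adadur, TIMADA = timada,
    PERSADA = as.integer(persistent),
    TRANADA = as.integer(!persistent)
  )
}

res <- summarize_ada(
  adt = as.Date(c("2014-01-30", "2014-02-27", "2014-03-27", "2014-04-24")),
  positive = c(FALSE, TRUE, TRUE, FALSE),
  fanldt = as.Date("2014-01-02")
)
str(res)
```

In this toy case the subject has two positives four weeks apart and a negative last sample, so the result is classified as transient.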
# Put TFLAG, BFLAG and PBFLAG and the nabstat_ variables onto the main
# dataset (is_flagdata from is_visit_flags)
is_flagdata <- is_visit_flags %>%
# main_aab_flagdata
derive_vars_merged(
dataset_add = flagdata_final,
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
)
# Create a utility data set to compute PERSADA, TRANADA, INDUCED, ENHANCED related parameters
per_tran_pre <- is_flagdata %>%
filter(VALIDPOST == "Y") %>%
select(
STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM, ADTM, FANLDTM, FANLDT,
TFLAGV, ISSEQ
) %>%
group_by(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM) %>%
mutate(
MaxADTM = max(ADTM)
) %>%
ungroup() %>%
mutate(
LFLAGPOS = case_when(
ADTM == MaxADTM & (TFLAGV == 1) ~ 1,
ADTM == MaxADTM & (TFLAGV == 2) ~ 1,
TRUE ~ NA_integer_
)
)
# Keep TFLAGV = 1 or TFLAGV = 2 (Any Treatment Emergent)
# Regular Induced PERSADA and TRANADA will later be based on TFLAGV = 1
# TFLAGV = 2 can use for separate PERSADA/TRANADA based on Enhanced
per_tran_all <- per_tran_pre %>%
filter(TFLAGV == 1 | TFLAGV == 2) %>%
select(-MaxADTM, -ISSEQ) %>%
group_by(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM) %>%
mutate(
FPPDTM = min(ADTM),
LPPDTM = max(ADTM)
) %>%
ungroup()
per_tran_inc_last <- per_tran_all %>%
group_by(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM) %>%
summarize(COUNT_INC_LAST = n()) %>%
ungroup()
per_tran_exc_last <- per_tran_all %>%
filter(is.na(LFLAGPOS)) %>%
group_by(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM) %>%
summarize(COUNT_EXC_LAST = n()) %>%
ungroup()
# Reduce "per_tran_all" to one record per by_vars using ADTM = LPPDTM to get best last records
# Then Merge in the Include Last and Exclude Last Flags
per_tran_last <- per_tran_all %>%
filter(ADTM == LPPDTM) %>%
derive_vars_merged(
dataset_add = per_tran_inc_last,
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
) %>%
derive_vars_merged(
dataset_add = per_tran_exc_last,
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
)
# Compute final parameters
per_tran_final <- per_tran_last %>%
mutate(
FPPDT = as_date(FPPDTM),
LPPDT = as_date(LPPDTM),
ADADUR = case_when(
LPPDTM - FPPDTM == 0 & (!is.na(LPPDTM) & !is.na(FPPDTM)) ~ 1 / 7,
!is.na(LPPDTM) & !is.na(FPPDTM)
~ ((as.numeric(difftime(LPPDTM, FPPDTM, units = "secs")) / (60 * 60 * 24)) + 1) / 7,
!is.na(LPPDT) & !is.na(FPPDT)
~ (as.numeric(LPPDT - FPPDT) + 1) / 7,
TRUE ~ NA_real_
),
TIMADA = case_when(
!is.na(FPPDTM) & !is.na(FANLDTM) ~ as.numeric(difftime(FPPDTM, FANLDTM, units = "weeks")),
!is.na(FPPDT) & !is.na(FANLDT) ~ (as.numeric(FPPDT - FANLDT) + 1) / 7,
TRUE ~ NA_real_
),
tdur = case_when(
!is.na(LPPDTM) & !is.na(FPPDTM) ~
as.numeric(difftime(LPPDTM, FPPDTM, units = "secs")) / (7 * 3600 * 24),
!is.na(LPPDT) & !is.na(FPPDT) ~ as.numeric(LPPDT - FPPDT + 1) / 7,
TRUE ~ NA_real_
),
# Standard TRANADA and PERSADA based on TFLAGV = 1 (Induced)
TRANADA = case_when(
TFLAGV == 1 & ((COUNT_EXC_LAST == 1 | (COUNT_INC_LAST >= 2 & tdur < 16)) & is.na(LFLAGPOS)) ~ 1,
TRUE ~ NA_integer_
),
PERSADA = case_when(
TFLAGV == 1 & (COUNT_INC_LAST == 1 & (COUNT_EXC_LAST <= 0 | is.na(COUNT_EXC_LAST)) |
(COUNT_INC_LAST >= 2 & tdur >= 16) | LFLAGPOS == 1) ~ 1,
TRUE ~ NA_integer_
),
# These TRANADAE and PERSADAE based on TFLAGV = 1 or TFLAGV=2 (Induced or Enhanced)
TRANADAE = case_when(
TFLAGV >= 1 & ((COUNT_EXC_LAST == 1 | (COUNT_INC_LAST >= 2 & tdur < 16)) & is.na(LFLAGPOS)) ~ 1,
TRUE ~ NA_integer_
),
PERSADAE = case_when(
TFLAGV >= 1 & (COUNT_INC_LAST == 1 & (COUNT_EXC_LAST <= 0 | is.na(COUNT_EXC_LAST)) |
(COUNT_INC_LAST >= 2 & tdur >= 16) | LFLAGPOS == 1) ~ 1,
TRUE ~ NA_integer_
),
INDUCED = case_when(
TFLAGV == 1 ~ "Y",
TRUE ~ "N"
),
ENHANCED = case_when(
TFLAGV == 2 ~ "Y",
TRUE ~ "N"
)
) %>%
# Drop temporary variables that do not need to be merged into main ADAB
select(-ADTM, -TFLAGV, -FANLDTM, -FANLDT, -LFLAGPOS, -FPPDT, -LPPDT)
# Put PERSADA, TRANADA, INDUCED, ENHANCED, TDUR, ADADUR onto "is_flagdata" as "main_aab_pertran"
# Note: signal_duplicate_records() error usually occurs when a subject has duplicate
# records for a given BASETYPE, ADATYPE, ADAPARM and ISDTC. Investigate then add code
# to filter it down to best one record per USUBJID, BASETYPE, ADATYPE and ADAPARM
main_aab_pertran <- is_flagdata %>%
derive_vars_merged(
dataset_add = per_tran_final,
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
)
main_aab_rtimes <- main_aab_pertran %>%
mutate(
ATPT = ISTPT,
ADADUR = round(ADADUR, digits = 4),
TIMADA = round(TIMADA, digits = 4),
PERSADA = case_when(
is.na(PERSADA) ~ 0,
TRUE ~ PERSADA
),
PERSADAE = case_when(
is.na(PERSADAE) ~ 0,
TRUE ~ PERSADAE
),
TRANADA = case_when(
is.na(TRANADA) ~ 0,
TRUE ~ TRANADA
),
TRANADAE = case_when(
is.na(TRANADAE) ~ 0,
TRUE ~ TRANADAE
),
NABSTAT = nabstat_opt1
)
# Merge ADABLPFL and ADPBLPFL onto main dataset.
main_aab <- main_aab_rtimes %>%
derive_vars_merged(
dataset_add = adablpfl,
new_vars = exprs(ADABLPFL),
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
) %>%
derive_vars_merged(
dataset_add = adpblpfl,
new_vars = exprs(ADPBLPFL),
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM)
)
Assemble into Parameter Based Dataset
In this section, we turn the original IS records, the
duplicated interpreted records and each analysis variable into separate
prepared data frames. These will then be appended together into the
final ADAB product.
Certain computed variables are limited to by-visit records or to the main analyte. These are noted in comments where applicable.
If your IS dataset has additional sponsor specific tests
that were added into the preceding code segments, add additional
parameter data frames here to ultimately include in the final
ADAB dataset.
On the records where the parameter is OVERALL, we also keep:
PARCAT1, BASETYPE, PARAM,
ADATYPE, ADAPARM, ISTESTCD,
ISTEST, ISCAT, ISBDAGNT,
AVISITN, AVISIT and ISSPEC. Note:
ADATYPE, ADAPARM and other interim variables
computed earlier are not kept in the final ADAB dataset. If a parameter
has unique keeps or drops, they are noted below as comments.
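As a minimal sketch of the pattern used for the subject-level parameters in this section, the toy example below collapses repeated by visit rows to one row per subject with `distinct()` and then assigns `PARAMCD`, `PARAM`, `AVAL` and `AVALC`. The data frame `toy_core` and its values are hypothetical stand-ins for `core_aab`, and only a subset of the keep variables is shown.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for `core_aab`: the subject-level flag BFLAG is
# repeated on every by visit record of a subject.
toy_core <- tribble(
  ~USUBJID, ~ADATYPE,  ~PARCAT1,                   ~AVISIT,   ~BFLAG,
  "01",     "ADA_BAB", "Anti-XANOMELINE Antibody", "OVERALL", 1,
  "01",     "ADA_BAB", "Anti-XANOMELINE Antibody", "OVERALL", 1,
  "02",     "ADA_BAB", "Anti-XANOMELINE Antibody", "OVERALL", 0,
  "02",     "ADA_BAB", "Anti-XANOMELINE Antibody", "OVERALL", 0
)

toy_bflag <- toy_core %>%
  filter(ADATYPE == "ADA_BAB") %>%
  # Collapse the repeated visit rows to one row per subject
  distinct(USUBJID, ADATYPE, PARCAT1, AVISIT, BFLAG) %>%
  mutate(
    PARAMCD = "BFLAGy",
    PARAM = paste("Baseline Pos/Neg,", PARCAT1, sep = " "),
    AVAL = BFLAG,
    AVALC = case_when(
      AVAL == 1 ~ "Y",
      AVAL == 0 ~ "N",
      TRUE ~ NA_character_
    )
  )
```

If `distinct()` returns more than one row per subject, the source flag is not constant across that subject's visit records and should be investigated before the parameter is built.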
# Begin Creation of each PARAM for the final ADAB format using main_aab --------
# First create "core_aab" with PARCAT1 to be the input for all the parameter sub-assemblies
core_aab <- main_aab %>%
mutate(
# SDTM V1.x: Assign PARCAT1 from ISTEST or customize
PARCAT1 = ISTEST,
# SDTM V2.x: Assign PARCAT1 using text values plus ADAPARM (this overrides the
# assignment above; keep only the one that matches your SDTM version)
PARCAT1 = case_when(
ADATYPE == "ADA_BAB" ~ paste("Anti-", ADAPARM, " Antibody", sep = ""),
ADATYPE == "ADA_NAB" ~ paste("Anti-", ADAPARM, " Neutralizing Antibody", sep = ""),
TRUE ~ NA_character_
),
# Initialize PARAMCD and PARAM
PARAMCD = ADAPARM,
PARAM = PARCAT1
)
# By Visit Primary ADA ISTESTCD Titer Results
adab_titer <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
mutate(
# For the titer records, append "Titer Units" to PARAM
PARAM = paste(PARCAT1, ", Titer Units", sep = "")
)
# By Visit NAB ISTESTCD Results
adab_nabvis <- core_aab %>%
filter(ADATYPE == "ADA_NAB") %>%
# These two flags, BASE and CHG are only kept on the primary ADA ISTESTCD test
select(-BASE, -CHG, -ADABLPFL, -ADPBLPFL)
# By Visit Main ADA Titer Interpreted RESULT data
adab_result <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
mutate(
PARAMCD = "RESULTy",
PARAM = paste("ADA interpreted per sample result,", PARCAT1, sep = " "),
AVALC = toupper(RESULTC),
AVAL = case_when(
AVALC == "NEGATIVE" ~ 0,
AVALC == "POSITIVE" ~ 1,
TRUE ~ NA_real_
),
AVALU = NA_character_
) %>%
# DTYPE is only kept on the parent ISTESTCD param, recompute CHG and BASE for RESULTy
select(-BASE, -CHG, -DTYPE)
# Derive BASE and Calculate Change from Baseline on BAB RESULT records
adab_result <- adab_result %>%
derive_var_base(
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM),
source_var = AVAL,
new_var = BASE,
filter = ABLFL == "Y"
) %>%
restrict_derivation(
derivation = derive_var_chg,
filter = is.na(ABLFL)
)
# By Visit NAB Interpreted RESULT data
adab_nabres <- core_aab %>%
filter(ADATYPE == "ADA_NAB") %>%
mutate(
PARAMCD = "RESULTy",
PARAM = paste("Nab interpreted per sample result,", PARCAT1, sep = " "),
AVALC = toupper(RESULTC),
AVAL = case_when(
AVALC == "NEGATIVE" ~ 0,
AVALC == "POSITIVE" ~ 1,
TRUE ~ NA_real_
),
AVALU = NA_character_
) %>%
# Drop the two flags (only kept on the primary ADA test); BASE and CHG are recomputed below
select(-BASE, -CHG, -ADABLPFL, -ADPBLPFL)
# Derive BASE and Calculate Change from Baseline on NAB RESULT records
adab_nabres <- adab_nabres %>%
derive_var_base(
by_vars = exprs(STUDYID, USUBJID, BASETYPE, ADATYPE, ADAPARM),
source_var = AVAL,
new_var = BASE,
filter = ABLFL == "Y"
) %>%
restrict_derivation(
derivation = derive_var_chg,
filter = is.na(ABLFL)
)
# By Visit Titer TFLAGV data -----------------------
# Note: By Visit parameters do not keep CHG, MRT and DTL variables
adab_tflagv <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
mutate(
PARAMCD = "TFLAGV",
PARAM = paste("Treatment related ADA by Visit,", PARCAT1, sep = " "),
AVAL = TFLAGV,
AVALC = as.character(AVAL),
AVALU = NA_character_
) %>%
# Drop BASE, CHG, MRT and DTL and the two --FL flags (are only kept on the primary ADA test)
select(-BASE, -CHG, -MRT, -DTL, -ADABLPFL, -ADPBLPFL)
# By Visit Titer PBFLAGV data ---------------------
# Note: By Visit parameters do not keep CHG, MRT and DTL variables
adab_pbflagv <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
mutate(
PARAMCD = "PBFLAGV",
PARAM = paste("Post Baseline Pos/Neg by Visit,", PARCAT1, sep = " "),
AVALC = case_when(
PBFLAGV == 1 | PBFLAGV == 2 ~ "POSITIVE",
PBFLAGV == 0 | PBFLAGV == 3 ~ "NEGATIVE",
TRUE ~ "MISSING"
),
AVAL = case_when(
AVALC == "POSITIVE" ~ 1,
AVALC == "NEGATIVE" ~ 0,
TRUE ~ NA_real_
),
AVALU = NA_character_
) %>%
# Drop BASE, CHG, MRT and DTL and the two --FL flags (are only kept on the primary ADA test)
select(-BASE, -CHG, -MRT, -DTL, -ADABLPFL, -ADPBLPFL)
# By Visit Titer ADASTATV data
# Note: By Visit parameters do not keep CHG, MRT and DTL variables
adab_adastatv <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
mutate(
PARAMCD = "ADASTATV",
PARAM = paste("ADA Status of a patient by Visit,", PARCAT1, sep = " "),
AVALC = ADASTATV,
AVAL = case_when(
AVALC == "ADA+" ~ 1,
AVALC == "ADA-" ~ 0,
TRUE ~ NA_real_
),
AVALU = NA_character_
) %>%
# Drop BASE, CHG, MRT and DTL and the two --FL flags (are only kept on the primary ADA test)
select(-BASE, -CHG, -MRT, -DTL, -ADABLPFL, -ADPBLPFL)
# The parameters below are subject-level summary parameters.
# Assign the overall AVISIT and AVISITN used for these parameters
core_aab <- core_aab %>%
mutate(
AVISIT = overall_avisit,
AVISITN = overall_avisitn
)
# Get Patient flag BFLAG
adab_bflag <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, BFLAG
) %>%
mutate(
PARAMCD = "BFLAGy",
PARAM = paste("Baseline Pos/Neg,", PARCAT1, sep = " "),
AVAL = BFLAG,
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag INDUCD
adab_incucd <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, TFLAG
) %>%
mutate(
PARAMCD = "INDUCDy",
PARAM = paste("Treatment induced ADA,", PARCAT1, sep = " "),
AVAL = case_when(
TFLAG == 1 ~ 1,
TRUE ~ 0
),
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag ENHANC
adab_enhanc <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, TFLAG
) %>%
mutate(
PARAMCD = "ENHANCy",
PARAM = paste("Treatment enhanced ADA,", PARCAT1, sep = " "),
AVAL = case_when(
TFLAG == 2 ~ 1,
TRUE ~ 0
),
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag EMERPOS
adab_emerpos <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, PBFLAG
) %>%
mutate(
PARAMCD = "EMERPOSy",
PARAM = paste("Treatment Emergent - Positive,", PARCAT1, sep = " "),
AVAL = case_when(
PBFLAG == 1 ~ 1,
TRUE ~ 0
),
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag TRUNAFF
adab_trunaff <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, TFLAG
) %>%
mutate(
PARAMCD = "TRUNAFFy",
PARAM = paste("Treatment unaffected,", PARCAT1, sep = " "),
AVAL = case_when(
TFLAG == 3 ~ 1,
TRUE ~ 0
),
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag EMERNEG
adab_emerneg <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, PBFLAG
) %>%
mutate(
PARAMCD = "EMERNEGy",
PARAM = paste("Treatment Emergent - Negative,", PARCAT1, sep = " "),
AVAL = case_when(
PBFLAG == 0 ~ 1,
TRUE ~ 0
),
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag NOTRREL
adab_notrrel <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, PBFLAG
) %>%
mutate(
PARAMCD = "NOTRRELy",
PARAM = paste("No treatment related ADA,", PARCAT1, sep = " "),
AVAL = case_when(
is.na(PBFLAG) ~ 1,
TRUE ~ 0
),
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag ADASTAT
adab_adastat <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, ADASTAT
) %>%
mutate(
PARAMCD = "ADASTATy",
PARAM = paste("ADA Status of a patient,", PARCAT1, sep = " "),
AVAL = ADASTAT,
AVALC = case_when(
AVAL == 1 ~ "POSITIVE",
AVAL == 0 ~ "NEGATIVE",
TRUE ~ "MISSING"
),
AVALU = NA_character_
)
# Get Patient flag TIMADA
adab_timada <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, TIMADA
) %>%
mutate(
PARAMCD = "TIMADAy",
PARAM = paste("Time to onset of ADA (Weeks),", PARCAT1, sep = " "),
AVAL = TIMADA,
AVALC = NA_character_,
AVALU = if_else(!is.na(AVAL), "WEEKS", NA_character_)
)
# Get Patient flag PERSADA
adab_persada <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, PERSADA
) %>%
mutate(
PARAMCD = "PERSADAy",
PARAM = paste("Persistent ADA,", PARCAT1, sep = " "),
AVAL = PERSADA,
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag TRANADA
adab_tranada <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, TRANADA
) %>%
mutate(
PARAMCD = "TRANADAy",
PARAM = paste("Transient ADA,", PARCAT1, sep = " "),
AVAL = TRANADA,
AVALC = case_when(
AVAL == 1 ~ "Y",
AVAL == 0 ~ "N",
TRUE ~ NA_character_
),
AVALU = NA_character_
)
# Get Patient flag NABSTAT
adab_nabstat <- core_aab %>%
filter(ADATYPE == "ADA_NAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, NABSTAT
) %>%
mutate(
PARAMCD = "NABSTATy",
PARAM = paste("Nab Status,", PARCAT1, sep = " "),
AVAL = NABSTAT,
AVALC = case_when(
AVAL == 1 ~ "POSITIVE",
AVAL == 0 ~ "NEGATIVE",
TRUE ~ "MISSING"
),
AVALU = NA_character_
)
# Get Patient flag ADADUR
adab_adadur <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, ADADUR
) %>%
mutate(
PARAMCD = "ADADURy",
PARAM = paste("ADA Duration (Weeks),", PARCAT1, sep = " "),
AVAL = ADADUR,
AVALC = NA_character_,
AVALU = if_else(!is.na(AVAL), "WEEKS", NA_character_)
)
# Get Patient flag FPPDTM
adab_fppdtm <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, FPPDTM
) %>%
mutate(
PARAMCD = "FPPDTMy",
PARAM = paste("First Post Dose Positive Datetime,", PARCAT1, sep = " "),
AVAL = as.numeric(FPPDTM),
AVALC = if_else(is.na(FPPDTM), NA_character_,
toupper(as.character(format(FPPDTM, "%d%b%Y:%H:%M:%S")))
),
AVALU = NA_character_
)
# Get Patient flag LPPDTM
adab_lppdtm <- core_aab %>%
filter(ADATYPE == "ADA_BAB") %>%
distinct(
STUDYID, USUBJID, PARCAT1, BASETYPE, PARAM, ADATYPE, ADAPARM,
ISTESTCD, ISTEST, ISCAT, ISBDAGNT, AVISITN, AVISIT, ISSPEC, LPPDTM
) %>%
mutate(
PARAMCD = "LPPDTMy",
PARAM = paste("Last Post Dose Positive Datetime,", PARCAT1, sep = " "),
AVAL = as.numeric(LPPDTM),
AVALC = if_else(is.na(LPPDTM), NA_character_,
toupper(as.character(format(LPPDTM, "%d%b%Y:%H:%M:%S")))
),
AVALU = NA_character_
)
In this section, we set together the desired parameter data sets
created in the prior section. In this example, all are kept. A specific
study might not have Nab data, or might not need the by visit parameters
(TFLAGV, PBFLAGV, ADASTATV) or
parameters like ADADUR, LPPDTM and
FPPDTM.
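The parameter data frames do not all carry the same columns (for example, `BASE` and `CHG` are dropped from the by visit flag parameters). The sketch below, using hypothetical toy data frames, illustrates that `bind_rows()` matches columns by name and fills columns missing from one input with `NA`.

```r
library(dplyr)
library(tibble)

# Hypothetical titer record that keeps BASE and CHG
toy_titer <- tibble(
  USUBJID = "01", PARAMCD = "XANOMELINE", AVAL = 8, BASE = 4, CHG = 4
)

# Hypothetical by visit flag record where BASE and CHG were dropped
toy_tflagv <- tibble(
  USUBJID = "01", PARAMCD = "TFLAGV", AVAL = 1
)

# Columns are matched by name; BASE and CHG become NA on the TFLAGV row
toy_study <- bind_rows(toy_titer, toy_tflagv)
```

Columns shared between inputs need compatible types, so it helps to keep `AVAL` numeric and `AVALC` character in every parameter data frame.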
# Set all the standard PARAM components together -----------------------------------
adab_paramcds <- bind_rows(
adab_titer, adab_nabvis, adab_result, adab_nabres, adab_bflag,
adab_incucd, adab_enhanc, adab_emerpos, adab_trunaff, adab_emerneg, adab_notrrel,
adab_adastat, adab_timada, adab_persada, adab_tranada, adab_nabstat
)
# This sample also has BY VISIT parameters
adab_visits <- bind_rows(adab_tflagv, adab_pbflagv, adab_adastatv)
# Create the final parameterized dataset --------------------------------------------
# To keep only standard parameters:
# adab_study <- adab_paramcds
# To include BY VISIT parameters and/or others:
adab_study <- bind_rows(adab_paramcds, adab_visits, adab_adadur, adab_fppdtm, adab_lppdtm)
# Drop temporary variables and ADA flag variables that are now parameterized. A final
# `select` statement further below customizes the variables that are kept.
adab_study <- adab_study %>%
select(
-TIMADA, -ADADUR, -TRANADA, -PERSADA, -TRANADAE, -PERSADAE, -INDUCED, -ENHANCED,
-RESULTC, -RESULTN, -ADASTAT, -BFLAG, -TFLAG, -PBFLAG, -FPPDTM, -LPPDTM, -TFLAGV, -PBFLAGV,
-ADASTATV, -nabstat_opt1, -nabstat_opt2, -NABSTAT, -MAXCHG, -VALIDBASE, -VALIDPOST,
-tdur, -ADASTAT_MAIN, -NABPOSTMISS, -TRTSDT
)
In this section, compute or merge in any additional flags or
ADSL variables.
Add ADSL variables
# Merge in ADSL Values --------------------------------
adab_adsl <- adab_study %>%
derive_vars_merged(
dataset_add = adsl,
by_vars = exprs(STUDYID, USUBJID)
)
# Compute COHORT From ADSL, other source could be ADSUB
adab_cohort <- adab_adsl %>%
derive_vars_merged(
dataset_add = adsl,
new_vars = exprs(COHORT = ARMCD),
by_vars = exprs(STUDYID, USUBJID)
)
# Compute optional ADAFL
# Method could be ADSL.SAFFL, ADAB.ADPBLPFL by ISTESTCD, etc.
adab_adafl <- adab_cohort %>%
derive_vars_merged(
dataset_add = adsl,
new_vars = exprs(ADAFL = SAFFL),
by_vars = exprs(STUDYID, USUBJID)
)
In this section we create a data frame to map above
PARAMCD and PARAM values to final study-specific values.
This step is important because the final
PARAMCD/PARAM values must be unique. The number suffixes
start with “1” and increment each time the core parameter name is repeated.
For example, RESULT exists for both ADA and Nab for Drug
ABC. There could also be two analyte/drug combinations (ISBDAGNT is
“XANOMELINE” and ISBDAGNT is “Y012345678”).
After merging the adab_param_data, final adjustments are
made to PARAMCD and PARAM values. These
adjustments depend on whether IS is based on SDTM v2.0 or
an earlier version (prior to v1.8).
The last step is computing ASEQ. ASEQ should be
computed last since the sort key includes PARAMCD, whose values
are finalized in this step.
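The mapping and numbering steps can be sketched on toy data as follows. This is an illustration only: it uses plain dplyr (`left_join()` and `row_number()`) as stand-ins for the admiral functions used in the template, and the data frames `toy_records` and `toy_map` are hypothetical.

```r
library(dplyr)
library(tibble)

# Hypothetical records that still carry placeholder PARAMCD values
toy_records <- tribble(
  ~USUBJID, ~ADATYPE,  ~PARAMCD,  ~PARAM,
  "01",     "ADA_NAB", "RESULTy", "Nab interpreted per sample result",
  "01",     "ADA_BAB", "RESULTy", "ADA interpreted per sample result",
  "01",     "ADA_BAB", "BFLAGy",  "Baseline Pos/Neg"
)

# Hypothetical mapping from placeholder to final values
toy_map <- tribble(
  ~PARAMCD,  ~ADATYPE,  ~PARAMCD_NEW, ~PARAM_SUFFIX,
  "RESULTy", "ADA_BAB", "RESULT1",    "(1)",
  "RESULTy", "ADA_NAB", "RESULT2",    "(2)",
  "BFLAGy",  "ADA_BAB", "BFLAG1",     "(1)"
)

toy_final <- toy_records %>%
  left_join(toy_map, by = c("PARAMCD", "ADATYPE")) %>%
  mutate(
    PARAM = paste(PARAM, PARAM_SUFFIX, sep = " "),
    PARAMCD = PARAMCD_NEW
  ) %>%
  select(-PARAMCD_NEW, -PARAM_SUFFIX) %>%
  # Number the records only after PARAMCD is final, because it is a sort key
  arrange(USUBJID, PARAMCD) %>%
  group_by(USUBJID) %>%
  mutate(ASEQ = row_number()) %>%
  ungroup()
```

Because the same placeholder (`RESULTy`) maps to different final codes depending on `ADATYPE`, the join key has to include every variable that distinguishes the repeated parameter.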
# Study Specific Specs Post-Processing ------------------------------------
# Create a Tibble to map above PARAMCD and PARAM to final study spec values
# Multiple NAB and ADA analytes example usage:
# If ISTESTCD is 'ADA_BAB' and ISBDAGNT is 'XANOMELINE', RESULT1, ADASTAT1, etc. PARAM_SUFFIX = '(1)'
# If ISTESTCD is 'ADA_BAB' and ISBDAGNT is 'Y012345678', RESULT2, ADASTAT2, etc. PARAM_SUFFIX = '(2)'
# If ISTESTCD is 'ADA_NAB' and ISBDAGNT is 'XANOMELINE', RESULT3, etc. PARAM_SUFFIX = '(3)'
# If ISTESTCD is 'ADA_NAB' and ISBDAGNT is 'Y012345678', RESULT4, etc. PARAM_SUFFIX = '(4)'
adab_param_data <- tribble(
~PARAMCD, ~ADATYPE, ~ADAPARM, ~PARAMCD_NEW, ~PARAM_SUFFIX,
"XANOMELINE", "ADA_BAB", "XANOMELINE", "XANOMELINE", "(1)",
"XANOMELINE", "ADA_NAB", "XANOMELINE", "XANOMELINE", "(1)",
"RESULTy", "ADA_BAB", "XANOMELINE", "RESULT1", "(1)",
"RESULTy", "ADA_NAB", "XANOMELINE", "RESULT2", "(2)",
"BFLAGy", "ADA_BAB", "XANOMELINE", "BFLAG1", "(1)",
"INDUCDy", "ADA_BAB", "XANOMELINE", "INDUCD1", "(1)",
"ENHANCy", "ADA_BAB", "XANOMELINE", "ENHANC1", "(1)",
"EMERPOSy", "ADA_BAB", "XANOMELINE", "EMERPOS1", "(1)",
"TRUNAFFy", "ADA_BAB", "XANOMELINE", "TRUNAFF1", "(1)",
"EMERNEGy", "ADA_BAB", "XANOMELINE", "EMERNEG1", "(1)",
"NOTRRELy", "ADA_BAB", "XANOMELINE", "NOTRREL1", "(1)",
"ADASTATy", "ADA_BAB", "XANOMELINE", "ADASTAT1", "(1)",
"TIMADAy", "ADA_BAB", "XANOMELINE", "TIMADA1", "(1)",
"PERSADAy", "ADA_BAB", "XANOMELINE", "PERSADA1", "(1)",
"TRANADAy", "ADA_BAB", "XANOMELINE", "TRANADA1", "(1)",
"NABSTATy", "ADA_NAB", "XANOMELINE", "NABSTAT1", "(1)",
"ADASTATV", "ADA_BAB", "XANOMELINE", "ADASTTV1", "(1)",
"TFLAGV", "ADA_BAB", "XANOMELINE", "TFLAGV1", "(1)",
"PBFLAGV", "ADA_BAB", "XANOMELINE", "PBFLAGV1", "(1)",
"LPPDTMy", "ADA_BAB", "XANOMELINE", "LPPDTM1", "(1)",
"FPPDTMy", "ADA_BAB", "XANOMELINE", "FPPDTM1", "(1)",
"ADADURy", "ADA_BAB", "XANOMELINE", "ADADUR1", "(1)",
)
# Merge the Parameter dataset into the main data
adab_params <- adab_adafl %>%
derive_vars_merged(
dataset_add = adab_param_data,
by_vars = exprs(PARAMCD, ADAPARM, ADATYPE)
) %>%
# for the original data, assign PARAM
mutate(
PARAM = case_when(
!is.na(PARAM_SUFFIX) ~ paste(PARAM, PARAM_SUFFIX, sep = " "),
TRUE ~ PARAM
),
# Example to assign PARAMCD based on ADAPARM assigned at program start (SDTM.IS < v2.0)
PARAMCD = case_when(
PARAMCD == ADAPARM ~ PARAMCD,
TRUE ~ PARAMCD_NEW
),
# Example to assign PARAMCD based on ISTESTCD type and the first 5 characters of ISBDAGNT (SDTM.IS >= v2.0)
PARAMCD = case_when(
ISTESTCD == "ADA_BAB" & PARAMCD == "XANOMELINE" ~
paste("BAB", substr(ISBDAGNT, 1, 5), sep = ""),
ISTESTCD == "ADA_NAB" & PARAMCD == "XANOMELINE" ~
paste("NAB", substr(ISBDAGNT, 1, 5), sep = ""),
TRUE ~ PARAMCD
)
)
# Sort by the key variables then compute ASEQ
adab_prefinal <- adab_params %>%
# Calculate ASEQ
derive_var_obs_number(
new_var = ASEQ,
by_vars = exprs(STUDYID, USUBJID),
order = exprs(PARCAT1, PARAMCD, BASETYPE, NFRLT, AFRLT, ISSEQ),
check_type = "error"
)
Select final variables to keep into ADAB
# Choose final variables to keep
# For SDTM V1.x, consider removing ISBDAGNT
adab <- adab_prefinal %>%
select(
STUDYID, USUBJID, SUBJID, SITEID, ASEQ,
REGION1, COUNTRY, ETHNIC,
AGE, AGEU, SEX, RACE,
SAFFL, TRT01P, TRT01A,
TRTSDTM, TRTSDT, TRTEDTM, TRTEDT,
ISSEQ, ISTESTCD, ISTEST, ISCAT, ISBDAGNT,
ISSTRESC, ISSTRESN, ISSTRESU,
ISSTAT, ISREASND, ISSPEC,
DTL, MRT,
VISITNUM, VISIT, VISITDY,
EPOCH, ISDTC, ISDY, ISTPT, ISTPTNUM,
PARAM, PARAMCD, PARCAT1,
AVAL, AVALC, AVALU,
BASETYPE, BASE, CHG, DTYPE,
ADTM, ADT, ADY, ATMF,
AVISIT, AVISITN, ATPT,
APHASE, APHASEN, APERIOD, APERIODC,
FANLDTM, FANLDT, FANLTM, FANLTMF,
NFRLT, AFRLT, FRLTU,
ABLFL, ADABLPFL, ADPBLPFL, ADAFL
)
Add Labels and Attributes
Note that attributes may not be preserved in some cases after processing with admiral. The recommended approach is to apply variable labels and other metadata as a final step in your data derivation process using packages like:
- metacore: establishes a common foundation for the use of metadata within an R session.
- metatools: enables the use of metacore objects. Metatools can be used to build datasets or enhance columns in existing datasets, as well as to check datasets against the metadata.
- xportr: provides functionality to associate all metadata information with a local R data frame, perform dataset-level validation checks, and convert the data frame into a transport v5 file (xpt).
NOTE: Together with admiral, these packages comprise an End to End pipeline under the umbrella of the pharmaverse. An example of applying metadata and performing the associated checks can be found in the pharmaverse E2E example.
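To illustrate the idea of applying labels as the very last step, here is a minimal base R sketch. It is not a substitute for the metadata packages listed above; the label vector `adab_labels` and the data frame `toy_adab` are hypothetical, and in practice the labels would come from the study specification.

```r
# Hypothetical label specification (assumed values, for illustration only)
adab_labels <- c(
  USUBJID = "Unique Subject Identifier",
  PARAMCD = "Parameter Code",
  AVAL = "Analysis Value"
)

# Hypothetical stand-in for the final ADAB data frame
toy_adab <- data.frame(USUBJID = "01", PARAMCD = "BFLAG1", AVAL = 1)

# Attach a "label" attribute to every column that has a label defined
for (var in intersect(names(adab_labels), names(toy_adab))) {
  attr(toy_adab[[var]], "label") <- adab_labels[[var]]
}
```

Applying the labels after all derivations avoids losing them in intermediate steps that rebuild columns.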
