This article describes creating an ADRS ADaM dataset in ovarian cancer studies based on Gynecological Cancer Intergroup (GCIG) criteria. Note that only the GCIG-specific steps are covered in this vignette. For detailed guidance on all the steps, refer to Creating ADRS (Including Non-standard Endpoints).
Introduction
Carcinoma antigen-125 (CA-125) is the most commonly used biomarker in ovarian cancer. The Gynecological Cancer Intergroup (GCIG) proposed criteria for CA-125 response and progression and specified the situations in which the CA-125 criteria can be used. These guidelines are increasingly used in clinical trials for ovarian cancer, often to define secondary endpoints.
GCIG recommendations for CA-125 criteria for response and progression in various clinical situations
Use case | Use Recommended by GCIG | Not Standard and Needs Further Validation | Not Recommended by GCIG |
---|---|---|---|
First-line trials | CA-125 progression | | CA-125 response |
Maintenance or consolidation trials | | CA-125 response and progression | |
Relapse trials | CA-125 response and progression | | |
However, the CA-125 criteria can be a bit tricky to use in programming, especially when used alongside the RECIST 1.1.
We aim to share our current knowledge and experience in implementing GCIG criteria for ovarian clinical trials. Additionally, we have made certain assumptions regarding how data is collected on CRFs to perform response analysis according to the GCIG criteria. We hope this vignette provides valuable guidance on ADRS programming and highlights key considerations for data collection in relation to these criteria.
For more information about the GCIG criteria, see the GCIG guidelines on response criteria in ovarian cancer.
CA-125 response categories
In what follows, ULRR stands for Upper Limit of Reference Range. The CA-125 response categories for ovarian cancer are:
- CA-125 Complete Response: baseline CA-125 >= 2 * ULRR, later reduced by at least 50% to within the normal range, confirmed at least 4 weeks later.
- CA-125 Partial Response: baseline CA-125 >= 2 * ULRR, later reduced by at least 50% but not to within the normal range, confirmed at least 4 weeks later.
- Stable Disease: CA-125 level does not meet the criteria for either partial response or progressive disease.
- Progression: defined as CA-125 >= 2 * ULRR or CA-125 >= 2 * nadir on 2 occasions at least 1 week apart, depending on the category:
  - category A: CA-125 elevated at baseline returning to normal after treatment: CA-125 >= 2 * ULRR on 2 occasions at least 1 week apart.
  - category B: CA-125 elevated at baseline not returning to normal after treatment: CA-125 >= 2 * nadir value (lowest level reached by CA-125) on 2 occasions at least 1 week apart.
  - category C: CA-125 within reference range at baseline: CA-125 >= 2 * ULRR on 2 occasions at least 1 week apart.
- Not Evaluable: the patient’s response cannot be evaluated, for reasons such as receiving mouse antibodies or having medical/surgical interference with their peritoneum or pleura during the previous 28 days.
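The progression rule above can be sketched in base R. This is only an illustration under stated assumptions, not part of admiral or admiralonco: the helper name is_ca125_pd() and its arguments are hypothetical, and the "2 occasions at least 1 week apart" check is simplified to ask whether any two qualifying measurements lie at least 7 days apart.

```r
# Hypothetical helper, for illustration only.
# dates:      Date vector of CA-125 measurement dates for one patient
# values:     CA-125 results on those dates
# ulrr:       upper limit of the reference range
# nadir:      lowest CA-125 level reached (used when CA-125 never normalized)
# normalized: TRUE for categories A and C, FALSE for category B
is_ca125_pd <- function(dates, values, ulrr, nadir, normalized) {
  # Categories A and C compare against 2 * ULRR, category B against 2 * nadir
  threshold <- if (normalized) 2 * ulrr else 2 * nadir
  hits <- dates[values >= threshold]
  # Simplification: two qualifying measurements at least 7 days apart
  length(hits) >= 2 && as.numeric(max(hits) - min(hits)) >= 7
}

dates <- as.Date(c("2024-01-01", "2024-01-10", "2024-01-20"))
values <- c(30, 80, 90)

# CA-125 normalized, so the threshold is 2 * 35 = 70
is_ca125_pd(dates, values, ulrr = 35, nadir = 30, normalized = TRUE)
# CA-125 never normalized, so the threshold is 2 * 100 = 200
is_ca125_pd(dates, values, ulrr = 35, nadir = 100, normalized = FALSE)
```

A study implementation would also need to handle the nadir changing over time and any protocol-specific confirmation window.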
Ways the data is collected in the Ovarian Cancer study CRF
Required:
- Serum CA-125 results as per protocol, mapped to the LB SDTM domain.
- RECIST 1.1 tumor size and response as per protocol, mapped to the RS/TR/TU SDTM domains.
Note: CA-125 collection and RECIST 1.1 tumor assessment may not always occur at the same visit. CA-125 can be collected more frequently than tumor assessments.
May be Collected:
- CA-125 progression
- CA-125 response
- Combined response of CA-125 and RECIST 1.1 based on GCIG criteria
Assumptions:
For this vignette, we assume that the following information is collected on the CRF:
- CA-125 confirmed responses are collected on the CRF and mapped to the SDTM RS domain. Please note that we are not programmatically confirming the CA-125 response. The date of CA-125 response is assigned to the date of the first measurement when the CA-125 level is reduced by 50%. The date of CA-125 progression is assigned to the date of the first measurement that meets progression criteria. This is assumed to be collected on the CRF CA-125 response page.
- CA-125 responses are mapped to CR, PR, SD, PD, NE as shown below.

CA-125 response per GCIG | CA-125 response mapped |
---|---|
Response within Normal Range | CR |
Response but not within Normal Range | PR |
Non-Response/Non-PD | SD |
PD | PD |
NE | NE |

- The combined CA-125 and RECIST 1.1 response per visit, where both RECIST 1.1 and CA-125 are evaluated, is also collected on the CRF. The investigator should already take into account the CA-125 confirmed response and the RECIST 1.1 response at each visit, based on the GCIG document.
- In SUPPRS there are records with the QNAM and QLABEL values below:

QNAM | QLABEL | QVAL | Purpose | Use case |
---|---|---|---|---|
CA125EFL | CA-125 response evaluable | Y/N | Indicates population evaluable for CA-125 response (baseline CA-125 >= 2 * ULRR and no mouse antibodies) | CA125EFL variable |
CAELEPRE | Elevated pre-treatment CA-125 | Y/N | Indicates CA-125 level at baseline (Y - elevated, N - not elevated) | Derivation of PD category (MCRIT1/MCRIT1ML/MCRIT1MN) |
MOUSEANT | Received mouse antibodies | Y | Indicates if a prohibited therapy was received | Derivation of ANL02FL |
CA50RED | >=50% reduction from baseline | Y | Indicates response, but does not distinguish between CR and PR | Not used in further derivations |
CANORM2X | CA125 normal, lab increased >=2x ULRR | Y | Indicates PD category A or C | Derivation of PD category (MCRIT1/MCRIT1ML/MCRIT1MN) |
CNOTNORM | CA125 not norm, lab increased >=2x nadir | Y | Indicates PD category B | Derivation of PD category (MCRIT1/MCRIT1ML/MCRIT1MN) |
The above SUPPRS records refer only to records with RS.RSCAT = "CA125". The exception is QNAM = "MOUSEANT", which appears for both RS.RSCAT = "CA125" and RS.RSCAT = "RECIST 1.1 - CA125" (for the same visits).
If the data are collected in other ways and similar information is not available in the SUPPRS dataset, it should be derived in advance from the LB CA-125 measurement records. Information on whether the patient has received mouse antibodies should also be taken into account.
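As a rough sketch of such an advance derivation, the baseline flags could be computed from LB along the following lines. Everything here is an assumption for illustration: the example data are made up, LBSTRESN and LBSTNRHI are taken to hold the baseline CA-125 result and the ULRR, "elevated" is taken to mean above the ULRR, and MOUSEANT is assumed to have been merged onto the baseline record already.

```r
# Made-up baseline CA-125 records, one per subject (illustration only)
lb_base <- data.frame(
  USUBJID = c("01", "02", "03"),
  LBSTRESN = c(120, 50, 20), # baseline CA-125 result
  LBSTNRHI = c(35, 35, 35), # upper limit of reference range (ULRR)
  MOUSEANT = c(NA, NA, "Y"), # "Y" if mouse antibodies were received
  stringsAsFactors = FALSE
)

# Elevated pre-treatment CA-125: assumed here to mean above the ULRR
lb_base$CAELEPRE <- ifelse(lb_base$LBSTRESN > lb_base$LBSTNRHI, "Y", "N")

# Evaluable for CA-125 response: baseline >= 2 * ULRR and no mouse antibodies
lb_base$CA125EFL <- ifelse(
  lb_base$LBSTRESN >= 2 * lb_base$LBSTNRHI & is.na(lb_base$MOUSEANT),
  "Y", "N"
)

lb_base[, c("USUBJID", "CAELEPRE", "CA125EFL")]
```

The timing requirement for the pre-treatment sample and the handling of missing reference ranges are left out of this sketch and would need to follow the study's own rules.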
Programming Workflow
- Read in Data
- Pre-processing of Input Records
- Analysis Flag Derivation
- CA-125 Progression
- CA-125 Progression Category
- CA-125 Best Confirmed Overall Response
- Combined Best Unconfirmed Overall Response
- Combined Best Confirmed Overall Response
Read in Data
To start, all data frames needed for the creation of ADRS should be read into the environment. This will be a company-specific process. Some of the data frames needed are ADSL and RS.
For example purposes, the SDTM and ADaM datasets (based on CDISC Pilot test data) included in pharmaversesdtm and pharmaverseadam are used.
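library(admiral)
library(admiralonco)
library(dplyr)
library(tibble)
adsl <- pharmaverseadam::adsl
# GCIG SDTM data
rs <- pharmaversesdtm::rs_onco_ca125
supprs <- pharmaversesdtm::supprs_onco_ca125
# Merge the supplemental qualifiers onto RS and convert blank strings to NA
rs <- combine_supp(rs, supprs)
rs <- convert_blanks_to_na(rs)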
At this step, it may be useful to join ADSL to your RS domain. Only the ADSL variables used for derivations are selected at this step.
adsl_vars <- exprs(RANDDT, TRTSDT)
adrs <- derive_vars_merged(
  rs,
  dataset_add = adsl,
  new_vars = adsl_vars,
  by_vars = get_admiral_option("subject_keys")
)
Pre-processing of Input Records
Select Overall Response Records per Response Criteria Category
The next step is to assign parameter level values such as PARAMCD, PARAM, PARAMN, PARCAT1, etc. For this, a lookup can be created based on the SDTM RSCAT, RSTESTCD and RSEVAL values to join to the source data.
param_lookup <- tribble(
  ~RSCAT, ~RSTESTCD, ~RSEVAL,
  ~PARAMCD, ~PARAM, ~PARAMN,
  ~PARCAT1, ~PARCAT1N, ~PARCAT2, ~PARCAT2N,
  # CA-125
  "CA125", "OVRLRESP", "INVESTIGATOR",
  "OVRCA125", "CA-125 Overall Response by Investigator", 1,
  "CA-125", 1, "Investigator", 1,
  # RECIST 1.1
  "RECIST 1.1", "OVRLRESP", "INVESTIGATOR",
  "OVRR11", "RECIST 1.1 Overall Response by Investigator", 2,
  "RECIST 1.1", 2, "Investigator", 1,
  # Combined
  "RECIST 1.1 - CA125", "OVRLRESP", "INVESTIGATOR",
  "OVRR11CA", "Combined Overall Response by Investigator", 3,
  "Combined", 3, "Investigator", 1
)
This lookup can now be joined to the source data to assign the parameter values:
adrs <- derive_vars_merged_lookup(
  adrs,
  dataset_add = param_lookup,
  by_vars = exprs(RSCAT, RSTESTCD, RSEVAL)
)
Partial Date Imputation and Deriving ADT, ADTF, AVISIT etc.
If your data collection allows for partial dates, you could apply a company-specific imputation rule at this stage when deriving ADT. In this example, a missing day is imputed to the last possible date.
adrs <- adrs %>%
  derive_vars_dt(
    dtc = RSDTC,
    new_vars_prefix = "A",
    highest_imputation = "D",
    date_imputation = "last"
  ) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ADT)
  ) %>%
  derive_vars_dtm(
    dtc = RSDTC,
    new_vars_prefix = "A",
    highest_imputation = "D",
    date_imputation = "last",
    flag_imputation = "time"
  ) %>%
  mutate(AVISIT = VISIT)
Derive AVALC and AVAL
Since the set of CA-125 response categories is a subset of the RECIST 1.1 response categories and the set of combined response categories overlaps with the set of RECIST 1.1 response categories, we can use the admiralonco::aval_resp() function to assign AVAL (ordered from best to worst response).
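A minimal sketch of this step in base R follows. It rests on two assumptions that should be checked against your own data and package version: that the collected response is held in RSSTRESC, and that the local stand-in aval_resp_sketch() (a hypothetical name) mirrors the best-to-worst numbering used by admiralonco::aval_resp(). In a real program, AVAL would be assigned with aval_resp(AVALC) directly.

```r
# Hypothetical stand-in for admiralonco::aval_resp(); the numbering is
# assumed to follow the best-to-worst order of that function.
aval_resp_sketch <- function(avalc) {
  lookup <- c(
    CR = 1, PR = 2, SD = 3, `NON-CR/NON-PD` = 4, PD = 5, NE = 6, MISSING = 7
  )
  # Responses not in the lookup (including NA) map to NA
  unname(lookup[avalc])
}

# Made-up response records (illustration only)
rs_example <- data.frame(
  RSSTRESC = c("CR", "PD", "NE", NA),
  stringsAsFactors = FALSE
)

# AVALC is assumed to be taken from RSSTRESC; AVAL is its numeric version
rs_example$AVALC <- rs_example$RSSTRESC
rs_example$AVAL <- aval_resp_sketch(rs_example$AVALC)
rs_example
```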
Analysis Flag Derivation
Deriving ANzzFL is an opportunity to exclude any records that should not contribute to any downstream parameter derivations.
Flag One Assessment at Each Analysis Date (ANL01FL)
In the below example:
- we consider only valid assessments and those occurring on or after the randomization date,
- subject identifier variables (usually STUDYID and USUBJID), PARAMCD and ADT are a unique key,
- if there is more than one assessment at a date, the worst one is flagged (ensure that the appropriate mode is set in admiral::derive_var_extreme_flag()),
- to get the correct ordering, we define the worst_resp() function.
worst_resp <- function(arg) {
  case_when(
    arg == "NE" ~ 1,
    arg == "CR" ~ 2,
    arg == "PR" ~ 3,
    arg == "SD" ~ 4,
    arg == "NON-CR/NON-PD" ~ 5,
    arg == "PD" ~ 6,
    TRUE ~ 0
  )
}
adrs <- adrs %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = c(get_admiral_option("subject_keys"), exprs(PARAMCD, ADT)),
      order = exprs(worst_resp(AVALC), RSSEQ),
      new_var = ANL01FL,
      mode = "last"
    ),
    filter = !is.na(AVAL) & ADT >= RANDDT
  )
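To see why mode = "last" selects the worst response, the same ordering idea can be reproduced on a toy example in base R. The data and the rank_worst lookup below are made up for illustration and do not use admiral; the ranks simply copy those of worst_resp().

```r
# Ranks copied from worst_resp(): a higher rank means a worse response
rank_worst <- c(
  NE = 1, CR = 2, PR = 3, SD = 4, `NON-CR/NON-PD` = 5, PD = 6
)

# Made-up assessments: two on the same date, one on a later date
asmt <- data.frame(
  USUBJID = c("01", "01", "01"),
  ADT = as.Date(c("2024-02-01", "2024-02-01", "2024-03-01")),
  AVALC = c("PR", "PD", "SD"),
  RSSEQ = 1:3,
  stringsAsFactors = FALSE
)

# Sort within subject and date by rank, then by sequence number
asmt <- asmt[order(asmt$USUBJID, asmt$ADT, rank_worst[asmt$AVALC], asmt$RSSEQ), ]

# The last record per subject and date is the worst one, so it gets the flag
is_last <- !duplicated(asmt[c("USUBJID", "ADT")], fromLast = TRUE)
asmt$ANL01FL <- ifelse(is_last, "Y", NA_character_)
asmt
```

On 2024-02-01 the PD record is flagged rather than the PR record, because PD has the higher rank and therefore sorts last.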
Flag Assessments up to First PD or Receipt of Mouse Antibodies (ANL02FL)
To restrict response data to assessments up to and including the first reported progressive disease, an ANL02FL flag can be created using the admiral function admiral::derive_var_relative_flag().
According to GCIG guidelines, assessments should not be considered after patients have received mouse antibodies, or if there has been medical and/or surgical interference with their peritoneum or pleura during the previous 28 days.
Note: In our vignette, we use the variable MOUSEANT, which indicates whether mouse antibodies have been received. The user can similarly include a variable that indicates whether there has been medical and/or surgical interference with the patient’s peritoneum or pleura in the past 28 days.
adrs <- adrs %>%
  derive_var_relative_flag(
    by_vars = c(get_admiral_option("subject_keys"), exprs(PARAMCD)),
    order = exprs(ADT, RSSEQ),
    new_var = ANL02FL,
    condition = (AVALC == "PD" | MOUSEANT == "Y"),
    mode = "first",
    selection = "before",
    inclusive = TRUE
  )
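The intent of this flag can be illustrated on a single subject in base R. This is a simplified sketch with made-up data, not a replacement for derive_var_relative_flag(). It assumes the records are already sorted by date, and it flags every record when no PD or mouse antibody record exists, which is an assumption of the sketch.

```r
# Made-up assessments for one subject and one parameter, sorted by date
asmt <- data.frame(
  USUBJID = rep("01", 4),
  ADT = as.Date(c("2024-01-15", "2024-02-15", "2024-03-15", "2024-04-15")),
  AVALC = c("SD", "PR", "PD", "SD"),
  MOUSEANT = NA_character_,
  stringsAsFactors = FALSE
)

# Records that end the evaluable period: first PD or mouse antibodies received
hit <- asmt$AVALC == "PD" | (!is.na(asmt$MOUSEANT) & asmt$MOUSEANT == "Y")
first_hit <- which(hit)[1]

# Keep everything up to and including the first such record
keep <- if (is.na(first_hit)) {
  rep(TRUE, nrow(asmt)) # assumption: no event means all records are kept
} else {
  seq_len(nrow(asmt)) <= first_hit
}
asmt$ANL02FL <- ifelse(keep, "Y", NA_character_)
asmt
```

The SD assessment recorded after the PD is left unflagged, so it cannot contribute to later parameter derivations.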
CA-125 Response Evaluable Flag
Patients can only be evaluable for a CA-125 response if they have a pre-treatment sample that is at least twice the upper limit of the reference range and was taken within 2 weeks before starting treatment. In our case, this information is collected only on CA-125 records, while a flag is needed at the patient level.
The CA-125 Response Evaluable Flag can be derived using the derive_var_merged_exist_flag() function.
adrs <- adrs %>%
  select(-CA125EFL) %>%
  derive_var_merged_exist_flag(
    dataset_add = adrs,
    by_vars = get_admiral_option("subject_keys"),
    new_var = CA125EFL,
    condition = (CA125EFL == "Y")
  )
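The effect of this step, spreading a record-level flag to all records of the same subject, can be shown with a small base R sketch. The data are made up, and the choice to leave non-evaluable subjects as NA rather than "N" is an assumption of the sketch.

```r
# Made-up records: the flag is only populated on the CA-125 rows
recs <- data.frame(
  USUBJID = c("01", "01", "02", "02"),
  PARAMCD = c("OVRCA125", "OVRR11", "OVRCA125", "OVRR11"),
  CA125EFL = c("Y", NA, "N", NA),
  stringsAsFactors = FALSE
)

# Subjects with at least one record where the flag is "Y"
evaluable <- unique(recs$USUBJID[!is.na(recs$CA125EFL) & recs$CA125EFL == "Y"])

# Populate the flag on every record of those subjects
recs$CA125EFL <- ifelse(recs$USUBJID %in% evaluable, "Y", NA_character_)
recs
```

After this step the RECIST 1.1 and combined records of subject 01 also carry the flag, which is what the later filters on CA125EFL rely on.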
Select Source Assessments for Parameter Derivations
For the next parameter derivations we consider:
- one assessment at each analysis date (ANL01FL = "Y"),
- assessments up to and including first PD or mouse antibodies, whichever comes first (ANL02FL = "Y"),
- the CA-125 Response Evaluable population (not for the derivation of CA-125 progression).
# used for derivation of CA-125 PD
ovr_pd <- filter(adrs, PARAMCD == "OVRCA125" & ANL01FL == "Y" & ANL02FL == "Y")
# used for derivation of CA-125 response parameters
ovr_ca125 <- filter(adrs, PARAMCD == "OVRCA125" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")
# used for derivation of unconfirmed best overall response from RECIST 1.1 and confirmed CA-125 together
ovr_ubor <- filter(adrs, PARAMCD == "OVRR11CA" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")
# used for derivation of confirmed best overall response from RECIST 1.1 and confirmed CA-125 together
ovr_r11 <- filter(adrs, PARAMCD == "OVRR11" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")
Now that we have the input records prepared above with any company-specific requirements, we can start to derive new parameter records.
CA-125 Progression
The function admiral::derive_extreme_records() can be used to find the date of first CA-125 PD.
adrs <- adrs %>%
  derive_extreme_records(
    dataset_ref = adsl,
    dataset_add = ovr_pd,
    by_vars = get_admiral_option("subject_keys"),
    filter_add = AVALC == "PD",
    order = exprs(ADT),
    mode = "first",
    keep_source_vars = exprs(everything()),
    set_values_to = exprs(
      PARAMCD = "PDCA125",
      PARAM = "CA-125 Disease Progression by Investigator",
      PARAMN = 4,
      PARCAT1 = "CA-125",
      PARCAT1N = 1,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
CA-125 Progression Category
To obtain additional variables that store the progression category for participants who have progressed, we will create MCRITy/MCRITyML/MCRITyMN variables according to the ADaMIG guidelines.
In our example we assume that the SUPPRS dataset contains QNAM values that uniquely classify a progression into one of the A, B or C categories as per GCIG criteria.
Having previously transposed and merged the SUPPRS dataset, we have a suitable structure for deriving MCRIT variables, since all the variables needed to check the conditions are in one row.
For this purpose admiral provides the derive_vars_cat() function (see documentation for details).
definition_mcrit <- exprs(
  ~PARAMCD, ~condition,
  ~MCRIT1ML, ~MCRIT1MN,
  "PDCA125", CAELEPRE == "Y" & CANORM2X == "Y",
  "Patients with elevated CA-125 before treatment and normalization of CA-125 (A)", 1,
  "PDCA125", CAELEPRE == "Y" & CNOTNORM == "Y",
  "Patients with elevated CA-125 before treatment, which never normalizes (B)", 2,
  "PDCA125", CAELEPRE == "N" & CANORM2X == "Y",
  "Patients with CA-125 in the reference range before treatment (C)", 3
)
adrs <- adrs %>%
  mutate(MCRIT1 = if_else(PARAMCD == "PDCA125", "PD Category Group", NA_character_)) %>%
  derive_vars_cat(
    definition = definition_mcrit,
    by_vars = exprs(PARAMCD)
  )
Derivation of the progression category may be more complex if the data is collected in a different way and the user needs to check whether:
- baseline CA-125 level was elevated,
- CA-125 level normalizes over the course of the study,
- CA-125 level >= 2 * ULRR (if CA-125 level normalizes) or CA-125 level >= 2 * nadir (if CA-125 level never normalizes).
If the criterion for the PD category draws on multiple rows (different parameters or multiple rows for a single parameter), this will require the creation of a new parameter.
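Once the first two checks have been reduced to one value per subject, the category assignment itself is simple. The base R sketch below assumes this has already been done. The function name pd_category() and the NORMFL variable are hypothetical, and the data are made up.

```r
# Hypothetical helper: assign GCIG PD category A, B or C.
# caelepre:   "Y" if baseline CA-125 was elevated, "N" otherwise
# normalized: TRUE if CA-125 returned to normal during the study
pd_category <- function(caelepre, normalized) {
  ifelse(
    caelepre == "N", "C", # within reference range at baseline
    ifelse(normalized, "A", "B") # elevated at baseline: normalized or not
  )
}

# Made-up subject-level summary (illustration only)
subj <- data.frame(
  USUBJID = c("01", "02", "03"),
  CAELEPRE = c("Y", "Y", "N"),
  NORMFL = c(TRUE, FALSE, TRUE), # hypothetical normalization indicator
  stringsAsFactors = FALSE
)
subj$PDCAT <- pd_category(subj$CAELEPRE, subj$NORMFL)
subj
```

Categories A and C are then checked against 2 * ULRR and category B against 2 * nadir, as in the list above.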
CA-125 Best Confirmed Overall Response
The function admiral::derive_extreme_event() can be used to derive the CA-125 Best Confirmed Overall Response Parameter.
Some events such as bor_cr, bor_pr have been defined in {admiralonco}. Missing events specific to GCIG criteria are defined below.
Note: Unlike for RECIST 1.1, it is not required for SD that the response occurs after a protocol-defined number of days.
bor_sd_gcig <- event(
  description = "Define stable disease (SD) for best overall response (BOR)",
  dataset_name = "ovr",
  condition = AVALC == "SD",
  set_values_to = exprs(AVALC = "SD")
)
bor_ne_gcig <- event(
  description = "Define not evaluable (NE) for best overall response (BOR)",
  dataset_name = "ovr",
  condition = AVALC == "NE",
  set_values_to = exprs(AVALC = "NE")
)
adrs <- adrs %>%
  derive_extreme_event(
    by_vars = get_admiral_option("subject_keys"),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr, ADT),
    mode = "first",
    source_datasets = list(
      ovr = ovr_ca125,
      adsl = adsl
    ),
    events = list(
      bor_cr, bor_pr, bor_sd_gcig, bor_pd, bor_ne_gcig, no_data_missing
    ),
    set_values_to = exprs(
      PARAMCD = "CBORCA",
      PARAM = "CA-125 Best Confirmed Overall Response by Investigator",
      PARAMN = 5,
      PARCAT1 = "CA-125",
      PARCAT1N = 1,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      AVAL = aval_resp(AVALC),
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
Combined Best Unconfirmed Overall Response
For patients who are measurable by both RECIST 1.1 and CA-125 the concept of Combined Best Overall Response is used. In our assumptions, RECIST 1.1 response is unconfirmed, and the combined response from RS domain is based on that unconfirmed RECIST 1.1.
In this part of the vignette, we will derive Combined Best Unconfirmed Overall Response based on combined response (PARAMCD = "OVRR11CA") as collected on the CRF.
adrs <- adrs %>%
  derive_extreme_event(
    by_vars = get_admiral_option("subject_keys"),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr, ADT),
    mode = "first",
    source_datasets = list(
      ovr = ovr_ubor,
      adsl = adsl
    ),
    events = list(
      bor_cr, bor_pr, bor_sd_gcig, bor_pd, bor_ne_gcig, no_data_missing
    ),
    set_values_to = exprs(
      PARAMCD = "BORCA11",
      PARAM = "Combined Best Unconfirmed Overall Response by Investigator",
      PARAMN = 6,
      PARCAT1 = "Combined",
      PARCAT1N = 3,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      AVAL = aval_resp(AVALC),
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
Combined Best Confirmed Overall Response
For studies where ORR is one of the primary endpoints, best RECIST 1.1 response for CR and PR needs to be confirmed and maintained for at least 28 days. Due to the complexity of the problem, we will not address it in this vignette.
Other Endpoints
For examples of additional endpoints, please see Creating ADRS (Including Non-standard Endpoints).