Creating ADRS with GCIG Criteria

This article describes creating an ADRS ADaM dataset in ovarian cancer studies based on Gynecological Cancer Intergroup (GCIG) criteria. Note that only the GCIG-specific steps are covered in this vignette. For detailed guidance on all the steps, refer to Creating ADRS (Including Non-standard Endpoints).

Introduction

Carcinoma antigen-125 (CA-125) is the most commonly used biomarker in ovarian cancer. The Gynecological Cancer Intergroup (GCIG) proposed the criteria for CA-125 response and progression and specified the situations in which CA-125 criteria could be used. These guidelines are becoming more and more popular in clinical trials for ovarian cancer and are often used as one of the secondary endpoints.

GCIG recommendations for CA-125 criteria for response and progression in various clinical situations

Use case	Use Recommended by GCIG	Not Standard and Needs Further Validation	Not Recommended by GCIG
First-line trials	CA-125 progression		CA-125 response
Maintenance or consolidation trials		CA-125 response and progression
Relapse trials	CA-125 response and progression

However, the CA-125 criteria can be difficult to implement programmatically, especially when they are used alongside RECIST 1.1.

We aim to share our current knowledge and experience in implementing GCIG criteria for ovarian clinical trials. Additionally, we have made certain assumptions regarding how data is collected on CRFs to perform response analysis according to the GCIG criteria. We hope this vignette provides valuable guidance on ADRS programming and highlights key considerations for data collection in relation to these criteria.

For more information about the GCIG criteria, see the GCIG guidelines on response criteria in ovarian cancer.

CA-125 response categories

Throughout this vignette, ULRR stands for Upper Limit of Reference Range. The CA-125 response categories for ovarian cancer are:

CA-125 Complete Response: baseline CA-125 >= 2 * ULRR, later reduced by at least 50% and to within the normal range, confirmed at least 4 weeks later.
CA-125 Partial Response: baseline CA-125 >= 2 * ULRR, later reduced by at least 50% but not to within the normal range, confirmed at least 4 weeks later.
Stable Disease: CA-125 level does not meet the criteria for either partial response or progressive disease.
Progression: CA-125 >= 2 * ULRR or CA-125 >= 2 * nadir on 2 occasions at least 1 week apart, depending on the category:
- category A: CA-125 elevated at baseline returning to normal after treatment: CA-125 >= 2 * ULRR on 2 occasions at least 1 week apart.
- category B: CA-125 elevated at baseline not returning to normal after treatment: CA-125 >= 2 * nadir value (lowest level reached by CA-125) on 2 occasions at least 1 week apart.
- category C: CA-125 within reference range at baseline: CA-125 >= 2 * ULRR on 2 occasions at least 1 week apart.
Not Evaluable: This is when the patient’s response cannot be evaluated due to various reasons such as receiving mouse antibodies or having medical/surgical interference with their peritoneum or pleura during the previous 28 days.
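
The response definitions above can be sketched in base R. This is only an illustration under our own assumptions: the helper name ca125_response_candidate() is hypothetical, "normal" is taken to mean a value at or below the ULRR, and the confirmation at least 4 weeks later is not checked.

```r
# Hypothetical helper (not part of admiral or admiralonco): unconfirmed
# CA-125 response status of a post-baseline value, given baseline and ULRR.
# Assumption: "normal" means value <= ULRR; confirmation is not checked.
ca125_response_candidate <- function(baseline, value, ulrr) {
  evaluable <- baseline >= 2 * ulrr     # baseline at least twice the ULRR
  reduced_50 <- value <= 0.5 * baseline # reduced by at least 50%
  normal <- value <= ulrr               # within the reference range
  ifelse(
    !evaluable, "NOT EVALUABLE FOR RESPONSE",
    ifelse(reduced_50 & normal, "CR CANDIDATE",
      ifelse(reduced_50, "PR CANDIDATE", "NO RESPONSE")
    )
  )
}

res <- ca125_response_candidate(
  baseline = c(100, 100, 100, 50),
  value = c(30, 45, 80, 20),
  ulrr = 35
)
res
```

A candidate would still need a second qualifying measurement at least 4 weeks later before it counts as a confirmed CR or PR.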

Ways the data is collected in the Ovarian Cancer study CRF

Required:

Serum CA-125 results as per protocol and mapped in LB SDTM domain.
RECIST 1.1 tumor size and response as per protocol and mapped in RS/TR/TU SDTM domain.

Note: CA-125 and RECIST 1.1 tumor assessments may not always be collected at the same visit. CA-125 can be collected more frequently than tumor assessments.

May be Collected:

CA-125 progression
CA-125 response
Combined response of CA-125 and RECIST 1.1 based on GCIG criteria

Assumptions:

For this vignette, we assume that the following information is collected on the CRF:

CA-125 confirmed responses are collected on the CRF and mapped to SDTM RS domain. Please note that we are not programmatically confirming the CA-125 response.
The date of CA-125 response is assigned to the date of the first measurement when the CA-125 level is reduced by 50%. The date of CA-125 progression is assigned to the date of the first measurement that meets progression criteria. This is assumed to be collected on the CRF CA-125 response page.

CA-125 responses are mapped to CR, PR, SD, PD, NE as shown below.

CA-125 response per GCIG	CA-125 response mapped
Response within Normal Range	CR
Response but not within Normal Range	PR
Non-Response/Non-PD	SD
PD	PD
NE	NE
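
If the collected text still needs to be mapped, a named lookup vector is one simple option. This is a minimal sketch in base R; the object names ca125_map and collected are illustrative, and the collected values are made up.

```r
# Lookup from the collected GCIG CA-125 response text to the mapped value
ca125_map <- c(
  "Response within Normal Range" = "CR",
  "Response but not within Normal Range" = "PR",
  "Non-Response/Non-PD" = "SD",
  "PD" = "PD",
  "NE" = "NE"
)

# Made-up collected values, including one that is not in the lookup
collected <- c(
  "PD", "Response within Normal Range", "Non-Response/Non-PD", "Unknown text"
)

# Values missing from the lookup become NA and can be reviewed separately
mapped <- unname(ca125_map[collected])
mapped
```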

The combined CA-125 and RECIST 1.1 response is also collected on the CRF for each visit at which both RECIST 1.1 and CA-125 are evaluated. The investigator is expected to have already taken into account the confirmed CA-125 response and the RECIST 1.1 response at each visit, as described in the GCIG document.

SUPPRS contains records with the following QNAM and QLABEL values:

QNAM	QLABEL	QVAL	Purpose	Use case
`CA125EFL`	CA-125 response evaluable	Y/N	Indicates population evaluable for CA-125 response (baseline CA-125 >= 2 * ULRR and no mouse antibodies)	`CA125EFL` variable
`CAELEPRE`	Elevated pre-treatment CA-125	Y/N	Indicates CA-125 level at baseline (Y - elevated, N - not elevated)	Derivation of PD category (`MCRIT1`/`MCRIT1ML`/`MCRIT1MN`)
`MOUSEANT`	Received mouse antibodies	Y	Indicates if a prohibited therapy was received	Derivation of `ANL02FL`
`CA50RED`	>=50% reduction from baseline	Y	Indicates response, but does not distinguish between CR and PR	Not used in further derivations
`CANORM2X`	CA125 normal, lab increased >=2x ULRR	Y	Indicates PD category A or C	Derivation of PD category (`MCRIT1`/`MCRIT1ML`/`MCRIT1MN`)
`CNOTNORM`	CA125 not norm, lab increased >=2x nadir	Y	Indicates PD category B	Derivation of PD category (`MCRIT1`/`MCRIT1ML`/`MCRIT1MN`)

The above SUPPRS records refer only to records with RS.RSCAT = "CA125". The exception is QNAM = "MOUSEANT", which appears for both RS.RSCAT = "CA125" and RS.RSCAT = "RECIST 1.1 - CA125" (for the same visits).

If the data are collected in a different way and similar information is not available in the SUPPRS dataset, it should be derived in advance from the LB CA-125 measurement records. Information on whether the patient has received mouse antibodies should also be taken into account.
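
As a rough illustration of such a derivation, the sketch below derives two of the flags from a baseline CA-125 record per subject. The data frame lb_base and its contents are hypothetical, MOUSEANT is assumed to have been merged in from another source, and "elevated" is assumed to mean a value above the ULRR; a real derivation would also need to check the timing of the pre-treatment sample.

```r
# Hypothetical baseline extract: one pre-treatment CA-125 record per subject
lb_base <- data.frame(
  USUBJID = c("01", "02", "03"),
  LBSTRESN = c(120, 30, 90), # baseline CA-125 result
  LBSTNRHI = c(35, 35, 35),  # upper limit of reference range (ULRR)
  MOUSEANT = c(NA, NA, "Y"), # assumed to be merged from another source
  stringsAsFactors = FALSE
)

# Elevated pre-treatment CA-125 (assumption: elevated means above the ULRR)
lb_base$CAELEPRE <- ifelse(lb_base$LBSTRESN > lb_base$LBSTNRHI, "Y", "N")

# Evaluable for CA-125 response: baseline >= 2 * ULRR and no mouse antibodies
lb_base$CA125EFL <- ifelse(
  lb_base$LBSTRESN >= 2 * lb_base$LBSTNRHI & is.na(lb_base$MOUSEANT),
  "Y", "N"
)
lb_base
```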

Required Packages

The examples of this vignette require the following packages.

library(admiral)
library(admiralonco)
library(pharmaversesdtm)
library(pharmaverseadam)
library(metatools)
library(dplyr)
library(tibble)

Programming Workflow

Read in Data
Pre-processing of Input Records
Analysis Flag Derivation
CA-125 Progression
CA-125 Progression Category
CA-125 Best Confirmed Overall Response
Combined Best Unconfirmed Overall Response
Combined Best Confirmed Overall Response

Read in Data

To start, all data frames needed for the creation of ADRS should be read into the environment. This will be a company specific process. Some of the data frames needed are ADSL and RS.

For example purposes, the SDTM and ADaM datasets (based on CDISC Pilot test data), which are included in pharmaversesdtm and pharmaverseadam, are used.

adsl <- pharmaverseadam::adsl
# GCIG sdtm data
rs <- pharmaversesdtm::rs_onco_ca125
supprs <- pharmaversesdtm::supprs_onco_ca125

rs <- combine_supp(rs, supprs)
rs <- convert_blanks_to_na(rs)
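
To make the effect of merging the supplemental qualifiers more concrete, here is a conceptual base R sketch of transposing QNAM/QVAL pairs into columns and joining them to the parent records. It is not how combine_supp() is implemented; the toy data are made up, and we assume the supplemental records are keyed by RSSEQ through IDVARVAL.

```r
# Toy parent domain and supplemental qualifiers (made-up values)
rs_toy <- data.frame(
  USUBJID = c("01", "01", "02"),
  RSSEQ = c(1, 2, 1),
  RSCAT = "CA125",
  RSSTRESC = c("PR", "PD", "SD"),
  stringsAsFactors = FALSE
)
supprs_toy <- data.frame(
  USUBJID = c("01", "01", "01", "02"),
  IDVARVAL = c("1", "2", "2", "1"), # assumed to hold the RSSEQ of the parent
  QNAM = c("CA50RED", "CAELEPRE", "CNOTNORM", "CAELEPRE"),
  QVAL = c("Y", "Y", "Y", "N"),
  stringsAsFactors = FALSE
)

# One row per parent record, one column per QNAM
supp_wide <- reshape(
  supprs_toy,
  idvar = c("USUBJID", "IDVARVAL"),
  timevar = "QNAM",
  direction = "wide"
)
names(supp_wide) <- sub("^QVAL\\.", "", names(supp_wide))
supp_wide$RSSEQ <- as.numeric(supp_wide$IDVARVAL)
supp_wide$IDVARVAL <- NULL

# Left join: every RS record is kept, qualifiers are added where present
rs_wide <- merge(rs_toy, supp_wide, by = c("USUBJID", "RSSEQ"), all.x = TRUE)
rs_wide
```

After this step all qualifiers for an assessment sit on the same row as the response, which is the structure the later derivations rely on.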

At this step, it may be useful to join ADSL to your RS domain. Only the ADSL variables used for derivations are selected at this step.

adsl_vars <- exprs(RANDDT, TRTSDT)
adrs <- derive_vars_merged(
  rs,
  dataset_add = adsl,
  new_vars = adsl_vars,
  by_vars = get_admiral_option("subject_keys")
)

Pre-processing of Input Records

Select Overall Response Records per Response Criteria Category

The next step is to assign parameter-level values such as PARAMCD, PARAM, PARAMN, PARCAT1, etc. For this, a lookup can be created based on the SDTM RSCAT, RSTESTCD and RSEVAL values and joined to the source data.

param_lookup <- tribble(
  ~RSCAT, ~RSTESTCD, ~RSEVAL,
  ~PARAMCD, ~PARAM, ~PARAMN,
  ~PARCAT1, ~PARCAT1N, ~PARCAT2, ~PARCAT2N,

  # CA-125
  "CA125", "OVRLRESP", "INVESTIGATOR",
  "OVRCA125", "CA-125 Overall Response by Investigator", 1,
  "CA-125", 1, "Investigator", 1,

  # RECIST 1.1
  "RECIST 1.1", "OVRLRESP", "INVESTIGATOR",
  "OVRR11", "RECIST 1.1 Overall Response by Investigator", 2,
  "RECIST 1.1", 2, "Investigator", 1,

  # Combined
  "RECIST 1.1 - CA125", "OVRLRESP", "INVESTIGATOR",
  "OVRR11CA", "Combined Overall Response by Investigator", 3,
  "Combined", 3, "Investigator", 1
)

This lookup can now be joined to the source data to assign the parameter values:

adrs <- derive_vars_merged_lookup(
  adrs,
  dataset_add = param_lookup,
  by_vars = exprs(RSCAT, RSTESTCD, RSEVAL)
)

Partial Date Imputation and Deriving `ADT`, `ADTF`, `AVISIT` etc.

If your data collection allows for partial dates, you could apply a company-specific imputation rule at this stage when deriving ADT. In this example, a missing day is imputed to the last possible day of the month.

adrs <- adrs %>%
  derive_vars_dt(
    dtc = RSDTC,
    new_vars_prefix = "A",
    highest_imputation = "D",
    date_imputation = "last"
  ) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ADT)
  ) %>%
  derive_vars_dtm(
    dtc = RSDTC,
    new_vars_prefix = "A",
    highest_imputation = "D",
    date_imputation = "last",
    flag_imputation = "time"
  ) %>%
  mutate(AVISIT = VISIT)
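
To show what imputing a missing day to the last possible date means, here is a small base R sketch. The helper impute_last_day() is hypothetical and only handles complete dates and "YYYY-MM" partial dates; it is not a replacement for the admiral functions used above.

```r
# Hypothetical helper: impute a missing day ("YYYY-MM") to the last day of
# the month and flag the imputation. Assumption: inputs are either complete
# ISO 8601 dates (optionally with time) or "YYYY-MM"; other partial forms
# are not handled.
impute_last_day <- function(dtc) {
  partial <- !is.na(dtc) & grepl("^[0-9]{4}-[0-9]{2}$", dtc)
  adt <- as.Date(rep(NA_character_, length(dtc)))
  adt[!partial] <- as.Date(substr(dtc[!partial], 1, 10))
  yr <- as.integer(substr(dtc[partial], 1, 4))
  mo <- as.integer(substr(dtc[partial], 6, 7))
  # first day of the following month, minus one day
  next_first <- as.Date(
    sprintf("%04d-%02d-01", yr + (mo == 12L), mo %% 12L + 1L)
  )
  adt[partial] <- next_first - 1
  data.frame(
    ADT = adt,
    ADTF = ifelse(partial, "D", NA_character_),
    stringsAsFactors = FALSE
  )
}

res <- impute_last_day(c("2023-02", "2024-02", "2023-12", "2023-03-15T10:30"))
res
```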

Derive `AVALC` and `AVAL`

Since the set of CA-125 response categories is a subset of the RECIST 1.1 response categories and the set of combined response categories overlaps with the set of RECIST 1.1 response categories, we can use the admiralonco::aval_resp() function to assign AVAL (ordered from best to worst response).

adrs <- adrs %>%
  mutate(
    AVALC = RSSTRESC,
    AVAL = aval_resp(AVALC)
  )

Analysis Flag Derivation

Deriving ANzzFL is an opportunity to exclude any records that should not contribute to downstream parameter derivations.

Flag One Assessment at Each Analysis Date (`ANL01FL`)

In the below example:

we consider only valid assessments and those occurring on or after randomization date,
subject identifier variables (usually STUDYID and USUBJID), PARAMCD and ADT are a unique key,
if there is more than one assessment at a date, the worst one is flagged (ensure that the appropriate mode is being set in the admiral::derive_var_extreme_flag()),
to get the correct ordering, we will define the worst_resp() function.

worst_resp <- function(arg) {
  case_when(
    arg == "NE" ~ 1,
    arg == "CR" ~ 2,
    arg == "PR" ~ 3,
    arg == "SD" ~ 4,
    arg == "NON-CR/NON-PD" ~ 5,
    arg == "PD" ~ 6,
    TRUE ~ 0
  )
}

adrs <- adrs %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = c(get_admiral_option("subject_keys"), exprs(PARAMCD, ADT)),
      order = exprs(worst_resp(AVALC), RSSEQ),
      new_var = ANL01FL,
      mode = "last"
    ),
    filter = !is.na(AVAL) & ADT >= RANDDT
  )
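
The selection logic can be illustrated on toy data with base R. This is only a sketch of the idea (sort by the worst-response rank within subject and date, then flag the last record); the data are made up, and the ranks mirror the worst_resp() ordering above.

```r
# Ranks mirroring worst_resp(): a higher rank means a worse response
worst_rank <- c(
  "NE" = 1, "CR" = 2, "PR" = 3, "SD" = 4, "NON-CR/NON-PD" = 5, "PD" = 6
)

# Made-up data: two assessments on the same date, one on a later date
toy <- data.frame(
  USUBJID = c("01", "01", "01"),
  ADT = as.Date(c("2023-01-10", "2023-01-10", "2023-02-10")),
  RSSEQ = c(1, 2, 3),
  AVALC = c("PR", "PD", "SD"),
  stringsAsFactors = FALSE
)

toy$rank <- unname(worst_rank[toy$AVALC])
toy <- toy[order(toy$USUBJID, toy$ADT, toy$rank, toy$RSSEQ), ]

# The last record within each subject and date is the worst one
is_last <- !duplicated(toy[c("USUBJID", "ADT")], fromLast = TRUE)
toy$ANL01FL <- ifelse(is_last, "Y", NA_character_)
toy
```

On 2023-01-10 the PD record is flagged rather than the PR record, because PD has the higher rank.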

Flag Assessments up to First PD or After Receiving Mouse Antibodies (`ANL02FL`)

To restrict response data to assessments up to and including the first reported progressive disease, an ANL02FL flag can be created with the admiral function admiral::derive_var_relative_flag().

According to the GCIG guidelines, assessments should not be considered if the patient has received mouse antibodies, or if there has been medical and/or surgical interference with their peritoneum or pleura during the previous 28 days.

Note: In our vignette, we will use the variable MOUSEANT, which indicates whether mouse antibodies have been received. The user can similarly include a variable that indicates whether there has been medical and/or surgical interference with the patient’s peritoneum or pleura in the past 28 days.

adrs <- adrs %>%
  derive_var_relative_flag(
    by_vars = c(get_admiral_option("subject_keys"), exprs(PARAMCD)),
    order = exprs(ADT, RSSEQ),
    new_var = ANL02FL,
    condition = (AVALC == "PD" | MOUSEANT == "Y"),
    mode = "first",
    selection = "before",
    inclusive = TRUE
  )
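
The "up to and including the first qualifying record" logic can be sketched in base R as follows. The toy data are made up and assumed to be sorted by assessment date within subject; subjects with no qualifying record keep all their records flagged in this sketch.

```r
# Made-up data, sorted by assessment date within subject
toy <- data.frame(
  USUBJID = c("01", "01", "01", "02", "02", "02"),
  AVALC = c("SD", "PD", "SD", "PR", "SD", "PR"),
  MOUSEANT = c(NA, NA, NA, NA, "Y", NA),
  stringsAsFactors = FALSE
)

# A record "hits" if it is a PD or mouse antibodies were received
hit <- toy$AVALC == "PD" | (!is.na(toy$MOUSEANT) & toy$MOUSEANT == "Y")

# Number of hits strictly before each record, counted within subject
prior_hits <- ave(
  as.numeric(hit), toy$USUBJID,
  FUN = function(h) cumsum(h) - h
)

# Flag records up to and including the first hit
toy$ANL02FL <- ifelse(prior_hits == 0, "Y", NA_character_)
toy
```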

CA-125 Response Evaluable Flag

Patients are evaluable for a CA-125 response only if they have a pre-treatment sample that is at least twice the upper limit of the reference range and was taken within 2 weeks before starting treatment. In our case, this information is collected only on CA-125 records, while the flag is needed at the patient level.

The CA-125 Response Evaluable Flag can be derived using the derive_var_merged_exist_flag() function.

adrs <- adrs %>%
  select(-CA125EFL) %>%
  derive_var_merged_exist_flag(
    dataset_add = adrs,
    by_vars = get_admiral_option("subject_keys"),
    new_var = CA125EFL,
    condition = (CA125EFL == "Y")
  )

Select Source Assessments for Parameter Derivations

For the next parameter derivations we consider:

one assessment at each analysis date (ANL01FL = "Y"),
assessments up to and including the first PD or receipt of mouse antibodies, whichever comes first (ANL02FL = "Y"),
the CA-125 Response Evaluable population (CA125EFL = "Y"; not applied when deriving CA-125 progression).

# used for derivation of CA-125 PD
ovr_pd <- filter(adrs, PARAMCD == "OVRCA125" & ANL01FL == "Y" & ANL02FL == "Y")

# used for derivation of CA-125 response parameters
ovr_ca125 <- filter(adrs, PARAMCD == "OVRCA125" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")

# used for derivation of unconfirmed best overall response from RECIST 1.1 and confirmed CA-125 together
ovr_ubor <- filter(adrs, PARAMCD == "OVRR11CA" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")

# used for derivation of confirmed best overall response from RECIST 1.1 and confirmed CA-125 together
ovr_r11 <- filter(adrs, PARAMCD == "OVRR11" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")

Now that we have the input records prepared above with any company-specific requirements, we can start to derive new parameter records.

CA-125 Progression

The function admiral::derive_extreme_records() can be used to find the date of first CA-125 PD.

adrs <- adrs %>%
  derive_extreme_records(
    dataset_ref = adsl,
    dataset_add = ovr_pd,
    by_vars = get_admiral_option("subject_keys"),
    filter_add = AVALC == "PD",
    order = exprs(ADT),
    mode = "first",
    keep_source_vars = exprs(everything()),
    set_values_to = exprs(
      PARAMCD = "PDCA125",
      PARAM = "CA-125 Disease Progression by Investigator",
      PARAMN = 4,
      PARCAT1 = "CA-125",
      PARCAT1N = 1,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
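
Conceptually, this step picks the first PD record per subject. The base R sketch below shows the idea on made-up data. It keeps one row for subjects without PD, which, as we understand it, mirrors the role of dataset_ref in the call above.

```r
# Made-up subject-level and assessment-level data
adsl_toy <- data.frame(USUBJID = c("01", "02", "03"), stringsAsFactors = FALSE)
ovr_pd_toy <- data.frame(
  USUBJID = c("01", "01", "02", "02"),
  ADT = as.Date(c("2023-01-10", "2023-02-10", "2023-01-15", "2023-03-01")),
  AVALC = c("SD", "PD", "PD", "PD"),
  stringsAsFactors = FALSE
)

# Keep PD records only and take the earliest one per subject
pd_only <- ovr_pd_toy[ovr_pd_toy$AVALC == "PD", ]
pd_only <- pd_only[order(pd_only$USUBJID, pd_only$ADT), ]
first_pd <- pd_only[!duplicated(pd_only$USUBJID), ]

# One record per subject; ADT and AVALC stay missing without PD
pdca125 <- merge(adsl_toy, first_pd, by = "USUBJID", all.x = TRUE)
pdca125$PARAMCD <- "PDCA125"
pdca125
```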

CA-125 Progression Category

To obtain additional variables that store the progression category for participants who have progressed, we will create MCRITy/MCRITyML/MCRITyMN variables according to the ADaMIG guidelines.

In our consideration we assume that the SUPPRS dataset contains QNAM values to uniquely classify a progression into one of the A, B or C categories as per GCIG criteria.

Having previously transposed and merged SUPPRS dataset, we have a suitable structure for deriving MCRIT variables, since all the necessary variables we will use to check conditions are in one row.

For this purpose, admiral provides the derive_vars_cat() function (see its documentation for details).

definition_mcrit <- exprs(
  ~PARAMCD, ~condition,
  ~MCRIT1ML, ~MCRIT1MN,
  "PDCA125", CAELEPRE == "Y" & CANORM2X == "Y",
  "Patients with elevated CA-125 before treatment and normalization of CA-125 (A)", 1,
  "PDCA125", CAELEPRE == "Y" & CNOTNORM == "Y",
  "Patients with elevated CA-125 before treatment, which never normalizes (B)", 2,
  "PDCA125", CAELEPRE == "N" & CANORM2X == "Y",
  "Patients with CA-125 in the reference range before treatment (C)", 3
)

adrs <- adrs %>%
  mutate(MCRIT1 = if_else(PARAMCD == "PDCA125", "PD Category Group", NA_character_)) %>%
  derive_vars_cat(
    definition = definition_mcrit,
    by_vars = exprs(PARAMCD)
  )
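
The conditions in definition_mcrit can also be read as plain nested conditions. The following base R sketch applies the same three rules to made-up flag values; the short labels A, B and C are used only to keep the example compact.

```r
# Made-up PDCA125 records with the transposed SUPPRS flags
pd_toy <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  CAELEPRE = c("Y", "Y", "N", "Y"),
  CANORM2X = c("Y", NA, "Y", NA),
  CNOTNORM = c(NA, "Y", NA, NA),
  stringsAsFactors = FALSE
)

# NA-safe comparisons
is_y <- function(x) !is.na(x) & x == "Y"
is_n <- function(x) !is.na(x) & x == "N"

# Same order of conditions as in definition_mcrit
pd_toy$MCRIT1MN <- with(pd_toy, ifelse(
  is_y(CAELEPRE) & is_y(CANORM2X), 1,
  ifelse(
    is_y(CAELEPRE) & is_y(CNOTNORM), 2,
    ifelse(is_n(CAELEPRE) & is_y(CANORM2X), 3, NA_real_)
  )
))
pd_toy$MCRIT1ML <- c("A", "B", "C")[pd_toy$MCRIT1MN]
pd_toy
```

Subject 04 matches none of the conditions, so both variables stay missing and the record can be queried.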

Derivation of the progression category may be more complex if the data is collected in a different way and the user needs to check whether:

baseline CA-125 level was elevated,
CA-125 level normalizes over the course of the study,
CA-125 level >= 2 * ULRR (if CA-125 level normalizes or was within the reference range at baseline) or CA-125 level >= 2 * nadir (if CA-125 level never normalizes).

If the criterion used to find the PD category draws on multiple rows (different parameters or multiple rows for a single parameter), a new parameter will need to be created.
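
As a starting point for such a derivation, the sketch below determines the PD category and the corresponding progression threshold from a subject's CA-125 values. The helper pd_rule() is hypothetical and simplified: it assumes "normal" means a value at or below the ULRR, takes the nadir over baseline and post-baseline values, and does not check timing or the requirement of 2 occasions at least 1 week apart.

```r
# Hypothetical helper: PD category (A, B or C) and progression threshold.
# Assumptions: normal means value <= ULRR; nadir is the lowest of the
# baseline and post-baseline values; timing rules are not checked.
pd_rule <- function(baseline, post, ulrr) {
  elevated <- baseline > ulrr
  normalized <- any(post <= ulrr)
  category <- if (!elevated) "C" else if (normalized) "A" else "B"
  threshold <- if (category == "B") 2 * min(c(baseline, post)) else 2 * ulrr
  list(category = category, threshold = threshold)
}

rule_a <- pd_rule(baseline = 100, post = c(30, 80), ulrr = 35)
rule_b <- pd_rule(baseline = 100, post = c(60, 130), ulrr = 35)
rule_c <- pd_rule(baseline = 20, post = c(25, 90), ulrr = 35)
```

In the second call the level never normalizes, so the threshold is twice the nadir of 60 rather than twice the ULRR.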

CA-125 Best Confirmed Overall Response

The function admiral::derive_extreme_event() can be used to derive the CA-125 Best Confirmed Overall Response Parameter.

Some events, such as bor_cr and bor_pr, are already defined in {admiralonco}. The events specific to GCIG criteria that are not available there are defined below.
Note: Unlike for RECIST 1.1, an SD response is not required to occur after a protocol-defined minimum number of days.

bor_sd_gcig <- event(
  description = "Define stable disease (SD) for best overall response (BOR)",
  dataset_name = "ovr",
  condition = AVALC == "SD",
  set_values_to = exprs(AVALC = "SD")
)

bor_ne_gcig <- event(
  description = "Define not evaluable (NE) for best overall response (BOR)",
  dataset_name = "ovr",
  condition = AVALC == "NE",
  set_values_to = exprs(AVALC = "NE")
)

adrs <- adrs %>%
  derive_extreme_event(
    by_vars = get_admiral_option("subject_keys"),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr, ADT),
    mode = "first",
    source_datasets = list(
      ovr = ovr_ca125,
      adsl = adsl
    ),
    events = list(
      bor_cr, bor_pr, bor_sd_gcig, bor_pd, bor_ne_gcig, no_data_missing
    ),
    set_values_to = exprs(
      PARAMCD = "CBORCA",
      PARAM = "CA-125 Best Confirmed Overall Response by Investigator",
      PARAMN = 5,
      PARCAT1 = "CA-125",
      PARCAT1N = 1,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      AVAL = aval_resp(AVALC),
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
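
The selection performed here can be thought of as "take the first record of the best available event, in the order CR, PR, SD, PD, NE, and fall back to MISSING". The base R sketch below reproduces that idea on made-up data; it is an illustration only, not the admiral implementation.

```r
# Event order used above: the best response comes first
event_order <- c("CR", "PR", "SD", "PD", "NE")

# Made-up data; subject 04 has no assessments at all
adsl_toy <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  stringsAsFactors = FALSE
)
ovr_toy <- data.frame(
  USUBJID = c("01", "01", "02", "02", "03"),
  ADT = as.Date(c(
    "2023-01-10", "2023-02-10", "2023-01-12", "2023-02-12", "2023-01-20"
  )),
  AVALC = c("SD", "PR", "NE", "PD", "NE"),
  stringsAsFactors = FALSE
)

# Sort by event number, then date, and keep the first record per subject
ovr_toy$event_nr <- match(ovr_toy$AVALC, event_order)
ovr_toy <- ovr_toy[order(ovr_toy$USUBJID, ovr_toy$event_nr, ovr_toy$ADT), ]
bor <- ovr_toy[!duplicated(ovr_toy$USUBJID), c("USUBJID", "ADT", "AVALC")]

# Subjects without any assessment get MISSING
bor <- merge(adsl_toy, bor, by = "USUBJID", all.x = TRUE)
bor$AVALC[is.na(bor$AVALC)] <- "MISSING"
bor
```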

Combined Best Unconfirmed Overall Response

For patients who are measurable by both RECIST 1.1 and CA-125 the concept of Combined Best Overall Response is used. In our assumptions, RECIST 1.1 response is unconfirmed, and the combined response from RS domain is based on that unconfirmed RECIST 1.1.

In this part of the vignette, we will derive Combined Best Unconfirmed Overall Response based on combined response (PARAMCD = "OVRR11CA") as collected on the CRF.

adrs <- adrs %>%
  derive_extreme_event(
    by_vars = get_admiral_option("subject_keys"),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr, ADT),
    mode = "first",
    source_datasets = list(
      ovr = ovr_ubor,
      adsl = adsl
    ),
    events = list(
      bor_cr, bor_pr, bor_sd_gcig, bor_pd, bor_ne_gcig, no_data_missing
    ),
    set_values_to = exprs(
      PARAMCD = "BORCA11",
      PARAM = "Combined Best Unconfirmed Overall Response by Investigator",
      PARAMN = 6,
      PARCAT1 = "Combined",
      PARCAT1N = 3,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      AVAL = aval_resp(AVALC),
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )

Combined Best Confirmed Overall Response

For studies where ORR is one of the primary endpoints, best RECIST 1.1 response for CR and PR needs to be confirmed and maintained for at least 28 days. Due to the complexity of the problem, we will not address it in this vignette.

Other Endpoints

For examples of additional endpoints, please see Creating ADRS (Including Non-standard Endpoints).