Derive Query Variables
Arguments
- dataset
Input dataset
- dataset_queries
A dataset containing required columns
PREFIX
,GRPNAME
,SRCVAR
,TERMCHAR
and/orTERMNUM
, and optional columnsGRPID
,SCOPE
,SCOPEN
.create_query_data()
can be used to create the dataset.
Details
This function can be used to derive CDISC variables such as
SMQzzNAM
, SMQzzCD
, SMQzzSC
, SMQzzSCN
, and CQzzNAM
in ADAE and
ADMH, and variables such as SDGzzNAM
, SDGzzCD
, and SDGzzSC
in ADCM.
An example usage of this function can be found in the
OCCDS vignette.
A query dataset is expected as an input to this function. See the
Queries Dataset Documentation vignette
for descriptions, or call data("queries")
for an example of a query dataset.
For each unique element in PREFIX
, the corresponding "NAM"
variable will be created. For each unique PREFIX
, if GRPID
is
not "" or NA, then the corresponding "CD" variable is created; similarly,
if SCOPE
is not "" or NA, then the corresponding "SC" variable will
be created; if SCOPEN
is not "" or NA, then the corresponding
"SCN" variable will be created.
For each record in dataset
, the "NAM" variable takes the value of
GRPNAME
if the value of TERMCHAR
or TERMNUM
in dataset_queries
matches
the value of the respective SRCVAR in dataset
.
Note that TERMCHAR
in dataset_queries
dataset may be NA only when TERMNUM
is non-NA and vice versa. The matching is case insensitive.
The "CD", "SC", and "SCN" variables are derived accordingly based on
GRPID
, SCOPE
, and SCOPEN
respectively,
whenever not missing.
See also
OCCDS Functions:
derive_var_trtemfl()
,
derive_vars_atc()
,
get_terms_from_db()
Examples
library(tibble)
data("queries")
adae <- tribble(
~USUBJID, ~ASTDTM, ~AETERM, ~AESEQ, ~AEDECOD, ~AELLT, ~AELLTCD,
"01", "2020-06-02 23:59:59", "ALANINE AMINOTRANSFERASE ABNORMAL",
3, "Alanine aminotransferase abnormal", NA_character_, NA_integer_,
"02", "2020-06-05 23:59:59", "BASEDOW'S DISEASE",
5, "Basedow's disease", NA_character_, 1L,
"03", "2020-06-07 23:59:59", "SOME TERM",
2, "Some query", "Some term", NA_integer_,
"05", "2020-06-09 23:59:59", "ALVEOLAR PROTEINOSIS",
7, "Alveolar proteinosis", NA_character_, NA_integer_
)
derive_vars_query(adae, queries)
#> # A tibble: 4 × 24
#> USUBJID ASTDTM AETERM AESEQ AEDECOD AELLT AELLTCD SMQ02NAM SMQ02CD SMQ02SC
#> <chr> <chr> <chr> <dbl> <chr> <chr> <int> <chr> <int> <chr>
#> 1 01 2020-06-0… ALANI… 3 Alanin… NA NA NA NA NA
#> 2 02 2020-06-0… BASED… 5 Basedo… NA 1 NA NA NA
#> 3 03 2020-06-0… SOME … 2 Some q… Some… NA NA NA NA
#> 4 05 2020-06-0… ALVEO… 7 Alveol… NA NA NA NA NA
#> # ℹ 14 more variables: SMQ02SCN <int>, SMQ03NAM <chr>, SMQ03CD <int>,
#> # SMQ03SC <chr>, SMQ03SCN <int>, SMQ05NAM <chr>, SMQ05CD <int>,
#> # SMQ05SC <chr>, SMQ05SCN <int>, CQ01NAM <chr>, CQ04NAM <chr>, CQ04CD <int>,
#> # CQ06NAM <chr>, CQ06CD <int>