Derive Categorization Variables Like AVALCATy and AVALCAyN
Source: R/derive_vars_cat.R
derive_vars_cat.Rd
Arguments
- dataset
  Input dataset. The variables specified by the by_vars and definition arguments are expected to be in the dataset.
- definition
  List of expressions created by exprs(). Must be in rectangular format and specified using the same syntax as when creating a tibble using the tribble() function. The definition object will be converted to a tibble using tribble() inside this function.
  Must contain:
  - the column condition, which will be converted to a logical expression and will be used on the dataset input.
  - at least one additional column with the new column name and the category value(s) used by the logical expression.
  - the column specified in by_vars (if by_vars is specified)
  e.g. if by_vars is not specified:
  exprs(~condition, ~AVALCAT1, ~AVALCA1N, AVAL >= 140, ">=140 cm", 1, AVAL < 140, "<140 cm", 2)
  e.g. if by_vars is specified as exprs(VSTEST):
  exprs(~VSTEST, ~condition, ~AVALCAT1, ~AVALCA1N, "Height", AVAL >= 140, ">=140 cm", 1, "Height", AVAL < 140, "<140 cm", 2)
- by_vars
  List of expressions with one element. NULL by default. Allows for specifying by groups, e.g. exprs(PARAMCD). The variable must be present in both dataset and definition. The conditions in definition are applied only to those records that match by_vars. The categorization variables are set to NA for records not matching any of the by groups in definition.
Details
If conditions overlap, the row order of definition must be carefully considered. The first match determines the category.
For example, if AVAL = 155 and the definition is:
definition <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1,  ~AVALCA1N,
  "Height",  AVAL > 170,  ">170 cm",  1,
  "Height",  AVAL <= 170, "<=170 cm", 2,
  "Height",  AVAL <= 160, "<=160 cm", 3
)
then AVALCAT1 will be "<=170 cm", as this is the first match for AVAL.
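The first-match rule can be checked with a small sketch. It applies the overlapping definition to a single record with AVAL = 155. The dataset advs_one, its subject ID, and the name overlapping are made up for illustration, and the call assumes admiral is installed and behaves as described in this section.

```r
library(admiral)
library(tibble)

# Hypothetical one-record dataset: a height of 155 cm
advs_one <- tribble(
  ~USUBJID, ~VSTEST,  ~AVAL,
  "SUBJ-1", "Height", 155
)

# Overlapping conditions: 155 satisfies both the second and the third row
overlapping <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1,  ~AVALCA1N,
  "Height",  AVAL > 170,  ">170 cm",  1,
  "Height",  AVAL <= 170, "<=170 cm", 2,
  "Height",  AVAL <= 160, "<=160 cm", 3
)

result <- derive_vars_cat(
  dataset = advs_one,
  definition = overlapping,
  by_vars = exprs(VSTEST)
)

# The second row is the first one that matches, so it determines the category
result$AVALCAT1
result$AVALCA1N
```

Although the third row also matches, it is never reached for this record because the second row comes first.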
If you specify:
definition <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1,  ~AVALCA1N,
  "Height",  AVAL <= 160, "<=160 cm", 3,
  "Height",  AVAL <= 170, "<=170 cm", 2,
  "Height",  AVAL > 170,  ">170 cm",  1
)
Then AVAL <= 160 will lead to AVALCAT1 == "<=160 cm", AVAL in between 160 and 170 will lead to AVALCAT1 == "<=170 cm", and AVAL > 170 will lead to AVALCAT1 == ">170 cm".
However, we suggest being more explicit when defining the condition, to avoid overlap. In this case, the middle condition should be:
AVAL <= 170 & AVAL > 160
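One practical consequence of explicit, non-overlapping conditions is that the row order of the definition no longer matters. The sketch below illustrates this under the assumption that admiral is installed; the dataset heights and the names explicit_a and explicit_b are hypothetical and exist only for this example.

```r
library(admiral)
library(tibble)

# Hypothetical heights, one per category
heights <- tribble(
  ~USUBJID, ~VSTEST,  ~AVAL,
  "SUBJ-1", "Height", 155,
  "SUBJ-2", "Height", 165,
  "SUBJ-3", "Height", 175
)

# Non-overlapping conditions, listed from highest to lowest
explicit_a <- exprs(
  ~VSTEST,   ~condition,               ~AVALCAT1,  ~AVALCA1N,
  "Height",  AVAL > 170,               ">170 cm",  1,
  "Height",  AVAL <= 170 & AVAL > 160, "<=170 cm", 2,
  "Height",  AVAL <= 160,              "<=160 cm", 3
)

# The same conditions, listed from lowest to highest
explicit_b <- exprs(
  ~VSTEST,   ~condition,               ~AVALCAT1,  ~AVALCA1N,
  "Height",  AVAL <= 160,              "<=160 cm", 3,
  "Height",  AVAL <= 170 & AVAL > 160, "<=170 cm", 2,
  "Height",  AVAL > 170,               ">170 cm",  1
)

res_a <- derive_vars_cat(heights, definition = explicit_a, by_vars = exprs(VSTEST))
res_b <- derive_vars_cat(heights, definition = explicit_b, by_vars = exprs(VSTEST))

# Each record matches exactly one row, so both orderings agree
identical(res_a$AVALCAT1, res_b$AVALCAT1)
identical(res_a$AVALCA1N, res_b$AVALCA1N)
```

Because every record satisfies exactly one condition, there is no "first match" ambiguity left to resolve.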
See also
General Derivation Functions for all ADaMs that return variables appended to the dataset: derive_var_extreme_flag(), derive_var_joined_exist_flag(), derive_var_merged_ef_msrc(), derive_var_merged_exist_flag(), derive_var_merged_summary(), derive_var_obs_number(), derive_var_relative_flag(), derive_vars_computed(), derive_vars_joined(), derive_vars_merged(), derive_vars_merged_lookup(), derive_vars_transposed()
Examples
library(admiral) # provides derive_vars_cat() and exprs()
library(dplyr)
library(tibble)
advs <- tibble::tribble(
  ~USUBJID,      ~VSTEST,  ~AVAL,
  "01-701-1015", "Height", 147.32,
  "01-701-1015", "Weight", 53.98,
  "01-701-1023", "Height", 162.56,
  "01-701-1023", "Weight", NA,
  "01-701-1028", "Height", NA,
  "01-701-1028", "Weight", NA,
  "01-701-1033", "Height", 175.26,
  "01-701-1033", "Weight", 88.45
)
definition <- exprs(
  ~condition,                        ~AVALCAT1,  ~AVALCA1N, ~NEWCOL,
  VSTEST == "Height" & AVAL > 160,   ">160 cm",  1,         "extra1",
  VSTEST == "Height" & AVAL <= 160,  "<=160 cm", 2,         "extra2"
)
derive_vars_cat(
  dataset = advs,
  definition = definition
)
#> # A tibble: 8 × 6
#> USUBJID VSTEST AVAL AVALCAT1 AVALCA1N NEWCOL
#> <chr> <chr> <dbl> <chr> <dbl> <chr>
#> 1 01-701-1015 Height 147. <=160 cm 2 extra2
#> 2 01-701-1015 Weight 54.0 NA NA NA
#> 3 01-701-1023 Height 163. >160 cm 1 extra1
#> 4 01-701-1023 Weight NA NA NA NA
#> 5 01-701-1028 Height NA NA NA NA
#> 6 01-701-1028 Weight NA NA NA NA
#> 7 01-701-1033 Height 175. >160 cm 1 extra1
#> 8 01-701-1033 Weight 88.4 NA NA NA
# Using by_vars:
definition2 <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1,  ~AVALCA1N,
  "Height",  AVAL > 160,  ">160 cm",  1,
  "Height",  AVAL <= 160, "<=160 cm", 2,
  "Weight",  AVAL > 70,   ">70 kg",   1,
  "Weight",  AVAL <= 70,  "<=70 kg",  2
)
derive_vars_cat(
  dataset = advs,
  definition = definition2,
  by_vars = exprs(VSTEST)
)
#> # A tibble: 8 × 5
#> USUBJID VSTEST AVAL AVALCAT1 AVALCA1N
#> <chr> <chr> <dbl> <chr> <dbl>
#> 1 01-701-1015 Height 147. <=160 cm 2
#> 2 01-701-1015 Weight 54.0 <=70 kg 2
#> 3 01-701-1023 Height 163. >160 cm 1
#> 4 01-701-1023 Weight NA NA NA
#> 5 01-701-1028 Height NA NA NA
#> 6 01-701-1028 Weight NA NA NA
#> 7 01-701-1033 Height 175. >160 cm 1
#> 8 01-701-1033 Weight 88.4 >70 kg 1
# With three conditions:
definition3 <- exprs(
  ~VSTEST,   ~condition,               ~AVALCAT1,  ~AVALCA1N,
  "Height",  AVAL > 170,               ">170 cm",  1,
  "Height",  AVAL <= 170 & AVAL > 160, "<=170 cm", 2,
  "Height",  AVAL <= 160,              "<=160 cm", 3
)
derive_vars_cat(
  dataset = advs,
  definition = definition3,
  by_vars = exprs(VSTEST)
)
#> # A tibble: 8 × 5
#> USUBJID VSTEST AVAL AVALCAT1 AVALCA1N
#> <chr> <chr> <dbl> <chr> <dbl>
#> 1 01-701-1015 Height 147. <=160 cm 3
#> 2 01-701-1015 Weight 54.0 NA NA
#> 3 01-701-1023 Height 163. <=170 cm 2
#> 4 01-701-1023 Weight NA NA NA
#> 5 01-701-1028 Height NA NA NA
#> 6 01-701-1028 Weight NA NA NA
#> 7 01-701-1033 Height 175. >170 cm 1
#> 8 01-701-1033 Weight 88.4 NA NA
# Let's derive both the MCRITyML and the MCRITyMN variables
adlb <- tibble::tribble(
  ~USUBJID,      ~PARAM, ~AVAL, ~AVALU, ~ANRHI,
  "01-701-1015", "ALT",  150,   "U/L",  40,
  "01-701-1023", "ALT",  70,    "U/L",  40,
  "01-701-1036", "ALT",  130,   "U/L",  40,
  "01-701-1048", "ALT",  30,    "U/L",  40,
  "01-701-1015", "AST",  50,    "U/L",  35
)
definition_mcrit <- exprs(
  ~PARAM, ~condition,                           ~MCRIT1ML,    ~MCRIT1MN,
  "ALT",  AVAL <= ANRHI,                        "<=ANRHI",    1,
  "ALT",  ANRHI < AVAL & AVAL <= 3 * ANRHI,     ">1-3*ANRHI", 2,
  "ALT",  3 * ANRHI < AVAL,                     ">3*ANRHI",   3
)
adlb %>%
  derive_vars_cat(
    definition = definition_mcrit,
    by_vars = exprs(PARAM)
  )
#> # A tibble: 5 × 7
#> USUBJID PARAM AVAL AVALU ANRHI MCRIT1ML MCRIT1MN
#> <chr> <chr> <dbl> <chr> <dbl> <chr> <dbl>
#> 1 01-701-1015 ALT 150 U/L 40 >3*ANRHI 3
#> 2 01-701-1023 ALT 70 U/L 40 >1-3*ANRHI 2
#> 3 01-701-1036 ALT 130 U/L 40 >3*ANRHI 3
#> 4 01-701-1048 ALT 30 U/L 40 <=ANRHI 1
#> 5 01-701-1015 AST 50 U/L 35 NA NA