
Derive Variables by Transposing and Merging a Second Dataset
Source:R/derive_vars_transposed.R
derive_vars_transposed.RdAdds variables from a vertical dataset after transposing it into a wide one.
Usage
derive_vars_transposed(
dataset,
dataset_merge,
by_vars,
id_vars = NULL,
key_var,
value_var,
filter = NULL,
relationship = NULL
)Arguments
- dataset
-
Input dataset
The variables specified by the
by_varsargument are expected to be in the dataset.- Default value
none
- dataset_merge
-
Dataset to transpose and merge
The variables specified by the
by_vars,id_vars,key_varandvalue_vararguments are expected. The variablesby_vars,id_vars,key_varhave to be a unique key.- Default value
none
- by_vars
-
Grouping variables
Keys used to merge
dataset_mergewithdataset.- Default value
none
- id_vars
-
ID variables
Variables (excluding
by_varsandkey_var) that uniquely identify each observation indataset_merge.- Default value
NULL
- key_var
-
The variable of
dataset_mergecontaining the names of the transposed variables- Default value
none
- value_var
-
The variable of
dataset_mergecontaining the values of the transposed variables- Default value
none
- filter
-
Expression used to restrict the records of
dataset_mergeprior to transposing- Default value
NULL
- relationship
-
Expected merge-relationship between the
by_varsvariable(s) indatasetanddataset_merge(after transposition)This argument is passed to the
dplyr::left_join()function. See https://dplyr.tidyverse.org/reference/mutate-joins.html#arguments for more details.- Permitted values
"one-to-one","one-to-many","many-to-one","many-to-many",NULL- Default value
NULL
Details
The records from the dataset to transpose and merge (
dataset_merge) are restricted to those matching thefiltercondition, if provided.The records from
dataset_mergeare checked to ensure they are uniquely identified usingby_vars,id_varsandkey_var.dataset_mergeis transposed (from "tall" to "wide"), with new variables added whose names come fromkey_varand values come fromvalue_var.The transposed dataset is merged with the input
datasetusingby_varsas keys. If arelationshiphas been provided, this merge must satisfy the relationship, otherwise an error is thrown.
Note that unlike other derive_vars_*() functions, the final step may cause new
records to be added to the input dataset. The relationship argument can be specified
to ensure this does not happen inadvertently.
See also
General Derivation Functions for all ADaMs that returns variable appended to dataset:
derive_var_extreme_flag(),
derive_var_joined_exist_flag(),
derive_var_merged_ef_msrc(),
derive_var_merged_exist_flag(),
derive_var_merged_summary(),
derive_var_obs_number(),
derive_var_relative_flag(),
derive_vars_cat(),
derive_vars_computed(),
derive_vars_joined(),
derive_vars_joined_summary(),
derive_vars_merged(),
derive_vars_merged_lookup()
Examples
library(tibble)
library(dplyr, warn.conflicts = FALSE)
# Adding ATC classes to CM using FACM
cm <- tribble(
~USUBJID, ~CMGRPID, ~CMREFID, ~CMDECOD,
"BP40257-1001", "14", "1192056", "PARACETAMOL",
"BP40257-1001", "18", "2007001", "SOLUMEDROL",
"BP40257-1002", "19", "2791596", "SPIRONOLACTONE"
)
facm <- tribble(
~USUBJID, ~FAGRPID, ~FAREFID, ~FATESTCD, ~FASTRESC,
"BP40257-1001", "1", "1192056", "CMATC1CD", "N",
"BP40257-1001", "1", "1192056", "CMATC2CD", "N02",
"BP40257-1001", "1", "1192056", "CMATC3CD", "N02B",
"BP40257-1001", "1", "1192056", "CMATC4CD", "N02BE",
"BP40257-1001", "1", "2007001", "CMATC1CD", "D",
"BP40257-1001", "1", "2007001", "CMATC2CD", "D10",
"BP40257-1001", "1", "2007001", "CMATC3CD", "D10A",
"BP40257-1001", "1", "2007001", "CMATC4CD", "D10AA",
"BP40257-1001", "2", "2007001", "CMATC1CD", "D",
"BP40257-1001", "2", "2007001", "CMATC2CD", "D07",
"BP40257-1001", "2", "2007001", "CMATC3CD", "D07A",
"BP40257-1001", "2", "2007001", "CMATC4CD", "D07AA",
"BP40257-1001", "3", "2007001", "CMATC1CD", "H",
"BP40257-1001", "3", "2007001", "CMATC2CD", "H02",
"BP40257-1001", "3", "2007001", "CMATC3CD", "H02A",
"BP40257-1001", "3", "2007001", "CMATC4CD", "H02AB",
"BP40257-1002", "1", "2791596", "CMATC1CD", "C",
"BP40257-1002", "1", "2791596", "CMATC2CD", "C03",
"BP40257-1002", "1", "2791596", "CMATC3CD", "C03D",
"BP40257-1002", "1", "2791596", "CMATC4CD", "C03DA"
)
cm %>%
derive_vars_transposed(
dataset_merge = facm,
by_vars = exprs(USUBJID, CMREFID = FAREFID),
id_vars = exprs(FAGRPID),
key_var = FATESTCD,
value_var = FASTRESC
) %>%
select(USUBJID, CMDECOD, starts_with("CMATC"))
#> # A tibble: 5 × 6
#> USUBJID CMDECOD CMATC1CD CMATC2CD CMATC3CD CMATC4CD
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 BP40257-1001 PARACETAMOL N N02 N02B N02BE
#> 2 BP40257-1001 SOLUMEDROL D D10 D10A D10AA
#> 3 BP40257-1001 SOLUMEDROL D D07 D07A D07AA
#> 4 BP40257-1001 SOLUMEDROL H H02 H02A H02AB
#> 5 BP40257-1002 SPIRONOLACTONE C C03 C03D C03DA
# Note: the `id_vars` argument here is needed to uniquely identify
# rows of dataset_merge and avoid duplicates-related errors.
# Compare the above call to when `id_vars = NULL`:
try(
cm %>%
derive_vars_transposed(
dataset_merge = facm,
by_vars = exprs(USUBJID, CMREFID = FAREFID),
id_vars = NULL,
key_var = FATESTCD,
value_var = FASTRESC
)
)
#> Error in signal_duplicate_records(dataset_merge, by_vars = c(by_vars, :
#> Dataset `dataset_merge` contains duplicate records with respect to
#> `USUBJID`, `FAREFID`, and `FATESTCD`
#> Please check data and `by_vars`, `id_vars`, and `key_var` arguments.
#> ℹ Run `admiral::get_duplicates_dataset()` to access the duplicate records