derive_seq() creates a new identifier variable: the sequence number (--SEQ).
This function adds a newly derived variable to tgt_dat, namely the sequence number (--SEQ), whose name is the one provided in tgt_var. An integer sequence is generated that uniquely identifies each record within the domain.
Prior to the derivation of tgt_var, the data frame tgt_dat is sorted according to the grouping variables indicated in rec_vars.
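As a rough illustration of the sort-then-number idea described above, the following base R sketch orders a data frame by a set of record-level variables and then numbers the rows within each subject. The helper sketch_seq() and the toy data are hypothetical and are not part of sdtm.oak. Restarting the numbering for each subject is an assumption based on the example output further down, not a statement about the package's internals.

```r
# Hypothetical helper, NOT part of sdtm.oak: a base R sketch of the idea only.
sketch_seq <- function(dat, var, rec_vars, sbj_vars, start_at = 1L) {
  # Sort the records by the record-level identifier variables.
  ord <- do.call(order, unname(as.list(dat[rec_vars])))
  dat <- dat[ord, , drop = FALSE]
  # Assumption: numbering restarts within each subject (see lead-in).
  grp <- interaction(dat[sbj_vars], drop = TRUE)
  pos <- ave(seq_len(nrow(dat)), grp, FUN = seq_along)
  # Shift the sequence so that it begins at `start_at`, and keep it integer.
  dat[[var]] <- as.integer(pos + start_at - 1L)
  rownames(dat) <- NULL
  dat
}

# Toy data (made up for illustration): two subjects, three records.
toy <- data.frame(
  USUBJID = c("S2", "S1", "S1"),
  TESTCD = c("TEMP", "TEMP", "DIABP"),
  stringsAsFactors = FALSE
)

out <- sketch_seq(
  toy,
  var = "XXSEQ",
  rec_vars = c("USUBJID", "TESTCD"),
  sbj_vars = "USUBJID"
)
print(out)  # XXSEQ is 1, 2 for subject S1 and 1 for subject S2
```

Sorting first makes the numbering reproducible: the same records always receive the same sequence numbers regardless of their order in the input.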
Usage
derive_seq(
  tgt_dat,
  tgt_var,
  rec_vars,
  sbj_vars = sdtm.oak::sbj_vars(),
  start_at = 1L
)
Arguments
- tgt_dat
The target dataset, a data frame.
- tgt_var
The target SDTM variable: a single string indicating the name of the sequence number (--SEQ) variable, e.g. "DSSEQ". Note that supplying a name not ending in "SEQ" will raise a warning.
- rec_vars
A character vector of record-level identifier variables.
- sbj_vars
A character vector of subject-level identifier variables.
- start_at
The sequence numbering starts at this value (default is 1).
Value
Returns the data frame supplied in tgt_dat with the newly derived variable, i.e. the sequence number (--SEQ), whose name is that passed in tgt_var. This variable is of type integer.
Examples
# A VS raw data set example
(vs <- read_domain_example("vs"))
#> # A tibble: 6 × 7
#>   STUDYID DOMAIN USUBJID    VSSPID                 VSTESTCD VSDTC       VSTPTNUM
#>   <chr>   <chr>  <chr>      <chr>                  <chr>    <chr>          <dbl>
#> 1 ABC123  VS     ABC123-375 /F:VTLS1-D:9795532-R:2 DIABP    2020-09-01…       NA
#> 2 ABC123  VS     ABC123-375 /F:VTLS1-D:9795532-R:2 TEMP     2020-09-01…       NA
#> 3 ABC123  VS     ABC123-375 /F:VTLS2-D:9795533-R:2 DIABP    2020-09-28…        2
#> 4 ABC123  VS     ABC123-375 /F:VTLS2-D:9795533-R:2 TEMP     2020-09-28…        2
#> 5 ABC123  VS     ABC123-376 /F:VTLS1-D:9795591-R:1 DIABP    2020-09-20        NA
#> 6 ABC123  VS     ABC123-376 /F:VTLS1-D:9795591-R:1 TEMP     2020-09-20        NA
# Derivation of VSSEQ
rec_vars <- c("STUDYID", "USUBJID", "VSTESTCD", "VSDTC", "VSTPTNUM")
derive_seq(tgt_dat = vs, tgt_var = "VSSEQ", rec_vars = rec_vars)
#> # A tibble: 6 × 8
#>   STUDYID DOMAIN USUBJID    VSSPID                 VSTESTCD VSDTC VSTPTNUM VSSEQ
#>   <chr>   <chr>  <chr>      <chr>                  <chr>    <chr>    <dbl> <int>
#> 1 ABC123  VS     ABC123-375 /F:VTLS1-D:9795532-R:2 DIABP    2020…       NA     1
#> 2 ABC123  VS     ABC123-375 /F:VTLS2-D:9795533-R:2 DIABP    2020…        2     2
#> 3 ABC123  VS     ABC123-375 /F:VTLS1-D:9795532-R:2 TEMP     2020…       NA     3
#> 4 ABC123  VS     ABC123-375 /F:VTLS2-D:9795533-R:2 TEMP     2020…        2     4
#> 5 ABC123  VS     ABC123-376 /F:VTLS1-D:9795591-R:1 DIABP    2020…       NA     1
#> 6 ABC123  VS     ABC123-376 /F:VTLS1-D:9795591-R:1 TEMP     2020…       NA     2
# An APSC raw data set example
(apsc <- read_domain_example("apsc"))
#> # A tibble: 6 × 6
#>   STUDYID RSUBJID    SCTESTCD DOMAIN SREL    SCCAT
#>   <chr>   <chr>      <chr>    <chr>  <chr>   <chr>
#> 1 ABC123  ABC123-210 LVSBJIND APSC   FRIEND  CAREGIVERSTUDY
#> 2 ABC123  ABC123-210 EDULEVEL APSC   FRIEND  CAREGIVERSTUDY
#> 3 ABC123  ABC123-210 TMSPPT   APSC   FRIEND  CAREGIVERSTUDY
#> 4 ABC123  ABC123-211 CAREDUR  APSC   SIBLING CAREGIVERSTUDY
#> 5 ABC123  ABC123-211 LVSBJIND APSC   SIBLING CAREGIVERSTUDY
#> 6 ABC123  ABC123-212 JOBCLAS  APSC   SPOUSE  CAREGIVERSTUDY
# Derivation of APSEQ
derive_seq(
  tgt_dat = apsc,
  tgt_var = "APSEQ",
  rec_vars = c("STUDYID", "RSUBJID", "SCTESTCD"),
  sbj_vars = c("STUDYID", "RSUBJID")
)
#> # A tibble: 6 × 7
#>   STUDYID RSUBJID    SCTESTCD DOMAIN SREL    SCCAT          APSEQ
#>   <chr>   <chr>      <chr>    <chr>  <chr>   <chr>          <int>
#> 1 ABC123  ABC123-210 EDULEVEL APSC   FRIEND  CAREGIVERSTUDY     1
#> 2 ABC123  ABC123-210 LVSBJIND APSC   FRIEND  CAREGIVERSTUDY     2
#> 3 ABC123  ABC123-210 TMSPPT   APSC   FRIEND  CAREGIVERSTUDY     3
#> 4 ABC123  ABC123-211 CAREDUR  APSC   SIBLING CAREGIVERSTUDY     1
#> 5 ABC123  ABC123-211 LVSBJIND APSC   SIBLING CAREGIVERSTUDY     2
#> 6 ABC123  ABC123-212 JOBCLAS  APSC   SPOUSE  CAREGIVERSTUDY     1