R/utils.R
diff_reports.Rd
This report will identify flagged records from an sdtmchecks report that are "new" and those that are "old" for a study. This will help quickly target newly emergent issues that may require a new query or investigation while indicating issues that were encountered from a prior report and may have already been queried.
The diff_reports() function requires a newer and an older set of results from sdtmchecks::run_all_checks(), each of which is a list of check results.
A "Status" column with values "NEW" and "OLD" is added to the list of check results, flagging whether a given record that is present in the new result (i.e., new_report) is also present in the old result (i.e., old_report).
The order of the arguments matters: which report is passed as "new" and which as "old" changes the output.
Only results flagged in the new report are kept. Results that appear only in the old report are dropped because they were presumably resolved.
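The flagging rule described above can be illustrated with a minimal base R sketch. This is not the sdtmchecks implementation: flag_status() is a hypothetical helper, and the two small data frames are made up for illustration. It assumes a record counts as "OLD" when an identical row (over the shared columns) exists in the older result.

```r
# Hypothetical helper, not part of sdtmchecks: mark each row of `new_df`
# as "OLD" if an identical row exists in `old_df`, otherwise "NEW".
flag_status <- function(old_df, new_df) {
  # Compare only the columns that both data frames share
  common <- intersect(names(old_df), names(new_df))
  stopifnot(length(common) > 0)
  # Build one key string per row; NA values become the text "NA"
  make_key <- function(df) do.call(paste, c(df[common], sep = "\r"))
  new_df$Status <- ifelse(make_key(new_df) %in% make_key(old_df), "OLD", "NEW")
  new_df
}

# Made-up flagged records: one in the older result, two in the newer one
old_flags <- data.frame(USUBJID = 1, AESEQ = 1, AEDECOD = NA,
                        stringsAsFactors = FALSE)
new_flags <- data.frame(USUBJID = c(1, 6), AESEQ = c(1, 6), AEDECOD = NA,
                        stringsAsFactors = FALSE)

flagged <- flag_status(old_flags, new_flags)
```

Rows are kept only from the newer data frame, which mirrors the behavior described above: a record found only in the older result never appears in the output.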
diff_reports(old_report, new_report)
list of sdtmchecks results based on new_report with Status indicator
Example programs for running data checks
report_to_xlsx(), run_all_checks(), run_check()
# Step 1: Simulate an older AE dataset with one missing preferred term
ae <- data.frame(
USUBJID = 1:5,
DOMAIN = c(rep("AE", 5)),
AESEQ = 1:5,
AESTDTC = 1:5,
AETERM = 1:5,
AEDECOD = 1:5,
AESPID = c("FORMNAME-R:13/L:13XXXX",
"FORMNAME-R:16/L:16XXXX",
"FORMNAME-R:2/L:2XXXX",
"FORMNAME-R:19/L:19XXXX",
"FORMNAME-R:5/L:5XXXX"),
stringsAsFactors = FALSE
)
ae$AEDECOD[1] <- NA
# Step 2: Use the run_all_checks() function to generate list of check results on this "old" data
# Filter sdtmchecksmeta so that only one check is present
metads <- sdtmchecksmeta[sdtmchecksmeta$check=="check_ae_aedecod",]
old <- run_all_checks(metads=metads)
#>
#> Domain not found: ae. Checks requiring this domain were not run.
# Step 3: Simulate a newer, updated AE dataset with one more record that has a missing preferred term
ae <- data.frame(
USUBJID = 1:6,
DOMAIN = c(rep("AE", 6)),
AESEQ = 1:6,
AESTDTC = 1:6,
AETERM = 1:6,
AEDECOD = 1:6,
AESPID = c("FORMNAME-R:13/L:13XXXX",
"FORMNAME-R:16/L:16XXXX",
"FORMNAME-R:2/L:2XXXX",
"FORMNAME-R:19/L:19XXXX",
"FORMNAME-R:5/L:5XXXX",
"FORMNAME-R:1/L:5XXXX"
),
stringsAsFactors = FALSE
)
ae$AEDECOD[1] <- NA
ae$AEDECOD[6] <- NA
# Step 4: Use the run_all_checks() function to generate list of check results on this "new" data
new <- run_all_checks(metads=metads)
#>
#> Domain not found: ae. Checks requiring this domain were not run.
# Step 5: Diff to create a column indicating if the finding is new
res <- diff_reports(old_report=old, new_report=new)
## optionally output results as spreadsheet with sdtmchecks::report_to_xlsx()
# report_to_xlsx(res, outfile=paste0("saved_reports/sdtmchecks_diff_",Sys.Date(),".xlsx"))