This function identifies which flagged records in an sdtmchecks report are "new" and which are "old" for a study. This helps quickly target newly emergent issues that may require a new query or investigation, while indicating issues that were already present in a prior report and may have already been queried.

The diff_reports() function requires a newer and an older set of results from sdtmchecks::run_all_checks(), each of which is a list of check results. A "Status" column with values "NEW" and "OLD" is added to the check results, flagging whether a given record in the new result (i.e., new_report) is also present in the old result (i.e., old_report). The order of the arguments matters: only results flagged in the new report are kept, and old results that are absent from the new report are dropped because they were presumably resolved.
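The "NEW"/"OLD" flagging described above can be illustrated with a minimal base R sketch. This is not the sdtmchecks implementation: flag_status() is a hypothetical helper written only for illustration, and it assumes both data frames of flagged records share the same columns and that whole rows are compared.

```r
# Hypothetical helper (not part of sdtmchecks): mark each row of `new_df`
# as "OLD" if an identical row exists in `old_df`, otherwise as "NEW".
# Assumes both data frames have the same columns in the same order.
flag_status <- function(old_df, new_df) {
  # Collapse every row into one key string so rows can be compared with %in%
  row_key <- function(df) do.call(paste, c(df, sep = "\r"))
  new_df$Status <- ifelse(row_key(new_df) %in% row_key(old_df), "OLD", "NEW")
  new_df
}

# Flagged records from an earlier and a later run (made-up values)
old_fails <- data.frame(USUBJID = c(1, 3), AESEQ = c(1, 3))
new_fails <- data.frame(USUBJID = c(1, 6), AESEQ = c(1, 6))

flagged <- flag_status(old_fails, new_fails)
print(flagged)
```

In this sketch the record for subject 1 is flagged "OLD" and the record for subject 6 is flagged "NEW". The record for subject 3 appears only in the old set, so it is absent from the output, mirroring the way old results not in the new report are dropped.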

diff_reports(old_report, new_report)

Arguments

old_report

an older sdtmchecks list object as created by run_all_checks

new_report

a newer sdtmchecks list object as created by run_all_checks

Value

list of sdtmchecks results based on new_report with Status indicator

See also

Example programs for running data checks: report_to_xlsx(), run_all_checks(), run_check()

Examples


# Step 1: Simulate an older AE dataset with one missing preferred term

ae <- data.frame(
  USUBJID = 1:5,
  DOMAIN = rep("AE", 5),
  AESEQ = 1:5,
  AESTDTC = 1:5,
  AETERM = 1:5,
  AEDECOD = 1:5,
  AESPID = c("FORMNAME-R:13/L:13XXXX",
             "FORMNAME-R:16/L:16XXXX",
             "FORMNAME-R:2/L:2XXXX",
             "FORMNAME-R:19/L:19XXXX",
             "FORMNAME-R:5/L:5XXXX"),
  stringsAsFactors = FALSE
)

# Set the preferred term to missing for the first record
ae$AEDECOD[1] <- NA

# Step 2: Use the run_all_checks() function to generate a list of check results on this "old" data

# Filter sdtmchecksmeta so that only one check is present
metads <- sdtmchecksmeta[sdtmchecksmeta$check == "check_ae_aedecod", ]
old <- run_all_checks(metads=metads)
#> 
#> Domain not found: ae. Checks requiring this domain were not run.



# Step 3: Simulate a newer, updated AE dataset containing one additional record with a missing preferred term

ae <- data.frame(
  USUBJID = 1:6,
  DOMAIN = rep("AE", 6),
  AESEQ = 1:6,
  AESTDTC = 1:6,
  AETERM = 1:6,
  AEDECOD = 1:6,
  AESPID = c("FORMNAME-R:13/L:13XXXX",
             "FORMNAME-R:16/L:16XXXX",
             "FORMNAME-R:2/L:2XXXX",
             "FORMNAME-R:19/L:19XXXX",
             "FORMNAME-R:5/L:5XXXX",
             "FORMNAME-R:1/L:5XXXX"),
  stringsAsFactors = FALSE
)

# Record 1 is still missing its preferred term; record 6 is newly missing
ae$AEDECOD[1] <- NA
ae$AEDECOD[6] <- NA

# Step 4: Use the run_all_checks() function to generate a list of check results on this "new" data

new <- run_all_checks(metads=metads)
#> 
#> Domain not found: ae. Checks requiring this domain were not run.

# Step 5: Diff the two reports to add a Status column indicating whether each finding is new
res <- diff_reports(old_report = old, new_report = new)

## optionally output results as spreadsheet with sdtmchecks::report_to_xlsx()
# report_to_xlsx(res, outfile=paste0("saved_reports/sdtmchecks_diff_",Sys.Date(),".xlsx"))